SOCR News UMich SPH MLEED 2019
SOCR News & Events: SOCR DataSifter: A Statistical Obfuscation Technique enabling Effective Data Sharing
Logistics
- Seminar Series: University of Michigan SPH Environmental Epidemiology Seminar Series
- Date/Time: Tuesday, Dec. 3, 2019, 12 PM (Noon).
- Place: 3755 SPH I (School of Public Health, 1415 Washington Heights)
Presentation
- Presenter: Ivo D. Dinov, joint work with Nina Zhou, Simeone Marino, Yi Zhao, Lu Wei, Lu Wang.
- Title: SOCR DataSifter: A Statistical Obfuscation Technique enabling Effective Data Sharing
- Abstract: Effective and pragmatic sharing of data that includes sensitive information is difficult. The validation and reproducibility of findings in many health, financial, intelligence, socioeconomic, and other high-dimensional case studies are inhibited when the data cannot be shared and the results cannot be independently confirmed. Either heavy masking compromises the utility of the data, or there is a high risk of exposing private personal or secure organizational information. Excessive scrambling or encoding makes the information less useful for modeling or analytical processing, whereas insufficient preprocessing may compromise sensitive information and introduce a substantial risk of re-identification of individuals by various stratification techniques.
- To address this problem, the SOCR lab developed a novel statistical method (DataSifter) that provides on-the-fly obfuscation of high-dimensional structured and unstructured sensitive data, e.g., clinical data from electronic health records (EHR). This technique provides complete administrative control over the balance between the risk of data re-identification and the preservation of the information in the data. With careful setup of user-defined privacy levels, our simulation experiments suggest that the DataSifter protects privacy while maintaining data utility for different types of outcomes of interest. The application of the DataSifter to ABIDE data provides a realistic demonstration of how to employ the proposed algorithm on EHR with more than 500 features. We are extending the DataSifter to desensitize longitudinal data and free text. Time permitting, some additional SOCR tools and resources may be demonstrated (http://www.socr.umich.edu).
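The idea of a user-defined privacy level that trades re-identification risk against data utility can be illustrated with a small toy sketch. This is not the DataSifter algorithm, which is not specified in this announcement; it is a simple value-swapping scheme chosen only for illustration, and the names obfuscate_column, fraction_unchanged, and level are hypothetical.

```python
import random


def obfuscate_column(values, level, seed=0):
    """Toy obfuscation (an assumption, not the DataSifter method).

    Walks through the column and, with probability `level`, swaps the
    current entry with a randomly chosen entry. level=0.0 leaves the data
    untouched; higher levels detach more values from their original rows.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    rng = random.Random(seed)  # seeded so runs are reproducible
    out = list(values)
    n = len(out)
    for i in range(n):
        if n > 1 and rng.random() < level:
            j = rng.randrange(n)
            out[i], out[j] = out[j], out[i]
    return out


def fraction_unchanged(original, sifted):
    """Crude risk proxy: share of rows whose value is still the original."""
    same = sum(1 for a, b in zip(original, sifted) if a == b)
    return same / len(original)


# Hypothetical example data: ages of ten records.
ages = [23, 35, 41, 52, 29, 64, 47, 38, 71, 56]

for level in (0.0, 0.5, 1.0):
    sifted = obfuscate_column(ages, level, seed=42)
    # Swapping keeps the multiset of values, so summaries such as the mean
    # are preserved while row-level linkage is weakened.
    print(level, fraction_unchanged(ages, sifted), sorted(sifted) == sorted(ages))
```

In this sketch the column-level distribution is preserved at every level because values are only permuted, while the share of rows that still carry their original value tends to fall as level grows. A realistic method would also need to account for relationships between features, which this toy example ignores.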
Demos
- General SOCR Webapps
- SOCR BrainViewer
- Hands-on interactive visualization of extremely high-dimensional data (learning module and webapp)
References
- SOCR is sponsored in part by NIH Grants P30 DK089503, P20 NR015331, U54 EB020406, P50 NS091856, P30 AG053760, as well as NSF Grants 1734853 and 1636840.
- DataSifter: http://DataSifter.org
- US Patent US20190042791
- SOCR Home page: http://www.socr.umich.edu