SOCR News FSU DataImpact Symposium 2019
SOCR News & Events: Statistics, the Impact of Big Data Conference
Logistics
- Event: Statistics, the Impact of Big Data - 60th Anniversary of the FSU Department of Statistics (1959-2019)
- Website: https://ani.stat.fsu.edu/60th/
- Program
- Date: April 12-13, 2019
- Place: Augustus B. Turnbull Conference Center, located at 555 W Pensacola St, Tallahassee, FL 32301
Presenter
Title
DataSifter: Sharing of Sensitive Data via Statistical Obfuscation (PDF Slides)
Abstract
There are no practical, reliable, and effective mechanisms to share sensitive information to inspire novel methodological developments without compromising intellectual property, confidentiality, or personal data. In many fields, such as health, finance, intelligence, and socioeconomics, high-dimensional data is prevalent, and there is a profound need to develop advanced data interrogation techniques that extract useful and actionable information while balancing the utility of the data against the risk of exposing private, personal, or secure organizational information. Excessive scrambling or encoding of the information makes it less useful for modelling or analytical processing. Insufficient preprocessing may expose sensitive information and introduce a substantial risk of re-identification of individuals, or disclosure of trade secrets, by various stratification techniques. To address this problem, we developed a novel statistical method (DataSifter) that provides on-the-fly de-identification of sensitive structured and unstructured high-dimensional data, such as clinical data from electronic health records (EHR). DataSifter technology enables administrative control over the balance between the risk of data re-identification and the preservation of the data information content. With careful setup of user-defined privacy levels, our simulation experiments and real biomedical case studies suggest that DataSifter protects privacy while maintaining data utility for different types of outcomes of interest. The application of DataSifter to ABIDE data provides a realistic demonstration of how to employ the proposed algorithm on EHR with more than 500 features (DataSifter.org).
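The trade-off between re-identification risk and information content described in the abstract can be illustrated with a toy sketch. This is not the DataSifter algorithm, which the abstract does not specify; it is a simple value-swapping scheme chosen only for illustration. The names `obfuscate`, `level`, and `fraction_unchanged` are hypothetical, and the share of records left intact is only a crude stand-in for a real risk measure.

```python
import random


def obfuscate(rows, level, seed=0):
    """Toy obfuscation (NOT DataSifter): within each column, shuffle the
    values of a randomly chosen fraction `level` of the records.

    level=0 returns an unchanged copy; level=1 shuffles every column
    independently. Each column keeps exactly the same set of values, so
    per-feature distributions are preserved while record linkage weakens.
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    rng = random.Random(seed)
    out = [list(r) for r in rows]
    n = len(out)
    n_cols = len(out[0]) if out else 0
    k = round(level * n)  # number of records touched in each column
    for j in range(n_cols):
        picked = rng.sample(range(n), k)
        values = [out[i][j] for i in picked]
        rng.shuffle(values)
        for i, v in zip(picked, values):
            out[i][j] = v
    return out


def fraction_unchanged(original, sifted):
    """Crude risk proxy (an assumption): share of records left fully intact."""
    same = sum(1 for a, b in zip(original, sifted) if list(a) == list(b))
    return same / len(original)


# Hypothetical records with three numeric features each.
records = [[i, 100 - i, i * i] for i in range(50)]

for level in (0.0, 0.5, 1.0):
    sifted = obfuscate(records, level)
    print(level, fraction_unchanged(records, sifted))
```

Raising `level` trades record-level fidelity for lower linkage risk, which mirrors, in a much simplified form, the administrative control over privacy levels that the abstract describes.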
- SOCR Home page: http://www.socr.umich.edu