SOCR News ISI WSC IPS35 2019
- 1 SOCR News & Events: International Statistics Institute (ISI)
- 2 2019 World Statistics Congress (WSC)
- 3 Materials
- 3.1 Abstracts
- 3.1.1 Predictive Analytics of Big Neuroscience Data
- 3.1.2 A Generative Deep Machine Modeling Framework for Hypothesis Testing and Comparing Neuroimaging Data
- 3.1.3 Clustering two-dimensional data with binary responses via fused lasso penalty
- 3.1.4 The Default Mode Network After 20 Years: Statistical Perspectives
- 3.1.5 Implicit Bias in Big Data Analytics
- 3.2 Short Bios
SOCR News & Events: International Statistics Institute (ISI)
2019 World Statistics Congress (WSC)
Invited Paper Session (IPS35): Imaging Statistics and Predictive Data Analytics
- WSC Program
- Date: TBD (18 – 23 August 2019)
- Format: 20-min talks + 5-min Q/A
- Place: Kuala Lumpur Convention Centre (Kuala Lumpur, Malaysia)
- Proceedings: The 2019 WSC proceedings will include titles, abstracts, and papers (6 pages)
- August 1 – November 1, 2018: All presenters must submit abstracts
- December 1, 2018 – May 31, 2019: All presenters must register and pay registration fees
- January 15, 2019 – April 15, 2019: All presenters must submit papers. More information on the abstract submission process
- April 15, 2019: Presenters must submit their presentations (PPTX or PDF)
- August 18 – 23, 2019: All presenters must attend and present their papers at the congress.
Petabytes of imaging, clinical, biospecimen, genetic, and phenotypic biomedical data are acquired annually. Tens of thousands of new methods and computational algorithms are developed and reported in the literature, and thousands of software tools and data analytic services are introduced each year. This Imaging Statistics and Predictive Data Analytics session will include presentations by leading experts in biomedical imaging, computational neuroscience, and statistical learning focused on streamlining big biomedical data methodologies as well as techniques for management, aggregation, manipulation, computational modeling, and statistical inference. The session will blend innovative model-based and model-free techniques for representation, analysis, and interpretation of large, heterogeneous, multi-source, incomplete, and incongruent imaging and phenotypic data elements.
This session will be of interest to many theoretical statisticians and applied biomedical researchers for the following reasons:
- The digital revolution demands substantial quantitative skills, data-literacy, and analytical competence: Health science doctoral programs need to be revised and expanded to build basic-science (STEM) expertise, emphasize team-science, rely on holistic understanding of biomedical systems and health problems, and amplify dexterous abilities to handle, interrogate and interpret complex multisource information.
- The amount of newly acquired biomedical imaging data is increasing exponentially. This demands innovative statistical and computational strategies to aggregate, process, and interpret the deluge of imaging, clinical and phenotypic information.
- Trans-disciplinary training and inter-professional education are critical for ethical and collaborative research involving complex biomedical imaging and health conditions.
- Exploratory and predictive Big Data analytics is pivotally important and complementary to traditional hypothesis-driven confirmatory analyses.
Predictive Analytics of Big Neuroscience Data
This talk will present some of the Big Neuroscience Data research and education challenges and opportunities. Specifically, we will identify the core characteristics of complex neuroscience data, discuss strategies for data harmonization and aggregation, and show case-studies using large normal and pathological cohorts. Examples of methods that will be demonstrated include DataSifter (enabling secure sharing of data), compressive big data analytics (facilitating inference on multi-source heterogeneous datasets), and model-free prediction (forecasting of clinical features or derived computed phenotypes). Simulated data as well as clinical data (UK Biobank, Alzheimer’s Disease Neuroimaging Initiative, and amyotrophic lateral sclerosis case-studies) will be used for testing and validation of the techniques. In support of open-science, result reproducibility, and methodological improvements, all datasets, statistical methods, computational algorithms, and software tools are freely available online.
A Generative Deep Machine Modeling Framework for Hypothesis Testing and Comparing Neuroimaging Data
Comparing brain networks and multimodal brain imaging data between healthy and diseased cohorts is challenging because brain imaging data are high-dimensional. Making statistically significant claims while avoiding false positives and false negatives requires prohibitively large sample sizes. This is the main disadvantage of the current framework for hypothesis testing. t-tests, Mann-Whitney tests, and similar hypothesis testing methods that use point statistics make model-based assumptions that the data cluster around a mean value. Significance is ascertained by showing a sufficiently improbable difference in means or variances between cohorts. Even when statistically significant differences can be established, there is a lack of usable hypotheses that can grant insight into the nature of these differences. We propose a novel approach to comparing high-dimensional brain imaging datasets such as brain networks. We suggest that deep learning algorithms could be applied to create generative models of the underlying dataset, each of which is a type of hypothesis on the data from each cohort. Using different deep learning architectures, training algorithms, or different instances of trained networks, we can generate multiple hypotheses / generative models of the underlying datasets. A family of hypotheses / generative models of a given cohort dataset specifies a bound on the possible hypotheses for the data. Collecting more brain datasets essentially prunes the hypothesis space and shrinks the boundaries of plausible models. We further propose that the families of generative models from different cohorts can be compared via measures of statistical dissimilarity using statistical distance metrics such as the Fisher information metric. Treating generative models as hypotheses on datasets permits further interaction, which allows researchers to learn the meaning of each hypothesis, thus adding value and insight to the analysis.
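The idea of comparing cohorts through fitted generative models, rather than through point statistics, can be illustrated with a deliberately simple sketch. Everything below is an assumption made for illustration: a multivariate Gaussian stands in for the deep generative models described in the abstract, a symmetrized Kullback-Leibler divergence stands in for the Fisher information metric, and the two cohorts are simulated.

```python
import numpy as np

def fit_gaussian(samples):
    """Fit a multivariate Gaussian (mean, covariance) to the rows of `samples`.

    This is a stand-in for a deep generative model of one cohort.
    """
    mean = samples.mean(axis=0)
    # A small ridge keeps the covariance invertible for small cohorts.
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(samples.shape[1])
    return mean, cov

def kl_gaussian(m0, c0, m1, c1):
    """Closed-form KL divergence KL(N(m0, c0) || N(m1, c1))."""
    k = m0.shape[0]
    c1_inv = np.linalg.inv(c1)
    diff = m1 - m0
    _, logdet0 = np.linalg.slogdet(c0)
    _, logdet1 = np.linalg.slogdet(c1)
    return 0.5 * (np.trace(c1_inv @ c0) + diff @ c1_inv @ diff - k + logdet1 - logdet0)

def symmetric_kl(model_a, model_b):
    """Symmetrized KL divergence between two fitted (mean, cov) models."""
    return kl_gaussian(*model_a, *model_b) + kl_gaussian(*model_b, *model_a)

# Simulated cohorts: 200 subjects each, 5 hypothetical imaging features.
rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, size=(200, 5))
patients = rng.normal(0.5, 1.0, size=(200, 5))

healthy_model = fit_gaussian(healthy)
patient_model = fit_gaussian(patients)

between_cohorts = symmetric_kl(healthy_model, patient_model)
print(f"dissimilarity between cohort models: {between_cohorts:.3f}")
```

A family of models per cohort, as the abstract proposes, could be approximated in this toy setting by refitting on bootstrap resamples and comparing the resulting sets of distances.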
Clustering two-dimensional data with binary responses via fused lasso penalty
LASSO is a widely used regression analysis method that yields sparse estimators. As sparse estimators lead to an automated model selection procedure, many variations of the lasso have been developed to address various problem settings. The fused lasso, one of these variations, uses the sparsity property to obtain locally clustered estimators. In this work, we investigate the fused lasso in a two-dimensional setting with binary responses. The proposed method yields an estimator that takes the form of clusters of homogeneous two-dimensional blocks by capturing geographical features of the predictors. This method is applicable to flexible data structures, including scenarios with missing observations.
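As a rough illustration of what a two-dimensional fused lasso with binary responses optimizes, the sketch below evaluates a generic objective: a logistic negative log-likelihood over the observed grid cells plus an L1 penalty on differences between neighboring cells. This is a textbook-style formulation assumed for illustration, not necessarily the one used in the talk. The function name, the one-parameter-per-cell setup, and the `observed` mask for missing cells are all hypothetical.

```python
import numpy as np

def fused_logistic_objective(theta, y, observed, lam):
    """Penalized negative log-likelihood on a 2D grid.

    theta    : (rows, cols) array of log-odds, one per grid cell
    y        : (rows, cols) array of binary responses (0 or 1)
    observed : (rows, cols) boolean mask; False marks a missing observation
    lam      : weight of the fusion penalty
    """
    # log(1 + exp(theta)), computed in a numerically stable way.
    log_partition = np.logaddexp(0.0, theta)
    # Only observed cells contribute to the likelihood term.
    nll = np.sum((log_partition - y * theta)[observed])
    # The fusion penalty couples vertical and horizontal neighbors, including
    # missing cells, so estimates are pulled toward piecewise-constant blocks.
    penalty = np.abs(np.diff(theta, axis=0)).sum() + np.abs(np.diff(theta, axis=1)).sum()
    return nll + lam * penalty

# Toy 4x4 grid: the left half tends toward 0, the right half toward 1.
y = np.array([[0, 0, 1, 1]] * 4, dtype=float)
observed = np.ones_like(y, dtype=bool)
observed[1, 2] = False  # one missing observation

blocky = np.where(y == 1, 2.0, -2.0)  # two homogeneous blocks
flat = np.zeros_like(y)               # a single block

print(fused_logistic_objective(blocky, y, observed, lam=0.1))
print(fused_logistic_objective(flat, y, observed, lam=0.1))
```

With a small penalty weight the two-block estimate scores better than the flat one. Raising `lam` makes the jump between blocks more costly, which is the mechanism that produces clustered estimates.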
The Default Mode Network After 20 Years: Statistical Perspectives
In the neuroimaging literature, the default mode network (DMN) refers to a group of areas in the human cerebral cortex that consistently show decreased activity during attention-demanding tasks and increased activity during the resting state with eyes closed or with simple visual fixation. The discovery of the DMN has boosted research interest in self-referential or intrinsic activity in the brain in both patients and healthy controls. Since 1997, related studies have mainly relied on group-averaged responses or seed-based correlations to identify increased/decreased activity in the DMN areas. In this study, we conducted a resting-state experiment with eyes-closed and eyes-open conditions and specifically analyzed the activity that is reproducible across subjects in the DMN areas (Areas 8, 9, 10, 20, 23, 24, 25, 31, 32, 39, 40 and the entorhinal cortex). The reproducible activity was estimated using standardized intraclass correlations (ICCs); the statistical thresholding of the ICC maps accounted for the non-stationarity of ongoing BOLD signals during the resting-state conditions. The DMN areas were parcellated according to the JuBrain cytoarchitectonic atlas. Forty-nine right-handed adults (26 females, mean age: 23.08±3.188 years) participated in the resting-state task, which involved 4 min with eyes closed followed by 4 min with eyes open. The MRI scan was performed using a 3T MAGNETOM Skyra scanner and a standard 20-channel head-neck coil. The echo planar imaging (EPI) scans were acquired with parameters TR/TE = 2000 ms/30 ms, flip angle = 84°, 35 slices, slice thickness = 3.4 mm, FOV = 192 mm, and resolution 3x3x3.74 mm to cover the whole brain including the cerebellum. The results suggested that a variety of brain activity could be found in the DMN areas, including short-term increased/decreased activity after the eyes-closed/open instructions. We suggest being cautious in using the DMN in cognitive and clinical studies.
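To give a sense of how an intraclass correlation can quantify activity that is reproducible across subjects, the sketch below computes a plain one-way random-effects ICC on simulated data. It is an assumption-laden illustration: the abstract's standardized ICC and its non-stationarity-aware thresholding are not reproduced, the function name is hypothetical, and the signal is synthetic. Time points are treated as rows and subjects as columns, with 120 time points matching a 4-minute condition sampled at TR = 2000 ms.

```python
import numpy as np

def icc_oneway(data):
    """One-way random-effects ICC(1,1) for an (n_timepoints, k_subjects) array.

    Rows play the role of targets and columns the role of raters, so a high
    value means the time course is consistent across subjects.
    """
    n, k = data.shape
    grand_mean = data.mean()
    row_means = data.mean(axis=1)
    # Mean square between time points and mean square within time points.
    ms_between = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((data - row_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Simulated region: a signal shared by all subjects plus subject-specific noise.
rng = np.random.default_rng(0)
n_timepoints, n_subjects = 120, 49
shared = np.sin(np.linspace(0.0, 4.0 * np.pi, n_timepoints))
bold = shared[:, None] + 0.5 * rng.normal(size=(n_timepoints, n_subjects))

icc = icc_oneway(bold)
print(f"ICC across subjects: {icc:.2f}")
```

In a region with no shared signal the same estimate would hover near zero, which is why a map of such values needs a statistical threshold before it is interpreted.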
Implicit Bias in Big Data Analytics
Ivo D. Dinov
Ivo D. Dinov is a professor of Health Behavior and Biological Sciences and Computational Medicine and Bioinformatics at the University of Michigan. He directs the Statistics Online Computational Resource and co-directs the Center for Complexity and Self-management of Chronic Disease (CSCD) and the multi-institutional Probability Distributome Project. Dr. Dinov is an Associate Director for Education and Training of the Michigan Institute for Data Science (MIDAS). He is a member of the American Statistical Association (ASA), the International Association for Statistical Education (IASE), the American Medical Informatics Association (AMIA), as well as an Elected Member of the International Statistical Institute (ISI).
Eric Tatt Wei Ho
Eric Ho Tatt Wei is a senior lecturer at Universiti Teknologi Petronas, as well as a Malaysia Node Coordinator of the International Neuroinformatics Coordinating Facility (INCF). He received his MSc and PhD degrees in Electrical Engineering from Stanford University, USA. During his PhD, he developed an automated robotic system to manipulate fruit flies for live brain imaging. Back in Malaysia, he worked on the development of microfluidic devices for monitoring and manipulating blood cells for immunotherapy in low-resource settings. His current research interests are in developing novel tools at the intersection of deep learning, brain sciences, and networks, and he is applying these techniques to characterise the effect of addiction and aging on the brain as well as to enhance the efficacy of brain interventions to improve cognition.
Yunjin Choi
Yunjin Choi is an assistant professor in the Department of Statistics and Applied Probability at the National University of Singapore. She has a BS in Mathematics and Statistics from Seoul National University, an MA in Statistics from Yale University, and a PhD degree from Stanford University. Her research interests include multivariate data analysis, selective inference, and statistical learning. Dr. Choi is an expert on combining selective inference frameworks with multivariate analysis and statistical learning.
Michelle Liou
Michelle Liou received her PhD in quantitative psychology from the University of Pittsburgh, and has expertise in medical statistics, functional BOLD signal and EEG oscillations with an emphasis on image/signal processing and scientific inference. She and her lab members initiated the concept of ‘reproducibility’ for bridging functional MRI techniques and scientific inference, and won the 2003 New Perspective in fMRI Research Award from fMRIDC at Dartmouth College, USA. She also won the Outstanding Research Award from the National Science Council (Taiwan) in 1999 and 2003. She is currently a senior research fellow at the Institute of Statistical Science, Academia Sinica, and a visiting professor in the Translational Imaging Research Center, Taipei Medical University. In the past ten years, Dr. Liou has been invited to give talks on the importance of ‘reproducibility’ in brain research in China, Korea, Russia, Singapore, Taiwan, and the USA.
S. Ejaz Ahmed
Prof. Ahmed is Dean of the Brock University School of Mathematics and Statistics. Prior to that, he was Head of Mathematics at the University of Windsor and the University of Regina. He is a fellow of the American Statistical Association and an Elected Member of the International Statistical Institute. Dr. Ahmed is an expert in statistical inference, shrinkage estimation, and asymptotic theory. He serves on the editorial boards of many statistical journals and served as a Board Director and as Chairman of the Education Committee of the Statistical Society of Canada. Dr. Ahmed is a member of an Evaluation Group, Discovery Grants and the Grant Selection Committee, Natural Sciences and Engineering Research Council of Canada (NSERC).
- SOCR Home page: http://www.socr.umich.edu