== [[SOCR_News | SOCR News & Events]]: 2025 Joint Statistical Meeting, Nashville, TN==

The [https://ww2.amstat.org/meetings/jsm/2025/ 2025 Joint Statistical Meeting (JSM)] will take place '''August 2-7, 2025''' in Nashville, TN. The annual event will feature an invited special session entitled ''Statistical Inference and AI Modeling of High-Dimensional Longitudinal Data''.
==Session Logistics==

* '''Title''': [https://ww3.aievolution.com/JSMAnnual2025/Events/viewEv?ev=1049 ''Statistical Inference and AI Modeling of High-Dimensional Longitudinal Data'']
* '''Organizer''': [https://www.socr.umich.edu/people/dinov/ Ivo Dinov (Michigan)]
* '''Chair''': [https://pages.stat.wisc.edu/~cmzhang/ Chunming Zhang, University of Wisconsin-Madison]
* '''Speakers''': [https://sites.google.com/view/changbozhu/home Changbo Zhu, University of Notre Dame], [https://medicine.iu.edu/faculty/44484/zhao-yi Yi Zhao, Indiana University], and [https://sites.google.com/umich.edu/yueyangshen/welcome Yueyang Shen (Michigan)]
* '''Discussant''': [https://www.socr.umich.edu/people/dinov/ Ivo Dinov (Michigan)], Statistics Online Computational Resource (SOCR)
* '''Date/Time''': [https://ww3.aievolution.com/JSMAnnual2025/Events/viewEv?ev=1049 Tuesday, 8/5/2025, 2:00-4:00 PM] [https://time.is/ET US ET]
* '''Venue''': [https://nashvillemusiccitycenter.com/ Music City Center], 201 Rep. John Lewis Way South, Nashville, TN 37203, [https://nashvillemusiccitycenter.com/maps-parking map and parking].
* '''Registration''': [https://ww2.amstat.org/meetings/jsm/2025/registration.cfm 2025 JSM Registration] (opens May 1, 2025)
* '''Conference''': [https://ww2.amstat.org/meetings/jsm/2025/ 2025 Joint Statistical Meeting (JSM)]
* '''Format''': podium presentations (lectures)
== Sponsors ==

* Section on Statistics in Imaging
* International Association for Statistical Computing
* International Statistical Institute

==Session Description==
In support of the JSM 2025 theme, ''"Statistics, Data Science, and AI Enriching Society,"'' this invited special session will bring together a diverse group of academic researchers, each contributing to the cutting-edge intersections of statistical learning methods, AI applications, and high-dimensional data analysis.

In this session, Changbo Zhu, Yi Zhao, and Yueyang Shen will present cutting-edge methodologies that harness the power of statistical learning, topological modeling, and deep learning to advance the analysis of complex, high-dimensional spatiotemporal data across scientific domains. Speakers will demonstrate topological methods for handling challenges related to neuroimaging data complexity and scale, providing new perspectives on brain connectivity and function. Recent work on longitudinal and covariance regression in high-dimensional data will expose temporal structures involving a large number of covariates. The talks will also cover deep learning invariance and equivariance, which characterize the robustness and generalizability of AI models in real-world applications. All talks will emphasize challenges and opportunities in using statistical and AI techniques to extract meaningful, unbiased insights from complex data. The session aligns with the overarching conference theme and will highlight applications in neuroscience, longitudinal studies, AI forecasting, and trustworthy decision-making.

=== [https://medicine.iu.edu/faculty/44484/zhao-yi Yi Zhao] (Indiana University)===

Title: ''"Longitudinal regression of covariance matrix outcomes"'', addressing the challenges and methodologies for analyzing data with both temporal and complex covariate structures.
: In this study, a longitudinal regression model for covariance matrix outcomes is introduced. The proposal considers a multilevel generalized linear model for regressing covariance matrices on (time-varying) predictors. This model simultaneously identifies covariate-associated components from covariance matrices, estimates regression coefficients, and captures the within-subject variation in the covariance matrices. Optimal estimators are proposed for both low-dimensional and high-dimensional cases by maximizing the (approximated) hierarchical-likelihood function. These estimators are proved to be asymptotically consistent, where the proposed covariance matrix estimator is the most efficient under the low-dimensional case and achieves the uniformly minimum quadratic loss among all linear combinations of the identity matrix and the sample covariance matrix under the high-dimensional case. Through extensive simulation studies, the proposed approach achieves good performance in identifying the covariate-related components and estimating the model parameters. Applied to a longitudinal resting-state functional magnetic resonance imaging data set from the Alzheimer's Disease (AD) Neuroimaging Initiative, the proposed approach identifies brain networks that demonstrate the difference between males and females at different disease stages. The findings are in line with existing knowledge of AD, and the method improves the statistical power over the analysis of cross-sectional data.
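As a rough numerical illustration of regressing covariance-matrix outcomes on predictors, the sketch below uses a simple log-Euclidean ordinary-least-squares shortcut. This is an assumed simplification for intuition only, not the multilevel hierarchical-likelihood estimator described in the talk, and the function names are hypothetical.

```python
import numpy as np

def log_euclidean_regression(covs, X):
    """Fit entrywise OLS of matrix-logarithm-transformed covariance
    outcomes on a design matrix X (log-Euclidean simplification)."""
    n, p, _ = covs.shape
    logs = np.empty((n, p * p))
    for i, S in enumerate(covs):
        w, V = np.linalg.eigh(S)                      # SPD eigendecomposition
        logs[i] = (V @ np.diag(np.log(w)) @ V.T).ravel()
    beta, *_ = np.linalg.lstsq(X, logs, rcond=None)   # shape (q, p*p)
    return beta.reshape(X.shape[1], p, p)

def predict_cov(beta, x):
    """Predict an SPD covariance for covariates x via the matrix exponential."""
    L = np.tensordot(x, beta, axes=1)                 # predicted log-covariance
    w, V = np.linalg.eigh((L + L.T) / 2.0)            # symmetrize before eigh
    return V @ np.diag(np.exp(w)) @ V.T
```

The matrix logarithm turns the positive-definiteness constraint into an unconstrained linear problem, so predictions mapped back through the matrix exponential are always valid covariance matrices.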


=== [https://sites.google.com/umich.edu/yueyangshen/welcome Yueyang Shen] (University of Michigan)===

Title: ''Probabilistic Symmetry, Variable Exchangeability, and Deep Network Learning Invariance and Equivariance''
: This talk will first describe the mathematical-statistics framework for representing, modeling, and utilizing invariance and equivariance properties of deep neural networks. By drawing direct parallels between characterizations of invariance and equivariance principles, probabilistic symmetry, and statistical inference, we explore the foundational properties underpinning reliability in deep learning models. We examine the group-theoretic invariance in a number of deep neural networks, including multilayer perceptrons, convolutional networks, transformers, variational autoencoders, and steerable neural networks. Understanding the theoretical foundation underpinning deep neural network invariance is critical for reliable estimation of prior-predictive distributions, accurate calculations of posterior inference, and consistent AI prediction, classification, and forecasting. Two relevant data studies will be presented: one on a theoretical physics dataset, the other on an fMRI music dataset. Some biomedical and imaging applications are discussed at the end.
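The invariance and equivariance notions above can be checked numerically. The toy sketch below (our illustration, not the speaker's code) verifies translation equivariance of a circular convolution and permutation invariance of a sum-pooled "deep sets" readout:

```python
import numpy as np

def circ_conv(x, k):
    """Circular 1-D cross-correlation (a 'conv layer' with wraparound padding)."""
    n = len(x)
    return np.array([sum(x[(i + j) % n] * kj for j, kj in enumerate(k))
                     for i in range(n)])

x = np.arange(8.0)
k = np.array([1.0, -2.0, 1.0])

# Equivariance: shifting the input shifts the convolution output identically.
equivariant = np.allclose(circ_conv(np.roll(x, 3), k),
                          np.roll(circ_conv(x, k), 3))

# Invariance: a sum-pooled elementwise readout ignores input permutations.
rng = np.random.default_rng(0)
z = rng.normal(size=10)
invariant = np.isclose(np.tanh(z).sum(), np.tanh(rng.permutation(z)).sum())
```

Convolutions commute with the (cyclic) translation group, while pooling collapses the permutation group orbit to a single value; both checks return True.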


=== [https://sites.google.com/view/changbozhu/home Changbo Zhu, University of Notre Dame] and Jane-Ling Wang, University of California-Davis===

Title: ''Testing independence for sparse longitudinal data''
: With the advance of science and technology, more and more data are collected in the form of functions. A fundamental question for a pair of random functions is to test whether they are independent. This problem becomes quite challenging when the random trajectories are sampled irregularly and sparsely for each subject. In other words, each random function is only sampled at a few time-points, and these time-points vary with subjects. Furthermore, the observed data may contain noise. To the best of our knowledge, there exists no consistent test in the literature to test the independence of sparsely observed functional data. We show in this work that testing pointwise independence simultaneously is feasible. The test statistics are constructed by integrating pointwise distance covariances (Székely et al., 2007) and are shown to converge, at a certain rate, to their corresponding population counterparts, which characterize the simultaneous pointwise independence of two random functions. The performance of the proposed methods is further verified by Monte Carlo simulations and analysis of real data.
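The pointwise distance covariance of Székely et al. (2007) that the test statistic integrates can be computed directly from pairwise distances. A minimal sketch (illustrative, not the authors' implementation):

```python
import numpy as np

def dcov2(x, y):
    """Squared sample distance covariance (V-statistic form) for
    univariate samples x and y of equal length."""
    def centered(v):
        D = np.abs(v[:, None] - v[None, :])    # pairwise distance matrix
        # Double-center: subtract row/column means, add back the grand mean.
        return D - D.mean(axis=0) - D.mean(axis=1)[:, None] + D.mean()
    A = centered(np.asarray(x, dtype=float))
    B = centered(np.asarray(y, dtype=float))
    return (A * B).mean()
```

The statistic is zero in the population exactly when the two variables are independent, which is what makes integrating it over time-point pairs a natural route to a consistent test.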

== Other relevant sessions/talks==

=== [https://ww3.aievolution.com/JSMAnnual2025/Events/viewEv?ev=1329 Innovations in Statistical, Machine Learning, and Deep Learning Methods for Complex Data]===

'''Chair''': Zhengjun Zhang (Chinese Academy of Sciences); '''Organizer''': Chunming Zhang (University of Wisconsin-Madison)

'''Date''': Monday, Aug 4: 10:30 AM - 12:20 PM

'''Sponsors''': Section on Nonparametric Statistics, International Statistical Institute, Section on Statistical Learning and Data Science

: ''Speaker'': Ivo Dinov, Statistics Online Computational Resource (Michigan)
: ''Title'': Complex-time Representation of Longitudinal Processes and Topological Kime-Surface Analysis
:: Complex-time (kime) extends the traditional representation of temporal processes into the complex plane and captures the dynamics of both classical longitudinal time and repeated-sampling process variability. Novel approaches for analyzing longitudinal data can be developed that build on the 2D parametric manifold representations of time-varying processes repeatedly observed under controlled conditions. Longitudinal processes that are typically modeled using time series are transformed into multidimensional surfaces called kime-surfaces, which jointly encode the internal dynamics of the processes as well as sampling variability. There are alternative strategies to transform classical time-courses to kime-surfaces. The spacekime framework facilitates the application of advanced topological methods, such as persistent homology, to these kime-surfaces. Topological kime-surface analysis involves studying the topological features of kime-surfaces, such as connected components, loops, and voids, which remain invariant under continuous deformations. These topological invariants can be used to classify different types of time-varying processes, detect anomalies, and uncover hidden patterns that are not apparent in traditional time-series analysis.

:: New AI models can be developed to predict, classify, tessellate, and forecast the behavior of high-dimensional longitudinal data, such as functional magnetic resonance imaging (fMRI), by leveraging complex-time representation of time-varying processes and topological analysis. Kime-surfaces represent mathematically-rich and computationally-tractable data objects that can be interrogated via statistical-learning and artificial intelligence techniques. Spacekime analytics has broad applicability, ranging from personalized medicine to environmental monitoring and statistical obfuscation of sensitive information.
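As a toy illustration of the kime idea (an assumed discrete construction for intuition only, not the spacekime pipeline), repeated noisy observations of one longitudinal process can be stacked into a 2-D array indexed by repeat phase and time; pooling across the phase axis then suppresses sampling variability:

```python
import numpy as np

# 32 noisy repeats of the same time-course, e.g. repeated fMRI runs.
rng = np.random.default_rng(42)
t = np.linspace(0.0, 2.0 * np.pi, 100)
repeats = np.stack([np.sin(t) + 0.2 * rng.normal(size=t.size)
                    for _ in range(32)])

# Order repeats by a crude phase proxy (here, their initial value) to form
# the second surface coordinate: a discretized "kime-surface".
kime_surface = repeats[np.argsort(repeats[:, 0])]      # shape: (32, 100)

# Averaging over the phase axis pools the repeated-sampling variability.
denoised = kime_surface.mean(axis=0)
```

The phase-ordering rule here is purely hypothetical; the point is only that treating repeats as a second coordinate yields a surface whose structure (and whose topology, per the abstract) can be analyzed, rather than a bundle of 1-D series.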


: ''Speaker'': Jian Zhang, University of Kent
: ''Title'': Dynamic Causal Modelling using Chen-Fliess Expansion
:: Dynamic causal modelling (DCM) provides a powerful framework for studying the dynamics of large neural populations by using a neural mass model, a set of differential equations. Although DCM has been increasingly developed into a useful clinical tool in the fields of computational psychiatry and neurology, inferring the hidden neuronal states in the model from neurophysiological data is still challenging. Many existing approaches, based on a bilinear approximation to the neural mass model, can mis-specify the model and thus compromise their accuracy. In this talk, we will introduce the Chen-Fliess expansion for the neural mass model. The Chen-Fliess expansion is a type of Taylor series that converts the problem of estimating differential equations into a problem of estimating an ill-posed nonlinear regression. We develop a maximum likelihood estimation based on the Chen-Fliess approximation. Both simulations and real data analysis are conducted to evaluate the proposed approach.


: ''Speaker'': Zhigang Yao, National University of Singapore
: ''Title'': Manifold Fitting: An Invitation to Data Science
:: Manifold fitting, which offers substantial potential for efficient and accurate modeling, poses a critical challenge in non-linear data analysis. This study presents a novel approach that employs neural networks to fit the latent manifold. Leveraging the generative adversarial framework, this method learns smooth mappings between low-dimensional latent space and high-dimensional ambient space, echoing the Riemannian exponential and logarithmic maps. The well-trained neural networks provide estimations for the latent manifold, facilitate data projection onto the manifold, and even generate data points that reside directly within the manifold. Through an extensive series of simulation studies and real data experiments, we demonstrate the effectiveness and accuracy of our approach in capturing the inherent structure of the underlying manifold within the ambient space data. Notably, our method overcomes the computational efficiency limitations of previous approaches and offers control over the dimensionality and smoothness of the resulting manifold. This advancement holds significant potential in the fields of statistics and computer science. The seamless integration of powerful neural network architectures with generative adversarial techniques unlocks new possibilities for manifold fitting, thereby enhancing data analysis. The implications of our findings span diverse applications, from dimensionality reduction and data visualization to generating authentic data. Collectively, our research paves the way for future advancements in non-linear data analysis and offers a beacon for subsequent scholarly pursuits.

: ''Speaker'': Rebecca Willett, University of Chicago
: ''Title'': Stabilizing black-box model selection with the inflated argmax
:: Model selection is the process of choosing from a class of candidate models given data. For instance, methods such as the LASSO and sparse identification of nonlinear dynamics (SINDy) formulate model selection as finding a sparse solution to a linear system of equations determined by training data. However, absent strong assumptions, such methods are highly unstable: if a single data point is removed from the training set, a different model may be selected. This paper presents a new approach to stabilizing model selection that leverages a combination of bagging and an "inflated" argmax operation. Our method selects a small collection of models that all fit the data, and it is stable in that, with high probability, the removal of any training point will result in a collection of selected models that overlaps with the original collection. In addition to developing theoretical guarantees, we illustrate this method in (a) a simulation in which strongly correlated covariates make standard LASSO model selection highly unstable and (b) a Lotka–Volterra model selection problem focused on identifying how competition in an ecosystem influences species' abundances. In both settings, the proposed method yields stable and compact collections of selected models, outperforming a variety of benchmarks.
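A hedged sketch of the stability idea: refit the LASSO on bootstrap resamples and keep only features selected in a large fraction of the fits. This plain bagging-with-threshold rule is an assumed simplification of the paper's bagging plus inflated-argmax construction, and `lasso_cd`/`bagged_support` are hypothetical helper names.

```python
import numpy as np

def lasso_cd(X, y, lam, iters=100):
    """Tiny coordinate-descent LASSO (illustrative, no convergence checks)."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]     # partial residual
            rho = X[:, j] @ r / n
            z = (X[:, j] ** 2).sum() / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / z  # soft-threshold
    return beta

def bagged_support(X, y, lam=0.1, B=50, thresh=0.6, seed=0):
    """Keep features whose LASSO selection frequency over B bootstrap
    resamples reaches `thresh` -- a crude stabilized model selection."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(B):
        idx = rng.integers(0, n, n)                  # bootstrap resample
        counts += lasso_cd(X[idx], y[idx], lam) != 0
    return np.flatnonzero(counts / B >= thresh)
```

Single-fit LASSO can swap selected features when one point is removed; aggregating supports over resamples makes the returned collection far less sensitive to any individual observation, which is the instability the paper targets.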

=== [https://ww3.aievolution.com/JSMAnnual2025/Events/viewEv?ev=4612 A Bayesian Multiplex Graph Classifier of Functional Brain Connectivity Across Cognitive Tasks]===

: Tuesday, Aug 5: 10:30 AM - 12:20 PM

: This work seeks to investigate the impact of aging on functional connectivity across different cognitive control scenarios, particularly emphasizing the identification of brain regions significantly associated with early aging. By conceptualizing functional connectivity within each cognitive control scenario as a graph, with brain regions as nodes, the statistical challenge revolves around devising a regression framework to predict a binary scalar outcome (aging or normal) using multiple graph predictors. To address this challenge, we propose the Bayesian Multiplex Graph Classifier (BMGC). Accounting for multiplex graph topology, our method models edge coefficients at each graph layer using bilinear interactions between the latent effects associated with the two nodes connected by the edge. This approach also employs a variable selection framework on node-specific latent effects from all graph layers to identify influential nodes linked to observed outcomes. Crucially, the proposed framework is computationally efficient and quantifies the uncertainty in node identification, coefficient estimation, and binary outcome prediction.

: ''Presenter'': Jose Rodriguez-Acosta (Texas A&M University); ''co-authors'': Sharmistha Guha (Texas A&M University) and Ivo Dinov, Statistics Online Computational Resource (Michigan)
<hr>
{{translate|pageName=https://wiki.socr.umich.edu/index.php?title=SOCR_News_ISS_JSM_2025}}
Latest revision as of 18:13, 2 April 2025