Erin (Jiarui) Zhang
Title: Robust Bayesian Dimension Reduction with Applications to Outlier Detection
Date: August 11th, 2025
Time: 10:00am
Location: LIB 2020 & Zoom
Supervised by: Liangliang Wang & Jiguo Cao
Abstract:
The pervasive growth of high-dimensional and complex data across diverse scientific and industrial domains presents significant challenges to traditional statistical methodologies. Such complexity impedes data visualization, increases computational burden, and undermines the reliability of pattern discovery, especially in noisy or sparse settings. In light of these difficulties, advanced dimension reduction techniques have emerged as critical tools for extracting meaningful insights from these challenging datasets. This thesis introduces novel robust Bayesian dimension reduction frameworks and applies them to unsupervised learning tasks such as outlier detection and cluster analysis. We first introduce a robust Bayesian functional principal component analysis framework that models functional data using skew elliptical distributions, improving the accuracy and robustness of covariance and principal component estimation in the presence of outliers. Next, we propose a generalized Bayesian multidimensional scaling (GBMDS) framework that accommodates non-Gaussian errors and various dissimilarity metrics, leading to more robust low-dimensional representations. Finally, we present a unified framework that combines GBMDS with model-based clustering, enabling simultaneous dimension reduction and cluster analysis. This framework is further enhanced by incorporating advanced neural network embedding models to construct semantically and contextually rich dissimilarities. Throughout this work, we develop three annealed Sequential Monte Carlo (ASMC) algorithms, one tailored to each Bayesian dimension reduction framework, offering efficient computation and robust model comparison for complex Bayesian inference. Together, these contributions provide reliable tools for dimension reduction, with applications to cluster analysis and outlier detection in complex, high-dimensional data.
Keywords: dimension reduction; robust estimation; annealed Sequential Monte Carlo; functional data analysis; data visualization; embedding algorithms