Quantitative Issues in Cancer Research Working Group Seminar
December 6, 2023 @ 4:00 pm - 5:00 pm
Rong Ma
Assistant Professor, Department of Biostatistics, Harvard T.H. Chan School of Public Health
A Spectral Approach to Assessing and Combining Multiple Data Embeddings
Abstract: Dimension reduction is an indispensable part of modern data science, and many algorithms have been developed for it. However, each algorithm has its own strengths and weaknesses, making it important to evaluate their relative performance and to leverage and combine their individual strengths. This paper proposes a spectral method for assessing and combining multiple data embeddings of a given dataset produced by diverse algorithms. The proposed method provides a quantitative measure – the visualization eigenscore – of the relative performance of the embeddings in preserving the structure around each data point. It also generates a consensus embedding with improved quality over the individual visualizations in capturing the underlying structure. Our approach is flexible and works as a wrapper around any data embeddings. We analyze multiple real-world datasets to demonstrate the effectiveness of the method. We also provide theoretical justifications based on a general statistical framework, yielding several fundamental principles along with practical guidance. This is joint work with Eric Sun and James Zou.
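The abstract does not spell out the computation, so the following is only a rough sketch of how a per-point spectral score and a consensus distance might be formed, not the authors' implementation. It assumes that each embedding is summarized by its Euclidean distances from a point to all other points, that embeddings are compared by the cosine similarity of those distance profiles, and that the leading eigenvector of the resulting similarity matrix supplies the weights. The function name `eigenscores_and_consensus` and all of these choices are hypothetical.

```python
import numpy as np


def eigenscores_and_consensus(embeddings):
    """Hypothetical sketch: score K embeddings per data point and combine them.

    embeddings: list of K arrays, each of shape (n, d_k), with rows aligned
    so that row i refers to the same data point in every embedding.
    Returns (scores, consensus): scores has shape (n, K), consensus (n, n).
    """
    n = embeddings[0].shape[0]
    K = len(embeddings)

    # Pairwise Euclidean distances within each embedding: shape (K, n, n).
    dists = np.stack([
        np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        for X in embeddings
    ])

    scores = np.zeros((n, K))
    consensus = np.zeros((n, n))
    for i in range(n):
        # Distance profile of point i in each embedding, scaled to unit norm
        # so that embeddings on different scales are comparable.
        rows = dists[:, i, :]
        norms = np.linalg.norm(rows, axis=1, keepdims=True)
        rows = rows / np.maximum(norms, 1e-12)

        # K x K similarity between embeddings around point i (assumed form).
        S = rows @ rows.T

        # eigh returns eigenvalues in ascending order; take the last vector.
        # S has nonnegative entries, so its leading eigenvector can be taken
        # entrywise nonnegative, hence the absolute value.
        _, eigvecs = np.linalg.eigh(S)
        u = np.abs(eigvecs[:, -1])

        scores[i] = u             # one score per embedding for point i
        consensus[i] = u @ rows   # score-weighted combined distance profile
    return scores, consensus
```

Under these assumptions, an embedding whose local distance profile agrees with most of the others receives a larger score at that point. The `consensus` matrix is a row-wise distance-like matrix, not necessarily symmetric, and would still need to be passed to a distance-based embedding method to obtain coordinates.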