STATISTICS COLLOQUIUM
Virtual talk
Lucy Gao
Assistant Professor
Department of Statistics
University of British Columbia
Valid inference after clustering, with application to single-cell RNA-sequencing data
Abstract
Testing for a difference in means between two groups is fundamental to answering research questions across virtually every scientific area. Standard hypothesis tests (e.g. the t-test) control the type I error rate when the groups to be tested are defined before looking at the data. However, if the groups are instead defined by applying a clustering algorithm to the data, then applying a standard test for a difference in group means to that same data yields an extremely inflated selective type I error rate. This two-step "double-dipping" procedure is common in the analysis of single-cell RNA-sequencing data.
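The inflation described above can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not part of the abstract: the data are one-dimensional draws from a single normal distribution (so there are no true groups), a tiny 2-means routine stands in for the clustering step, and a large-sample z-test stands in for the t-test. All function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

def two_means_1d(x, iters=20):
    """Tiny 1-D k-means with k=2 (illustrative stand-in for a clustering step)."""
    c0, c1 = min(x), max(x)  # start the two centroids at the extremes
    labels = [0] * len(x)
    for _ in range(iters):
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in x]
        g0 = [v for v, l in zip(x, labels) if l == 0]
        g1 = [v for v, l in zip(x, labels) if l == 1]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return labels

def z_test_pvalue(a, b):
    """Two-sided large-sample z-test for a difference in means."""
    if len(a) < 2 or len(b) < 2:
        return 1.0  # too few points to estimate a variance; count as no rejection
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(use_clustering, n=100, reps=500, alpha=0.05, seed=0):
    """Fraction of null datasets on which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]  # one population, no true groups
        if use_clustering:
            labels = two_means_1d(x)             # groups defined from the data
        else:
            labels = [i % 2 for i in range(n)]   # groups fixed before seeing the data
        a = [v for v, l in zip(x, labels) if l == 0]
        b = [v for v, l in zip(x, labels) if l == 1]
        rejections += z_test_pvalue(a, b) < alpha
    return rejections / reps

rate_double_dipping = rejection_rate(use_clustering=True)
rate_predefined = rejection_rate(use_clustering=False)
```

In this sketch the predefined groups should reject at close to the nominal 5% level, while the clustered groups should reject on almost every null dataset, because the clustering step itself pushes the two group means apart.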
In my talk, I will apply ideas from selective inference to enable valid inference after hierarchical clustering. If time permits, I will also introduce count splitting: a flexible framework that enables valid inference after latent variable estimation in count-valued data, for virtually any latent variable estimation technique and inference approach.
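One way to picture count splitting is binomial thinning of a count matrix. The sketch below rests on assumptions that the abstract does not state: the counts are modeled as Poisson, and each count is split by keeping each unit independently with probability `eps`. Under the Poisson assumption, the two resulting parts are independent, so one part could be used to estimate latent variables and the other for inference. The function name `count_split` and the toy matrix are hypothetical.

```python
import random

def count_split(counts, eps=0.5, seed=0):
    """Split each count c into (train, test) with train ~ Binomial(c, eps).

    Illustrative assumption: if c ~ Poisson(lam), then train ~ Poisson(eps * lam)
    and test ~ Poisson((1 - eps) * lam), and the two parts are independent.
    """
    rng = random.Random(seed)
    train, test = [], []
    for row in counts:
        # Draw a Binomial(c, eps) value as a sum of c Bernoulli(eps) trials.
        tr = [sum(rng.random() < eps for _ in range(c)) for c in row]
        train.append(tr)
        test.append([c - t for c, t in zip(row, tr)])
    return train, test

# Toy cells-by-genes count matrix (made-up numbers for illustration only).
counts = [[3, 0, 7], [1, 12, 4]]
train, test = count_split(counts, eps=0.5)
```

By construction the two parts sum back to the original counts entry by entry; the choice of `eps` trades off information between the estimation and inference steps.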
This talk is based on joint work with Jacob Bien (University of Southern California), Daniela Witten and Anna Neufeld (University of Washington), as well as Alexis Battle and Joshua Popp (Johns Hopkins University).
Bio: Lucy is an assistant professor in the Department of Statistics at the University of British Columbia. Prior to UBC, she was an assistant professor at the University of Waterloo.
Wednesday, November 2, 2022
4:00 pm ET, 1-hour duration
Join by meeting number
Meeting number (access code): 2621 630 5656
Meeting password: 3siRGyXmu74
For more information, contact: Tracy Burke at tracy.burke@uconn.edu