Beyond the Dirichlet process: heterogeneous mixtures and clustering in bioinformatics
Supervisor: Peter Green, FRS
Theme: Bayesian Modelling & Analysis
Statistical methods for clustering data are often built on mixture
models. Of the several kinds available, a convenient choice in a Bayesian
setting is to use Dirichlet processes, which lead to methodology
that remains tractable even for very large numbers of items. Recently, I
have devised a more flexible class of models, based on a 'stick-breaking-and-colouring' construction, which essentially allows
clusters to have different statistical properties. The project involves
further exploration of the theoretical properties of this approach,
and experimentation in applying it to gene expression datasets
and other data from high-throughput biological assay techniques.
It thus connects with both the Statistical Bioinformatics and the Monte Carlo computation research themes.
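For readers unfamiliar with the starting point, the following is a minimal sketch of the standard truncated stick-breaking construction of Dirichlet process weights. It is not the 'stick-breaking-and-colouring' construction of the project, which is not specified here. The function names, the concentration value alpha = 2.0 and the truncation at 50 sticks are illustrative assumptions.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each step breaks off a Beta(1, alpha) fraction of the stick that
    remains; the broken-off piece is the weight of the next cluster.
    """
    weights = []
    remaining = 1.0  # length of stick not yet allocated
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights


def sample_cluster_labels(weights, num_items, rng):
    """Assign items to clusters with probability proportional to the weights.

    Truncation leaves a little stick unallocated, so the weights sum to
    slightly less than one; random.choices renormalises them.
    """
    return rng.choices(range(len(weights)), weights=weights, k=num_items)


# Illustrative values only (assumptions, not taken from the project).
rng = random.Random(0)
weights = stick_breaking_weights(alpha=2.0, num_sticks=50, rng=rng)
labels = sample_cluster_labels(weights, num_items=200, rng=rng)
```

In this standard construction every cluster is generated by the same mechanism; the project description indicates that the more flexible class relaxes this so that clusters can have different statistical properties.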
Publications
Bayesian Model Based Clustering Procedures (2007)
Lau, J. W. and Green, P. J.
Journal of Computational and Graphical Statistics, vol. 16, pp. 525-558
