Validating clustering for gene expression data (Bioinformatics)
Motivation: In recent years, there have been various efforts to overcome the limitations of standard clustering approaches for the analysis of gene expression data by grouping genes and samples simultaneously.
However, no guidelines concerning the choice of biclustering method are currently available.
In the literature, there are several comparative studies on traditional clustering techniques (Yeung et al., 2001; Azuaje, 2002; Datta and Datta, 2003); however, for biclustering no such extensive empirical comparisons exist, as pointed out by Madeira and Oliveira (2004).
Results: In this paper, we consider six clustering algorithms (of various flavors!) and evaluate their performances on a well-known publicly available microarray data set on sporulation of budding yeast and on two simulated data sets. Among other things, we formulate three reasonable validation strategies that can be used with any clustering algorithm when temporal observations or replications are present. We evaluate each of these six clustering methods with these validation measures.