Cola: A General Framework For Consensus Partitioning


Author(s): Zuguang Gu

Affiliation(s): German Cancer Research Center



Consensus partitioning is the most widely applied approach for revealing subgroups: it summarizes a consensus classification from a list of individual classifications generated by repeatedly running clustering on random subsets of the data. We implemented an R/Bioconductor package, cola, that provides a general framework for consensus partitioning. With cola, various parameters and methods can be user-defined and easily integrated into different steps of an analysis, e.g., feature selection, sample classification, or signature definition. Cola provides a new method named ATC (ability to correlate to other rows) to extract features and recommends spherical k-means clustering for subgroup classification. Additionally, cola implements a hierarchical procedure under the consensus partitioning framework, which efficiently identifies subgroups with both large and small differences at the same time and can therefore detect more subtle subgroups. The cola package is available at https://bioconductor.org/packages/cola/. The publications are available at https://doi.org/10.1093/nar/gkaa1146 and https://doi.org/10.1093/bib/bbac048.
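
To make the general idea of consensus partitioning concrete, the following is a minimal Python sketch. It is not the cola implementation and does not use the cola API. All names (`kmeans`, `consensus_matrix`, `consensus_classes`, `make_demo_data`) are hypothetical, plain k-means with a farthest-point start stands in for spherical k-means, random subsets are assumed to be drawn from the features, and the final classification is taken by a simple 0.5 threshold on the consensus matrix, which is a simplification chosen for brevity.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means (a stand-in, not spherical k-means)."""
    # Farthest-point start: one random centre, then add the farthest points.
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(samples, k, n_runs=50, frac=0.8, seed=0):
    """Fraction of runs in which each pair of samples is clustered together."""
    rng = random.Random(seed)
    n, n_feat = len(samples), len(samples[0])
    together = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        # Assumption: each run uses a random subset of the features.
        feats = rng.sample(range(n_feat), max(1, int(frac * n_feat)))
        sub = [tuple(s[f] for f in feats) for s in samples]
        labels = kmeans(sub, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [[count / n_runs for count in row] for row in together]


def consensus_classes(cons, threshold=0.5):
    """Simplified consensus classification: connected components of the
    graph linking samples whose consensus value exceeds the threshold."""
    n = len(cons)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and cons[u][v] > threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


def make_demo_data(seed=1):
    """Toy data: 3 samples centred at 0 and 3 centred at 10, 10 features each."""
    rng = random.Random(seed)
    group_a = [tuple(rng.gauss(0.0, 1.0) for _ in range(10)) for _ in range(3)]
    group_b = [tuple(rng.gauss(10.0, 1.0) for _ in range(10)) for _ in range(3)]
    return group_a + group_b


if __name__ == "__main__":
    cons = consensus_matrix(make_demo_data(), k=2)
    print(consensus_classes(cons))
```

A consensus value near 1 means two samples are grouped together in almost every run, and a value near 0 means they almost never are. Intermediate values indicate unstable assignments, which is the kind of information a consensus partitioning framework can use to judge how reliable a given number of subgroups is.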