Statistics Colloquium
November 8 (Fri) 11 a.m.
Alavi Commons Room, 6625 Everett Tower

A Novel Ensemble Technique using Rank Aggregation for Mining Complex High Dimensional Data

Susmita Datta
President of Caucus for Women in Statistics, 2013
Professor, Graduate Director and Distinguished University Scholar
Department of Bioinformatics and Biostatistics
University of Louisville

Data mining techniques such as clustering and classification are essential for analyzing high-dimensional, high-throughput biomedical data such as microarray mRNA/miRNA and mass spectrometry data. However, because many clustering and classification algorithms are available, different algorithms applied to the same data often give different results. This inconsistency poses a problem for researchers. To address it, we have developed an ensemble method that unifies the results of such data mining algorithms. The ensemble is formed by rank aggregation, carried out as a stochastic optimization using a Monte Carlo cross-entropy algorithm. We show that this ensemble technique can be used with any clustering or classification algorithm. We briefly describe the open-source, R-based software and analyze some real-life microarray data to demonstrate our method.

All statistics students are expected to attend.



Department of Statistics
3304 Everett Tower
Western Michigan University
Kalamazoo MI 49008-5152 USA
(269) 387-1420 | (269) 387-1419 Fax