Robust and sparse clustering for high-dimensional data

By Sarka Brodinova, Peter Filzmoser, Thomas Ortner, Maia Zaharieva, and Christian Breiteneder

Abstract

We introduce a robust and sparse clustering procedure for high-dimensional data. Robustness is achieved by incorporating a weighting function into the k-means procedure, which automatically assigns a weight to each observation. Sparsity is achieved by a lasso-type penalty on the weighted between-cluster sum of squares. We additionally propose a framework for determining both the optimal number of clusters and the optimal number of variables that contribute to cluster separation.
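The abstract does not spell out the weighting function or the exact objective, so the following is only a minimal sketch of how the two ingredients could fit together, not the authors' implementation. It assumes that observation weights are already given (for example, small weights for suspected outliers) and that variable weights are updated as in standard sparse k-means, by soft-thresholding the per-variable between-cluster sum of squares under an L1 bound. The function names `weighted_bcss` and `sparse_variable_weights` and the parameter `l1_bound` are hypothetical.

```python
import numpy as np


def weighted_bcss(X, labels, obs_weights):
    """Per-variable between-cluster sum of squares with observation weights.

    Assumption: computed as weighted total SS minus weighted within-cluster SS,
    and every cluster has positive total observation weight.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    v = np.asarray(obs_weights, dtype=float)
    overall_mean = np.average(X, axis=0, weights=v)
    total = (v[:, None] * (X - overall_mean) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for k in np.unique(labels):
        mask = labels == k
        center = np.average(X[mask], axis=0, weights=v[mask])
        within += (v[mask, None] * (X[mask] - center) ** 2).sum(axis=0)
    return total - within


def soft_threshold(a, delta):
    """Shrink each entry of a toward zero by delta."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def sparse_variable_weights(bcss, l1_bound, tol=1e-8):
    """Variable weights with unit L2 norm and L1 norm at most l1_bound.

    Assumption: lasso-type update in the style of sparse k-means; the
    threshold is found by bisection. Requires at least one positive entry
    in bcss.
    """
    if l1_bound < 1.0:
        raise ValueError("l1_bound must be at least 1")
    a = np.maximum(np.asarray(bcss, dtype=float), 0.0)
    w = a / np.linalg.norm(a)
    if w.sum() <= l1_bound:
        return w  # constraint already satisfied, no thresholding needed
    lo, hi = 0.0, a.max()
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        w = soft_threshold(a, mid)
        w = w / np.linalg.norm(w)
        if w.sum() > l1_bound:
            lo = mid  # still too dense, shrink more
        else:
            hi = mid  # feasible, try a smaller threshold
    w = soft_threshold(a, hi)
    return w / np.linalg.norm(w)


# Usage on synthetic data: 2 informative variables, 8 noise variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
X[:20, :2] += 5.0
labels = np.repeat([0, 1], 20)
obs_weights = np.ones(40)  # placeholder: a robust method would downweight outliers
var_weights = sparse_variable_weights(weighted_bcss(X, labels, obs_weights), l1_bound=1.2)
print(np.round(var_weights, 3))
```

In a full procedure these two steps would alternate with cluster assignment and with re-estimation of the observation weights; a smaller `l1_bound` drives more variable weights to exactly zero.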

Reference

S. Brodinova, P. Filzmoser, T. Ortner, M. Zaharieva, C. Breiteneder: "Robust and sparse clustering for high-dimensional data"; Talk: Conference of the CLAssification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS), Milan, Italy; 09-13-2017 - 09-15-2017; in: "CLADAG 2017 Book of Short Papers", (2017), ISBN: 978-88-99459-71-0.

BibTeX

@inproceedings{TUW-261933,
  author = {Brodinova, Sarka and Filzmoser, Peter and Ortner, Thomas and Zaharieva, Maia and Breiteneder, Christian},
  title = {Robust and sparse clustering for high-dimensional data},
  booktitle = {CLADAG 2017 Book of Short Papers},
  year = {2017},
  isbn = {978-88-99459-71-0},
  keywords = {k-means clustering, outlier detection, high-dimensional data, variable selection, parameter selection},
  note = {talk: Conference of the CLAssification and Data Analysis Group (CLADAG) of the Italian Statistical Society (SIS), Milan, Italy; 2017-09-13 -- 2017-09-15}
}