Uniform Cross-entropy Clustering

Maciej Brzeski, Przemysław Spurek

Robust mixture-model approaches that use non-normal distributions have recently been extended to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method that treats groups separately and estimates the parameters of each cluster individually. The result is an effective clustering algorithm for non-normal data.
Keywords: Clustering, Cross-entropy, Uniform distribution
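
As a rough illustration of how a uniform density could plug into a CEC-style cost, the sketch below assumes that each cluster is modelled by a uniform density on its axis-aligned bounding box, so that the cross-entropy of a cluster reduces to the logarithm of the box volume and the total cost is the sum over clusters of p_i (-ln p_i + ln vol_i), where p_i is the fraction of points in cluster i. The function name, the choice of axis-aligned boxes, and the handling of degenerate boxes are assumptions made for this example and are not taken from the article.

```python
import math


def uniform_cec_cost(points, labels):
    """CEC-style cost with a uniform density on each cluster's bounding box.

    Hypothetical helper: the box model and the degenerate-case rule
    below are assumptions of this sketch.
    """
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    cost = 0.0
    for members in clusters.values():
        p = len(members) / n  # cluster weight p_i
        # Tight bounding box: side length along each coordinate.
        sides = [max(coord) - min(coord) for coord in zip(*members)]
        volume = math.prod(sides)
        if volume <= 0.0:
            # Assumption: reject zero-volume boxes instead of letting
            # ln(0) drive the cost to minus infinity.
            return math.inf
        # Cross-entropy w.r.t. a uniform density on the box is ln(volume).
        cost += p * (-math.log(p) + math.log(volume))
    return cost
```

Under these assumptions, a partition that separates two distant, compact groups gets a lower cost than one that puts all points in a single large box, which is the behaviour a clustering criterion of this kind is expected to show.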
