Uniform Cross-entropy Clustering

Maciej Brzeski,

Przemysław Spurek

Robust mixture-model approaches, which use non-normal distributions, have recently been extended to accommodate data with fixed bounds. In this article we propose a new method based on uniform distributions and Cross-Entropy Clustering (CEC). We combine a simple density model with a clustering method that allows groups to be treated separately and parameters to be estimated in each cluster individually. Consequently, we introduce an effective clustering algorithm that deals with non-normal data.
Keywords: Clustering, Cross-entropy, Uniform distribution
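
To make the combination of a uniform density model with the CEC criterion concrete, the following is a minimal sketch rather than the method of this article. It assumes that each cluster is modelled by a uniform density on its axis-aligned bounding box, for which the cross-entropy of the cluster reduces to the logarithm of the box volume, and it uses the general CEC cost, the sum over clusters of p_i * (-ln p_i + H(X_i || f_i)), where p_i is the fraction of points in cluster i. The function names `uniform_box_cost` and `cec_cost` and the floor `eps` are hypothetical and introduced only for illustration.

```python
import numpy as np


def uniform_box_cost(points, n_total, eps=1e-12):
    """Cost contribution of one cluster under a uniform density on its bounding box.

    Assumption: the density is uniform on the axis-aligned bounding box of
    `points`, so the cross-entropy equals the log-volume of that box.
    `eps` is a hypothetical floor that avoids log(0) for degenerate boxes.
    """
    points = np.asarray(points, dtype=float)
    p = len(points) / n_total                      # cluster weight p_i
    sides = points.max(axis=0) - points.min(axis=0)
    log_volume = np.sum(np.log(np.maximum(sides, eps)))
    return p * (-np.log(p) + log_volume)


def cec_cost(clusters):
    """Total cost of a partition given as a list of point arrays."""
    n_total = sum(len(c) for c in clusters)
    return sum(uniform_box_cost(c, n_total) for c in clusters)


# Two well-separated unit squares sampled on a regular grid.
grid = np.linspace(0.0, 1.0, 5)
square_a = np.array([(x, y) for x in grid for y in grid])
square_b = square_a + 10.0

cost_two_clusters = cec_cost([square_a, square_b])
cost_one_cluster = cec_cost([np.vstack([square_a, square_b])])

# Treating the two groups separately gives the lower cost.
print(cost_two_clusters < cost_one_cluster)
```

In this toy example each group is fitted with its own box, so splitting the data into two clusters yields a lower cost than covering both squares with a single large box. This illustrates why estimating parameters in each cluster individually suits bounded, non-normal data.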

The journal is published continuously online.
The primary form of the journal is the electronic version.

The paper version of the journal is available at www.wuj.pl