Fast Optimization of Multithreshold Entropy Linear Classifier

Rafał Józefowicz,

Wojciech Marian Czarnecki

Abstract

Multithreshold Entropy Linear Classifier (MELC) is a density-based model which searches for a linear projection maximizing the Cauchy-Schwarz Divergence of the dataset's kernel density estimation. Despite its good empirical results, one of its drawbacks is optimization speed. In this paper we analyze how it can be sped up by solving an approximate problem. We examine two methods, both similar to the approximate solutions used for Kernel Density Estimation querying, and provide adaptive schemes for selecting the crucial parameters based on a user-specified acceptable error. Furthermore, we show how one can exploit the well-known conjugate gradient and L-BFGS optimizers even though the original optimization problem should be solved on the sphere. All of the above methods and modifications are tested on 10 real-life datasets from the UCI repository to confirm their practical usability.
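As a rough illustration of the last point, the sketch below (not the authors' implementation; the toy data, bandwidth, and closed-form objective are illustrative assumptions) normalizes the projection vector inside the objective, which makes the objective constant along rays from the origin, so an off-the-shelf unconstrained L-BFGS routine from SciPy effectively optimizes over the unit sphere. The objective is the negated Cauchy-Schwarz divergence between the 1-D Gaussian KDEs of the two projected classes.

```python
import numpy as np
from scipy.optimize import minimize

def _overlap(a, b, s2):
    """Mean pairwise Gaussian-kernel overlap of 1-D samples a and b.

    For Gaussian kernels, the integral of K_sigma(t - a_i) * K_sigma(t - b_j)
    over t is itself a Gaussian in (a_i - b_j); the shared normalization
    constants cancel in the Cauchy-Schwarz ratio, so they are dropped here.
    """
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (4.0 * s2)).mean()

def neg_cs_divergence(w, X_pos, X_neg, sigma=0.5):
    """Negated Cauchy-Schwarz divergence of the two classes projected on w.

    Normalizing w inside the objective makes it scale-invariant, so an
    unconstrained optimizer effectively searches the unit sphere.
    """
    w = w / np.linalg.norm(w)
    a, b = X_pos @ w, X_neg @ w
    s2 = sigma**2
    pq, pp, qq = _overlap(a, b, s2), _overlap(a, a, s2), _overlap(b, b, s2)
    # D_CS = -log(pq^2 / (pp * qq)); return its negation for minimization
    return 2.0 * np.log(pq) - np.log(pp) - np.log(qq)

# Toy usage: two Gaussian blobs separated along the first coordinate axis.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[+2.0, 0.0], scale=0.5, size=(40, 2))
X_neg = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(40, 2))
res = minimize(neg_cs_divergence, x0=np.ones(2), args=(X_pos, X_neg),
               method='L-BFGS-B')
w_opt = res.x / np.linalg.norm(res.x)
```

The paper's approximation schemes would replace the exact pairwise `_overlap` sums, whose cost is quadratic in the sample size, with faster approximate KDE evaluations.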

Keywords: multithreshold classifier, entropy, approximation, optimization
References

Anthony M., Partitioning points by parallel planes, Discrete Mathematics 282 (1), 2004, pp. 17–21.

Blake C., Merz Ch.J., UCI repository of machine learning databases, 1998.

Byrd R.H., Lu P., Nocedal J., Zhu C., A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing 16 (5), 1995, pp. 1190–1208.

Chang C.C., Lin C.J., LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST) 2 (3), 2011, pp. 27:1–27:27.

Czarnecki W.M., Tabor J., Multithreshold entropy linear classifier, arXiv preprint arXiv:1408.1054, 2014.

Elgammal A., Duraiswami R., Davis L.S., Efficient kernel density estimation using the fast Gauss transform with applications to color modeling and tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (11), 2003, pp. 1499–1504.

Ho T.K., Kleinberg E.M., Building projectable classifiers of arbitrary complexity, Proceedings of the 13th International Conference on Pattern Recognition, vol. 2, 1996, pp. 880–885.

Jones E., Oliphant T., Peterson P., SciPy: open source scientific tools for Python, http://www.scipy.org/, 2001.

Principe J.C., Information theoretic learning: Rényi's entropy and kernel perspectives, Springer Science & Business Media, New York, USA, 2010.

Qi C., Gallivan K.A., Absil P.A., Riemannian BFGS algorithm with applications, Recent Advances in Optimization and its Applications in Engineering, Springer, 2010, pp. 183–192.

Silverman B.W., Density estimation for statistics and data analysis, Monographs on Statistics and Applied Probability 26, CRC Press, 1986.

Silverman B.W., Algorithm AS 176: kernel density estimation using the fast Fourier transform, Applied Statistics, 1982, pp. 93–99.

Vapnik V., The nature of statistical learning theory, Springer, New York, USA, 2000.

Yang C., Duraiswami R., Gumerov N.A., Davis L., Improved fast Gauss transform and efficient kernel density estimation, Proceedings of the Ninth IEEE International Conference on Computer Vision, 2003, pp. 664–671.

The journal is published continuously online.
The electronic version is the primary form of the journal.

The paper version of the journal is available at www.wuj.pl