Misclassification-Driven Sample Relabeling for Supervised Kernel Principal Component Analysis

Maciej Adamiak, Krzysztof Ślot

Abstract

Supervised kernel Principal Component Analysis (S-kPCA) is a method for producing discriminative feature spaces that provide nonlinear decision regions, well suited for handling real-world problems. This paper proposes a modification of the original S-kPCA concept, aimed at improving class separation in the resulting feature spaces. This is accomplished by identifying outliers (understood here as misclassified samples) and by an appropriate reformulation of the original S-kPCA problem. The proposed idea is to replace the binary class labels used in the original method with real-valued ones, derived using a sample-relabeling scheme designed to prevent potential classification problems. The postulated concept has been tested on three standard pattern-recognition datasets. Classification performance in feature spaces derived using the introduced methodology improves by 4–16% with respect to the original S-kPCA method, depending on the dataset.
Keywords: pattern recognition, feature extraction, kernel methods, supervised kernel PCA.
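To make the described pipeline concrete, below is a minimal sketch of the two steps the abstract outlines: relabeling misclassified samples with real-valued targets, then solving a supervised kernel PCA problem with those labels. It assumes binary labels in {0, 1}, an RBF data kernel, a linear label kernel, and a k-NN classifier as the outlier detector; the function names (soft_labels, s_kpca) and the specific relabeling rule are illustrative assumptions, not the authors' exact method.

import numpy as np
from scipy.linalg import eigh
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.neighbors import KNeighborsClassifier


def soft_labels(X, y, k=5):
    """Relabel misclassified samples (outliers) with real-valued labels.

    Hypothetical rule: keep the binary label {0, 1} for correctly
    classified samples and substitute a k-NN posterior estimate for the
    misclassified ones, softening their influence on the embedding.
    """
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    posterior = knn.predict_proba(X)[:, 1]     # P(class 1 | x)
    outliers = knn.predict(X) != y             # misclassified samples
    relabeled = y.astype(float)
    relabeled[outliers] = posterior[outliers]  # real-valued relabels
    return relabeled


def s_kpca(X, labels, n_components=2, gamma=1.0):
    """Supervised kernel PCA in the spirit of Barshan et al. (2011):
    maximize tr(B^T K H L H K B) subject to B^T K B = I, where K is the
    data kernel, L the label kernel, and H the centering matrix. The
    real-valued relabeled targets enter only through L."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma=gamma)             # data kernel
    L = np.outer(labels, labels)               # linear label kernel
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    A = K @ H @ L @ H @ K
    B = K + 1e-8 * np.eye(n)                   # ridge for numerical stability
    eigvals, eigvecs = eigh(A, B)              # generalized eigenproblem
    top = np.argsort(eigvals)[::-1][:n_components]
    return K @ eigvecs[:, top]                 # embedded training data

For a binary problem, s_kpca(X, soft_labels(X, y)) yields the discriminative embedding; passing the raw binary y instead recovers the baseline S-kPCA behavior that the relabeling scheme is meant to improve upon.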
