Markov State Space Aggregation via the Information Bottleneck Method
Abstrakt
Consider the problem of approximating a Markov chain by another Markov chain with a smaller state space that is obtained by partitioning the original state space. An information-theoretic cost function is proposed that is based on the relative entropy rate between the original Markov chain and a Markov chain defined by the partition. The state space aggregation problem can be sub-optimally solved by using the information bottleneck method.
PEŁNY TEKST:
PDFReferences
Cover T.M., Thomas J.A., Elements of Information Theory, Wiley Interscience, Hoboken, NJ, 2nd edition, 2006.
Deng K., Mehta P.G., Meyn S.P., Optimal Kullback-Leibler aggregation via spectral theory of Markov chains, IEEE Trans. Autom. Control 56 (12), Dec. 2011, pp. 2793–2808.
Dhillon I., Mallela S., Modha D., Information-theoretic co-clusterings, Proc. ACM Int. Conf. on Knowledge Discovery and Data Mining (SIGKDD), Washington, D.C., Aug. 2003, pp. 89–98.
Friedman A., Goldberger J., Information theoretic pairwise clustering, E. Hancock and M. Pelillo (Eds.), Proc. Similarity-Based Pattern Recognition, Springer Berlin LNCS 7953, 2013, pp. 106–119.
Geiger B.C., Petrov T., Kubin G., Koeppl H., Optimal Kullback-Leibler aggregation via information bottleneck, Apr. 2013, Accepted for publication in IEEE Trans. Autom. Control; preprint available: arXiv:1304.6603 [cs.SY].
Geiger B.C., Temmel C., Lumpings of Markov chains, entropy rate preservation, and higher-order lumpability, Dec. 2012. Accepted for publication in J. Appl. Prob., preprint available: arXiv:1212.4375 [cs.IT].
Goldberger J., Erez K., Abeles M., A Markov clustering method for analyzing movement trajectories, Proc. IEEE Int. Workshop on Machine Learning for Signal Processing (MLSP), Thessaloniki, Aug. 2007, pp. 211–216.
Gray R.M., Entropy and Information Theory, Springer, New York, NY, 1990.
Gurvits L., Ledoux J., Markov property for a function of a Markov chain: A linear algebra approach, Linear Algebra Appl. 404, 2005, pp. 85–117.
Hayes B., First links in the Markov chain, American Scientist 101, 2013,
Katsoulakis M.A., Trashorras J., Information loss in coarse-graining of stochastic particle dynamics, J. Stat. Phys., 122 (1), 2006, pp. 115–135.
Kemeny J.G., Snell J.L., Finite Markov Chains, Springer, 2nd edition, 1976.
Manning C.D., Sch¨utze H., Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, 2nd edition, 2000.
Petrov T., Formal reductions of stochastic rule-based models of biochemical systems, PhD thesis, ETH Z¨urich, 2013.
Rached Z., Alajaji F., Campbell L.L., The Kullback-Leibler divergence rate between
Markov sources, IEEE Trans. Inf. Theory 50 (5), May 2004, pp. 917–921.
Raj A., Wiggins C.H., An information-theoretic derivation of min-cut-based clustering, IEEE Trans. Pattern Anal. Mach. Intell. 32 (6), June 2010, pp. 988–995.
Shannon C.E., A mathematical theory of communication, Bell Systems Technical Journal 27, Oct. 1948, pp. 379–423, 623–656.
Slonim N., Tishby N., Agglomerative information bottleneck, Advances in Neural Information Processing Systems (NIPS), Denver, CO, Nov. 1999, pp. 617–623.
Tishby N., Pereira F.C., Bialek W., The information bottleneck method, Proc. Allerton Conf. on Communication, Control, and Computing, Monticello, IL, Sept. 1999, pp. 368–377.
Tishby N., Slonim N., Data clustering by Markovian relaxation and the information bottleneck method, Advances in Neural Information Processing Systems (NIPS), Denver, CO, Nov. 2000.
Tzortzis I., Charalambous C.D., Charalambous T., Hadjicostis C.N., Johansson M., Approximation of Markov processes by lower dimensional processes via total variation metrics, Oct. 2014, arXiv:1410.3976 [math.OC].
Vedaldi A., Fulkerson B. VLFeat: An open and portable library of computer vision algorithms, 2008, http://www.vlfeat.org/.
Vidyasagar M., Reduced-order modeling of Markov and hidden Markov processes via aggregation, Proc. IEEE Conf. on Decision and Control (CDC), Atlanta, GA, Dec. 2010, pp. 1810–1815.
White L.B., Mahony R., Brushe G.D., Lumpable hidden Markov models–model reduction and reduced complexity filtering, IEEE Trans. Autom. Control 45 (12), Dec. 2000, pp. 2297–2306.