Information Theory and Statistics: an overview

Daniel Commenges
DOI: https://doi.org/10.48550/arXiv.1511.00860
2015-11-03
Abstract: We give an overview of the role of information theory in statistics, and particularly in biostatistics. We recall the basic quantities in information theory: entropy, cross-entropy, conditional entropy, mutual information and Kullback-Leibler risk. Then we examine the role of information theory in estimation theory, where the log-likelihood can be identified as an estimator of a cross-entropy. Then the basic quantities are extended to estimators, leading to criteria for estimator selection, such as the Akaike criterion and its extensions. Finally, we investigate the use of these concepts in Bayesian theory: the cross-entropy of the predictive distribution can be used for model selection, and a cross-validation estimator of this cross-entropy is found to be equivalent to the pseudo-Bayes factor.
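As a rough illustration of the abstract's statement that the log-likelihood estimates a cross-entropy, the following minimal sketch works on a small discrete example. The function names, the distributions `p` and `q`, the sample size, and the use of natural logarithms are all assumptions made for illustration; none of them come from the paper.

```python
import math
import random


def entropy(p):
    """Entropy of a discrete distribution p (natural log)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """Cross-entropy of q relative to the true distribution p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence, written as cross-entropy minus entropy."""
    return cross_entropy(p, q) - entropy(p)


def mean_neg_log_likelihood(sample, q):
    """Minus the log-likelihood of the sample under q, divided by the sample size."""
    return -sum(math.log(q[x]) for x in sample) / len(sample)


# Hypothetical true distribution p and candidate model q on three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

# Draw observations from p with a fixed seed so the run is reproducible.
rng = random.Random(0)
sample = rng.choices(range(len(p)), weights=p, k=20000)

# The normalized negative log-likelihood under q approximates the cross-entropy.
estimate = mean_neg_log_likelihood(sample, q)
print("estimated cross-entropy:", estimate)
print("exact cross-entropy:    ", cross_entropy(p, q))
print("KL divergence:          ", kl_divergence(p, q))
```

With a large sample, the normalized negative log-likelihood should be close to the exact cross-entropy. Subtracting the entropy of `p`, which does not depend on the model, gives the Kullback-Leibler divergence.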
Statistics Theory