A Machine Learning Perspective on Predictive Coding with PAQ

Byron Knoll, Nando de Freitas
DOI: https://doi.org/10.48550/arXiv.1108.3298
2011-08-17
Abstract: PAQ8 is an open source lossless data compression algorithm that currently achieves the best compression rates on many benchmarks. This report presents a detailed description of PAQ8 from a statistical machine learning perspective. It shows that it is possible to understand some of the modules of PAQ8 and use this understanding to improve the method. However, intuitive statistical explanations of the behavior of other modules remain elusive. We hope the description in this report will be a starting point for discussions that will increase our understanding, lead to improvements to PAQ8, and facilitate a transfer of knowledge from PAQ8 to other machine learning methods, such as recurrent neural networks and stochastic memoizers. Finally, the report presents a broad range of new applications of PAQ to machine learning tasks including language modeling and adaptive text prediction, adaptive game playing, classification, and compression using features from the field of deep learning.
Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition, Information Retrieval