Abstract: While likelihood is attractive in theory, its estimates by deep generative models (DGMs) are often broken in practice and perform poorly for out-of-distribution (OOD) detection. Various recent works have started to consider alternative scores and achieved better performance. However, such recipes do not come with provable guarantees, nor is it clear that their choices extract sufficient information.
What problem does this paper attempt to address?
The problem that this paper attempts to solve is: **How to effectively detect out-of-distribution (OOD) data in practical applications**. Specifically, the author points out that likelihood estimates from deep generative models (DGMs) often perform poorly in practice. In particular, they may assign a relatively high likelihood to OOD data, which leads to misclassification.
### Main problems:
1. **Likelihood estimation failure**: Although likelihood is attractive in theory, in practice the likelihood estimates of deep generative models (such as VAEs) are often biased and cannot reliably detect OOD data.
2. **Limitations of existing methods**: Although recent methods based on alternative scores perform better, they lack theoretical guarantees, and it is not clear whether they extract enough information for OOD detection.
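For context on item 1, the likelihood estimate that a VAE typically provides is a variational lower bound rather than the exact log-likelihood. As a sketch in standard notation (the symbols $q_\phi$, $p_\theta$, and $p(z)$ denote the usual encoder, decoder, and prior; they are assumptions of this illustration and do not appear in the summary):

$$
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
$$

Thresholding a bound of this kind is the sort of likelihood-based OOD score that item 1 describes as unreliable.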
### Solutions:
To address these problems, the author proposes a new principle, the **Likelihood Path Principle (LPath)**, and applies it to variational autoencoders (VAEs). The LPath principle narrows the search space for useful statistics by restricting it to minimal sufficient statistics, enabling more effective detection of OOD data.
### Specific contributions:
1. **Empirical contribution**: A method for selecting OOD screening statistics is proposed. By exploiting the structure of VAEs, state-of-the-art performance can be achieved even with simple, small-scale VAEs.
2. **Methodological contribution**: The LPath principle is proposed, which extends the classical likelihood principle to neural activation paths, providing a new perspective for representation learning.
3. **Theoretical contribution**: For the first time, the performance of VAEs in unsupervised OOD detection is quantified, and new theoretical tools (such as nearly essential support, essential distance, and the co-Lipschitz property) are introduced, providing rigorous non-asymptotic guarantees for OOD detection.
### Key concepts:
- **Likelihood path**: Refers to all neural activation paths involved in the process from input data to final likelihood estimation.
- **Minimal sufficient statistics**: Includes parameters of conditional likelihoods of encoders and decoders, such as mean and variance.
- **Nearly essential support**: Describes the main part of a distribution's support, ignoring very low-probability events.
- **Essential distance**: Measures the degree of separation between two distributions, considering the differences between high-probability samples.
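The concepts above can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes a Gaussian encoder and decoder, replaces a trained VAE with fixed random linear maps (`W_mu`, `W_logvar`, `W_dec` are hypothetical stand-ins), summarizes the encoder and decoder parameters by three norms, and scores inputs with a Mahalanobis distance fitted on in-distribution statistics. The choice of norms and of the Mahalanobis scorer is an assumption made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 8, 2  # hypothetical data and latent dimensions

# Hypothetical stand-ins for a trained VAE: fixed random linear maps.
W_mu = rng.normal(size=(K, D)) / np.sqrt(D)
W_logvar = rng.normal(size=(K, D)) / np.sqrt(D)
W_dec = rng.normal(size=(D, K)) / np.sqrt(K)


def encode(x):
    """Return the mean and standard deviation of a Gaussian q(z | x)."""
    mu_z = x @ W_mu.T
    # tanh keeps the log-variance bounded in this toy example.
    sigma_z = np.exp(0.5 * np.tanh(x @ W_logvar.T))
    return mu_z, sigma_z


def decode(z):
    """Return the mean of a Gaussian p(x | z)."""
    return z @ W_dec.T


def path_statistics(x):
    """Summarize encoder/decoder parameters as three scalars per sample."""
    mu_z, sigma_z = encode(x)
    mu_x = decode(mu_z)
    return np.stack(
        [
            np.linalg.norm(x - mu_x, axis=1),  # reconstruction error
            np.linalg.norm(mu_z, axis=1),      # latent mean norm
            np.linalg.norm(sigma_z, axis=1),   # latent std norm
        ],
        axis=1,
    )


def fit_scorer(stats_train):
    """Fit a squared-Mahalanobis scorer on in-distribution statistics."""
    mean = stats_train.mean(axis=0)
    cov = np.cov(stats_train, rowvar=False)
    cov += 1e-6 * np.eye(stats_train.shape[1])  # small ridge for stability
    prec = np.linalg.inv(cov)

    def score(stats):
        d = stats - mean
        return np.einsum("ni,ij,nj->n", d, prec, d)

    return score


# Synthetic data: "in-distribution" near the origin, "OOD" shifted away.
x_in = rng.normal(size=(500, D))
x_far = rng.normal(loc=6.0, size=(100, D))

score = fit_scorer(path_statistics(x_in))
s_in = score(path_statistics(x_in))
s_far = score(path_statistics(x_far))
print("median score (in):", np.median(s_in))
print("median score (shifted):", np.median(s_far))
```

A higher score means the sample's statistics lie further from those seen in-distribution. In practice the linear maps would be replaced by a trained VAE's encoder and decoder, and the scorer could be any density or distance estimator fitted on the in-distribution statistics.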
### Summary:
By introducing the LPath principle, this paper provides a theoretically guaranteed and empirically effective OOD detection method, especially suitable for the case of using simple VAEs. This method not only performs well in practice, but also provides a solid theoretical foundation for future research.