Abstract: Tree-based models for probability distributions are usually specified using a predetermined, data-independent collection of candidate recursive partitions of the sample space. To characterize an unknown target density in detail over the entire sample space, candidate partitions must have the capacity to expand deeply into all areas of the sample space with potential non-zero sampling probability. Such an expansive system of partitions often incurs prohibitive computational costs and makes inference prone to overfitting, especially in regions with little probability mass. Existing models typically make a compromise and rely on relatively shallow trees. This hampers one of the most desirable features of trees, their ability to characterize local features, and results in reduced statistical efficiency. Traditional wisdom suggests that this compromise is inevitable to ensure coherent likelihood-based reasoning, as a data-dependent partition system that allows deeper expansion only in regions with more observations would induce double dipping of the data and thus lead to inconsistent inference. We propose a simple strategy to restore coherency while allowing the candidate partitions to be data-dependent, using Cox's partial likelihood. This strategy parametrizes the tree-based sampling model according to the allocation of probability mass based on the observed data, and yet under appropriate specification, the resulting inference remains valid. Our partial likelihood approach is broadly applicable to existing likelihood-based methods and in particular to Bayesian inference on tree-based models. We give examples in density estimation in which the partial likelihood is endowed with existing priors on tree-based models and compare with the standard, full-likelihood approach. The results show substantial gains in estimation accuracy and computational efficiency from using the partial likelihood.

What problem does this paper attempt to address?

The core problem this paper addresses is: **How can tree-based density models use data-dependent partitions to gain statistical and computational efficiency, avoiding both overfitting from overly deep trees and loss of local detail from overly shallow ones, while keeping likelihood-based inference coherent?**

Traditional tree-based models describe probability distributions with a predefined, data-independent collection of candidate recursive partitions. To characterize the unknown target density over the entire sample space, these candidate partitions must be able to expand deeply into every region that may carry non-zero sampling probability. Such an expansive partition system often incurs prohibitive computational cost and is prone to overfitting, especially in regions with little probability mass. Existing models usually compromise by using shallower trees, which weakens a key strength of trees, namely their ability to characterize local features, and so reduces statistical efficiency.

To address this, the authors propose a method based on Cox's partial likelihood that allows the candidate partitions to be data-dependent. The method parametrizes the tree-based sampling model according to how the observed data allocate probability mass, and under appropriate specification the resulting inference remains valid. The partial likelihood approach applies broadly to existing likelihood-based methods, and in particular to Bayesian inference on tree-based models.

### Main Contributions

1. **Restore coherency**: The authors propose a simple strategy, using Cox's partial likelihood, that restores coherent likelihood-based inference while allowing candidate partitions to be data-dependent.
2. **Improve estimation accuracy and computational efficiency**: In density estimation examples, the partial likelihood yields substantial gains in estimation accuracy and computational efficiency over the standard full-likelihood approach.
3. **Applicable to multidimensional cases**: The method applies not only to one-dimensional sample spaces but also to multidimensional sample spaces with multiple axis-aligned partition directions.

### Mathematical Formulas

- Definition of the partial likelihood function, where \(T(x)\) is the data-dependent collection of tree nodes, \(A_l\) and \(A_r\) are the left and right children of node \(A\), \(n(\cdot)\) counts the observations in a set, and \(F(A_l \mid A)\) is the conditional probability of \(A_l\) given \(A\):

\[ L_P(f; x, \Omega) = \prod_{A \in T(x)} m_P(A) \]

where

\[ m_P(A) = F(A_l \mid A)^{n(A_l)} \, F(A_r \mid A)^{n(A_r)} \]

- Decomposition of the full likelihood function:

\[ L(f; x, \Omega) = \prod_{A \in I_K(x)} m_P(A) \cdot \prod_{A \in L_K(x)} L_F(f; x, A) \]

### Conclusion

By replacing the full likelihood with a partial likelihood, the authors allow trees to grow deeper only where the data support it, which mitigates both overfitting in sparse regions and the loss of local detail caused by shallow trees. The reported results show gains in both statistical and computational efficiency. The approach offers a new tool for Bayesian inference and other likelihood-based reasoning on tree-based models.
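
The product form of the partial likelihood can be illustrated with a minimal Python sketch. It makes several assumptions that are not taken from the paper: the sample space is [0, 1), each node is split at its midpoint, and a node is split only while it holds at least `min_count` observations and sits above `max_depth`. The function name, both parameters, and this stopping rule are hypothetical. The sketch only evaluates the log of the product of \(m_P(A)\) terms for a candidate CDF; it does not check the specification conditions under which the paper states that inference remains valid.

```python
import math


def log_partial_likelihood(data, cdf, min_count=2, max_depth=10):
    """Sum over split nodes A of n(A_l) log F(A_l|A) + n(A_r) log F(A_r|A).

    Nodes are dyadic intervals [lo, hi) of [0, 1). A node is split only if
    it holds at least `min_count` points and its depth is below `max_depth`
    (a hypothetical, illustrative stopping rule).
    """

    def recurse(lo, hi, points, depth):
        # Unsplit nodes contribute no factor to the partial likelihood.
        if len(points) < min_count or depth >= max_depth:
            return 0.0
        mid = (lo + hi) / 2.0
        left = [x for x in points if x < mid]
        right = [x for x in points if x >= mid]
        mass = cdf(hi) - cdf(lo)
        if mass <= 0.0:
            # Observations fall in a node with zero mass under the candidate.
            return float("-inf")
        p_left = (cdf(mid) - cdf(lo)) / mass  # F(A_l | A)
        p_right = 1.0 - p_left                # F(A_r | A)
        term = 0.0
        for count, prob in ((len(left), p_left), (len(right), p_right)):
            if count > 0:
                if prob <= 0.0:
                    return float("-inf")
                term += count * math.log(prob)
        return (term
                + recurse(lo, mid, left, depth + 1)
                + recurse(mid, hi, right, depth + 1))

    return recurse(0.0, 1.0, list(data), 0)


if __name__ == "__main__":
    sample = [0.1, 0.2, 0.6, 0.9]
    # Candidate density: uniform on [0, 1), so F(x) = x.
    print(log_partial_likelihood(sample, lambda x: x, max_depth=2))
```

Under the uniform candidate every conditional probability is 1/2, so each split node contributes n(A) log(1/2). Working in logs avoids underflow when many factors are multiplied.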

A partial likelihood approach to tree-based density modeling and its application in Bayesian inference
