Inferential Tools for Assessing Dependence Across Response Categories in Multinomial Models with Discrete Random Effects

Chiara Masci, Francesca Ieva, Anna Maria Paganoni
DOI: https://doi.org/10.1007/s00357-024-09466-2
IF: 1.333
2024-03-05
Journal of Classification
Abstract: We propose a discrete random effects multinomial regression model to deal with estimation and inference issues in the case of categorical and hierarchical data. Random effects are assumed to follow a discrete distribution with an a priori unknown number of support points. For a K-categories response, the modelling identifies a latent structure at the highest level of grouping, where groups are clustered into subpopulations. This model does not assume the independence across random effects relative to different response categories, and this provides an improvement from the multinomial semi-parametric multilevel model previously proposed in the literature. Since the category-specific random effects arise from the same subjects, the independence assumption is seldom verified in real data. To evaluate the improvements provided by the proposed model, we reproduce simulation and case studies of the literature, highlighting the strength of the method in properly modelling the real data structure and the advantages that taking into account the data dependence structure offers.
mathematics, interdisciplinary applications; psychology, mathematical
What problem does this paper attempt to address?
### Problems the paper attempts to solve

The paper addresses estimation and inference in multinomial models for categorical, hierarchical data. Specifically, it proposes a multinomial regression model with discrete random effects that models the dependence among the random effects of different response categories. Existing methods usually assume that the random effects of different response categories are independent, but this assumption often does not hold in real data. The proposed model drops the independence assumption and can therefore capture the actual structure of the data more accurately.

### Main features of the model

1. **Discrete random effects**: The model assumes that the random effects follow a discrete distribution with an a priori unknown number of support points. This differs from the traditional Gaussian assumption and makes it possible to identify a latent structure among the highest-level units.
2. **Joint multinomial semiparametric mixed-effects model (JMSPME)**: By jointly estimating the category-specific random effects, the model accounts for the dependence across response categories and avoids the estimation bias that arises when this dependence is ignored.
3. **Expectation-maximization (EM) algorithm**: The paper uses the EM algorithm for parameter estimation and provides an inference framework that includes the calculation of standard errors and the assessment of coefficient significance.
4. **Simulation studies and case analyses**: The authors compare the new model with existing methods through simulation studies and real case studies. The results show that JMSPME yields more accurate and less variable parameter estimates and identifies subgroups of the highest-level units more consistently.
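
To make the features above concrete, here is a minimal Python sketch of a multinomial logit model with joint discrete random intercepts. It is not the authors' implementation: the parameter values, the single covariate, the function names and the choice of category 0 as the reference are all assumptions made for illustration. Each mass point is a whole vector of category-specific intercepts, so the intercepts of different categories move together rather than independently. The last function computes E-step-style posterior membership weights with all parameters held at their true values, whereas the paper also estimates the fixed effects, the mass points and the number of support points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (all values invented): K = 3 response categories with
# category 0 as the reference, M = 2 subpopulations, one covariate.
n_groups, n_per_group, n_categories = 30, 200, 3
beta = np.array([[0.5], [-0.8]])        # fixed slopes, shape (K-1, p)
mass_points = np.array([[-1.0, 1.0],    # each row: joint vector of K-1 random
                        [1.5, -0.5]])   # intercepts shared by one subpopulation
weights = np.array([0.6, 0.4])          # probability of each mass point


def category_probs(x, intercepts, beta):
    """Multinomial-logit probabilities; the reference category has eta = 0."""
    eta = x @ beta.T + intercepts                    # shape (n, K-1)
    eta = np.column_stack([np.zeros(len(x)), eta])   # shape (n, K)
    eta -= eta.max(axis=1, keepdims=True)            # numerical stability
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)


def simulate(rng):
    """Draw a subpopulation label per group, then responses within each group."""
    labels = rng.choice(len(weights), size=n_groups, p=weights)
    data = []
    for g in range(n_groups):
        x = rng.normal(size=(n_per_group, 1))
        probs = category_probs(x, mass_points[labels[g]], beta)
        y = np.array([rng.choice(n_categories, p=p) for p in probs])
        data.append((x, y))
    return labels, data


def posterior_membership(data, mass_points, weights, beta):
    """P(group i belongs to mass point m | its data), parameters held fixed."""
    log_post = np.zeros((len(data), len(weights)))
    for i, (x, y) in enumerate(data):
        for m, point in enumerate(mass_points):
            probs = category_probs(x, point, beta)
            log_lik = np.log(probs[np.arange(len(y)), y]).sum()
            log_post[i, m] = np.log(weights[m]) + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)  # avoid underflow
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


labels, data = simulate(rng)
post = posterior_membership(data, mass_points, weights, beta)
recovered = post.argmax(axis=1)
print("share of groups assigned to their true subpopulation:",
      (recovered == labels).mean())
```

Assigning each group to the mass point with the largest posterior weight is what produces the clustering of highest-level units into subpopulations. In a full EM procedure these weights would feed an M-step that updates the slopes, the mass points and their probabilities.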
### Main contributions

- **Improved dependence modeling**: Accounting for the dependence between different response categories improves the accuracy and interpretability of the model.
- **Flexible random-effects distribution**: Using a discrete distribution instead of a Gaussian one makes the model more flexible and better able to adapt to the complex structure of real data.
- **Clear inference framework**: Methods for computing standard errors and assessing coefficient significance support statistical inference with the model.

### Conclusion

The JMSPME model proposed in the paper shows clear advantages for categorical and hierarchical data, especially when the dependence between different response categories needs to be considered. By changing how the random effects are modeled, it captures the actual structure of the data more accurately and improves the reliability and interpretability of the estimates.