Maximal point-polyserial correlation for non-normal random distributions

Alessandro Barbiero
DOI: https://doi.org/10.1111/bmsp.12362
2024-10-22
Abstract: We consider the problem of determining the maximum value of the point-polyserial correlation between a random variable with an assigned continuous distribution and an ordinal random variable with $k$ categories, which are assigned the first $k$ natural values $1, 2, \dots, k$, and arbitrary probabilities $p_i$. For different parametric distributions, we derive a closed-form formula for the maximal point-polyserial correlation as a function of the $p_i$ and of the distribution's parameters; we devise an algorithm for obtaining its maximum value numerically for any given $k$. These maximum values and the features of the corresponding $k$-point discrete random variables are discussed with respect to the underlying continuous distribution. Furthermore, we prove that if we do not assign the values of the ordinal random variable a priori but instead include them in the optimization problem, this latter approach is equivalent to the optimal quantization problem. In some circumstances, it leads to a significant increase in the maximum value of the point-polyserial correlation. An application to real data exemplifies the main findings. A comparison between the discretization leading to the maximum point-polyserial correlation and those obtained from optimal quantization and moment matching is sketched.
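As an illustration of the quantity discussed in the abstract, the following minimal sketch computes the maximal point-polyserial correlation for one non-normal case. It is not taken from the paper: the choice of the unit-rate exponential distribution, the function name `max_point_polyserial_exp`, and the helper `tail` are assumptions made for this example. It relies on the standard Fréchet-Hoeffding result that, for fixed margins, the correlation is largest when the ordinal variable is a non-decreasing function of the continuous one, so that category $i$ corresponds to the interval between consecutive quantiles of the continuous distribution.

```python
import math


def max_point_polyserial_exp(p):
    """Maximal correlation between X ~ Exp(1) and Y on {1, ..., k} with P(Y = i) = p[i-1].

    Hypothetical helper for illustration: assumes the maximum is attained under
    the comonotonic coupling, i.e. Y = i when X lies between consecutive quantiles.
    """
    k = len(p)

    # Cumulative probabilities P_0 = 0, ..., P_k = 1.
    cum = [0.0]
    for pi in p:
        cum.append(cum[-1] + pi)
    cum[-1] = 1.0  # guard against floating-point drift in the last sum

    # Quantile cut points of Exp(1): q = -ln(1 - P), with q = +inf at P = 1.
    cuts = [(-math.log(1.0 - P) if P < 1.0 else math.inf) for P in cum]

    def tail(a):
        # Partial expectation E[X; X > a] = (a + 1) * exp(-a) for Exp(1).
        return 0.0 if math.isinf(a) else (a + 1.0) * math.exp(-a)

    # E[XY] = sum_i i * E[X; q_{i-1} < X <= q_i].
    exy = sum((i + 1) * (tail(cuts[i]) - tail(cuts[i + 1])) for i in range(k))

    # Moments of Y on the scores 1, ..., k.
    ey = sum((i + 1) * pi for i, pi in enumerate(p))
    var_y = sum((i + 1) ** 2 * pi for i, pi in enumerate(p)) - ey ** 2

    # Exp(1) has mean 1 and variance 1.
    return (exy - ey) / math.sqrt(var_y)


if __name__ == "__main__":
    print(max_point_polyserial_exp([0.5, 0.5]))
```

For two equiprobable categories the value works out analytically to $\ln 2 \approx 0.693$. Maximizing the returned value over the probability vector, for example by a grid or a generic numerical optimizer, would give a rough analogue of the numerical search the abstract mentions, although the paper's own algorithm may differ.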