Decomposition of variance in terms of conditional means

Alessandro Figa' Talamanca, Angelo Guerriero, Alberto Leone, Gian Piero Mignoli, Enrico Rogora
DOI: https://doi.org/10.48550/arXiv.0710.0849
2007-10-04
Abstract: We test, on two different sets of data, an apparently new approach to the analysis of the variance of a numerical variable which depends on qualitative characters. We suggest that this approach be used to complement other existing techniques to study the interdependence of the variables involved. According to our method the variance is expressed as a sum of orthogonal components, obtained as differences of conditional means with respect to the qualitative characters. The resulting expression for the variance depends on the order in which the characters are considered. We suggest an algorithm which leads to an ordering which is deemed natural. The first set of data concerns the scores achieved by a population of students on an entrance examination based on a multiple-choice test with 30 questions. In this case the qualitative characters are dyadic and correspond to a correct or incorrect answer to each question. The second set of data concerns the delay in obtaining the degree for a population of graduates of Italian universities. The variance in this case is analyzed with respect to a set of seven specific qualitative characters of the population studied (gender, previous education, working condition, parents' educational level, field of study, etc.).
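The decomposition described in the abstract can be illustrated with a small sketch. It rests on one reading of the abstract, not on the authors' exact construction: for an ordered list of characters, take the mean of the variable conditional on the first j characters, and let the j-th component be the mean squared difference between successive conditional means. These differences are mutually orthogonal, so the components plus a residual add up to the population variance. The function names `conditional_mean` and `decompose_variance` and the toy data are hypothetical, and the ordering algorithm mentioned in the abstract is not reproduced here.

```python
from collections import defaultdict


def conditional_mean(y, keys):
    """For each observation, return the mean of y over all observations with the same key."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, key in zip(y, keys):
        sums[key] += value
        counts[key] += 1
    return [sums[key] / counts[key] for key in keys]


def decompose_variance(y, characters):
    """Split the population variance of y into one component per character plus a residual.

    characters is a list of equal-length label sequences, taken in the given order.
    Component j is the mean squared difference between the mean of y conditional
    on the first j characters and the mean conditional on the first j - 1.
    """
    n = len(y)
    previous = [sum(y) / n] * n  # conditioning on nothing gives the grand mean
    components = []
    for j in range(1, len(characters) + 1):
        keys = list(zip(*characters[:j]))  # joint labels of the first j characters
        current = conditional_mean(y, keys)
        components.append(sum((c - p) ** 2 for c, p in zip(current, previous)) / n)
        previous = current
    # What remains after conditioning on every character
    residual = sum((v - p) ** 2 for v, p in zip(y, previous)) / n
    return components, residual


# Toy data (hypothetical): four observations and two dyadic characters
y = [1.0, 2.0, 3.0, 6.0]
a = [0, 0, 1, 1]
b = [0, 1, 0, 1]

grand_mean = sum(y) / len(y)
total_variance = sum((v - grand_mean) ** 2 for v in y) / len(y)

for label, chars in (("a then b", [a, b]), ("b then a", [b, a])):
    components, residual = decompose_variance(y, chars)
    print(label, components, residual, total_variance)
```

On this toy data the order (a, b) gives components 2.25 and 1.25, while the order (b, a) gives 1.0 and 2.5. Both sum to the total variance 3.5 with zero residual, which shows how the share attributed to each character depends on the order in which the characters are considered.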
Applications