Cohort sampling schemes for the Mantel-Haenszel estimator: Extensions to multilevel covariates, stratified models, and robust variance estimators

Larry Goldstein, Bryan Langholz
DOI: https://doi.org/10.48550/arXiv.math/0609333
2006-09-13
Abstract: In many epidemiological contexts, disease occurrences and their rates are naturally modelled by counting processes and their intensities, allowing an analysis based on martingale methods. These methods lend themselves to extensions of nested case-control sampling designs where general methods of control selection can be easily incorporated. This same methodology allows for extensions of the Mantel-Haenszel estimator in two main directions. First, a variety of new sampling designs can be incorporated which can yield substantial efficiency gains over simple random sampling. Second, the extension allows for the treatment of multiple level time dependent exposures.
Statistics Theory
What problem does this paper attempt to address?
The paper aims to broaden the scope and improve the efficiency of the Mantel-Haenszel estimator, particularly in nested case-control studies. Specifically, the authors pursue four goals:

1. **Introduce new sampling designs**: Incorporate a variety of sampling schemes (such as matching and reverse matching) to improve the efficiency of the Mantel-Haenszel estimator. In some cases these designs can substantially outperform simple random sampling.
2. **Handle multilevel covariates**: Extend the Mantel-Haenszel estimator from binary covariates to multilevel, time-dependent covariates, so that it can handle more complex data structures.
3. **Robust variance estimation**: Provide robust variance estimators so that inference remains reliable under different sampling designs.
4. **Asymptotic property analysis**: Prove the consistency and asymptotic normality of the proposed estimator under general conditions, and compare its efficiency with that of the maximum partial likelihood estimator (MPLE).

### Specific problem description

- **Limitations of the traditional Mantel-Haenszel estimator**: The traditional Mantel-Haenszel estimator is mainly used with binary covariates and performs well on full-cohort data. In practice, however, collecting full-cohort data may be impractical or too expensive, so effective sampling designs are needed to reduce the amount of data that must be collected.
- **Requirement for new sampling designs**: To reduce the amount of data while maintaining estimation accuracy, researchers need new sampling designs, such as nested case-control sampling, matching, and reverse matching.
- **Handling of multilevel covariates**: Real exposures often have more than two levels and may be time-dependent. The traditional Mantel-Haenszel estimator cannot handle this situation directly, so it needs to be extended.

### Solutions

- **Introduce new sampling designs**: The paper proposes a variety of sampling designs, such as nested case-control sampling, matching, and reverse matching, and shows where each has an advantage.
- **Extend the Mantel-Haenszel estimator**: The estimator is extended from binary covariates to multilevel, time-dependent covariates, enabling it to handle more complex exposures.
- **Provide robust variance estimation**: Appropriate weights and adjustments are introduced so that the variance estimate remains robust under different sampling designs.
- **Theoretical analysis and comparison**: A detailed analysis of the estimator's asymptotic properties establishes its consistency and asymptotic normality under general conditions, and a comparison with the maximum partial likelihood estimator shows its superiority in some cases.

In summary, the paper addresses the challenges of conducting epidemiological research with complex data structures and limited data by introducing new sampling designs and extending the Mantel-Haenszel estimator.
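To make the object being extended concrete, the following is a minimal sketch of the classical Mantel-Haenszel estimator for a binary exposure. It is applied to sampled risk sets of the kind a nested case-control design produces when controls are drawn by simple random sampling. It is not the generalized estimator of the paper, which incorporates design-specific weights and multilevel, time-dependent exposures. The function name, the tuple layout, and the toy data are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical layout for one sampled risk set (one case plus its sampled controls):
#   (case_exposed, n_exposed_controls, n_unexposed_controls)
RiskSet = Tuple[bool, int, int]


def mantel_haenszel_rate_ratio(risk_sets: Iterable[RiskSet]) -> float:
    """Classical Mantel-Haenszel ratio estimate, treating each sampled risk set as a stratum.

    Per stratum the usual 2x2 contributions are a*d/n and b*c/n, where
    a = 1 if the case is exposed, b = 1 if the case is unexposed,
    c = number of exposed controls, d = number of unexposed controls,
    and n is the size of the sampled risk set.
    """
    numerator = 0.0
    denominator = 0.0
    for case_exposed, exposed_controls, unexposed_controls in risk_sets:
        n = 1 + exposed_controls + unexposed_controls
        if case_exposed:
            numerator += unexposed_controls / n    # a*d/n with a = 1
        else:
            denominator += exposed_controls / n    # b*c/n with b = 1
    if denominator == 0.0:
        raise ValueError("no risk set has an unexposed case with an exposed control")
    return numerator / denominator


# Made-up 1:1 sampled risk sets, for illustration only.
toy_sets = [
    (True, 0, 1),   # exposed case, unexposed control
    (True, 1, 0),   # both exposed: contributes nothing
    (False, 1, 0),  # unexposed case, exposed control
    (True, 0, 1),   # exposed case, unexposed control
    (False, 0, 1),  # both unexposed: contributes nothing
]
estimate = mantel_haenszel_rate_ratio(toy_sets)
print(estimate)  # 2.0
```

With one control per case, the estimate reduces to the ratio of the two kinds of discordant pairs (here 2 to 1). Only risk sets where the case and at least one control differ in exposure contribute. That is why the choice of sampling design, which determines how often such informative sets arise, affects efficiency.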