Abstract: Epidemiologic and medical studies often rely on evaluators to obtain measurements of exposures or outcomes for study participants, and valid estimates of associations depend on the quality of the data. Although statistical methods have been proposed to adjust for measurement error, they often rely on unverifiable assumptions and can yield biased estimates if those assumptions are violated. Therefore, methods for detecting potential `outlier' evaluators are needed to improve data quality during the data collection stage. In this paper, we propose a two-stage algorithm to detect `outlier' evaluators whose evaluation results tend to be higher or lower than those of their counterparts. In the first stage, evaluators' effects are estimated by fitting a regression model. In the second stage, hypothesis tests are performed to detect `outlier' evaluators, where we consider both the power of each hypothesis test and the false discovery rate (FDR) among all tests. We conduct an extensive simulation study to evaluate the proposed method, and illustrate it by detecting potential `outlier' audiologists in the data collection stage of the Audiology Assessment Arm of the Conservation of Hearing Study, an epidemiologic study examining risk factors for hearing loss in the Nurses' Health Study II. Our simulation study shows that the method not only detects true `outlier' evaluators but is also less likely to falsely flag true `normal' evaluators as outliers. Our two-stage `outlier' detection algorithm is a flexible approach that can effectively detect `outlier' evaluators, so that data quality can be improved during the data collection stage.
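
The two-stage idea can be illustrated with a minimal sketch. It is not the paper's implementation, and every modelling choice in it is an assumption: the abstract does not specify the regression model, the test statistic, or the FDR procedure. Here stage 1 uses an evaluator-indicator regression with no covariates (equivalent to evaluator means), measures each effect against the median of the evaluator means, and uses a pooled standard error that ignores the variability of that median. Stage 2 uses a normal-approximation two-sided test with the Benjamini-Hochberg step-up rule as the FDR control. The power consideration mentioned in the abstract is not modelled. All function names are hypothetical.

```python
import math
import statistics


def evaluator_effects(data):
    """Stage 1 (simplified): return {evaluator: (effect, standard_error)}.

    Assumption: effect = evaluator mean minus the median of all evaluator
    means; the standard error uses the pooled within-evaluator variance.
    """
    means = {name: statistics.fmean(vals) for name, vals in data.items()}
    reference = statistics.median(means.values())
    n_total = sum(len(vals) for vals in data.values())
    # Residual sum of squares from the evaluator-indicator regression.
    ss_resid = sum(
        (y - means[name]) ** 2 for name, vals in data.items() for y in vals
    )
    sigma2 = ss_resid / (n_total - len(data))
    if sigma2 <= 0:
        raise ValueError("measurements must vary within evaluators")
    return {
        name: (means[name] - reference, math.sqrt(sigma2 / len(vals)))
        for name, vals in data.items()
    }


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values, q=0.05):
    """Return the set of keys rejected by the Benjamini-Hochberg rule."""
    ranked = sorted(p_values.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= q * rank / m:
            cutoff = rank  # keep the largest rank that passes
    return {name for name, _ in ranked[:cutoff]}


def detect_outlier_evaluators(data, q=0.05):
    """Stage 2: test each effect against zero and control the FDR at q."""
    effects = evaluator_effects(data)
    p_values = {
        name: two_sided_p(est / se) for name, (est, se) in effects.items()
    }
    return sorted(benjamini_hochberg(p_values, q))


# Toy data (made up for illustration): five evaluators centred at 0,
# and one evaluator whose measurements are shifted upward by 5.
pattern = [-1.0, 0.0, 1.0] * 10
data = {f"E{i}": list(pattern) for i in range(1, 6)}
data["E6"] = [y + 5.0 for y in pattern]
print(detect_outlier_evaluators(data))  # prints ['E6']
```

The median reference is used in this sketch so that a single extreme evaluator does not pull the comparison point toward itself and cause the remaining evaluators to be flagged. A regression with participant-level covariates, as the abstract implies, would replace the simple means used here.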

Comparison of Outlier Detection Methods in NEAT Design

Outlier Detection Using t-test in Rasch IRT Equating under NEAT Design

Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating

Non-parametric Tests for the Tail Equivalence Via Empirical Likelihood

Evaluating Robust Scale Transformation Methods with Multiple Outlying Common Items under IRT True Score Equating

Comparison of Different Response Time Outlier Exclusion Methods: A Simulation Study

A Two-Stage Approach to Differentiating Normal and Aberrant Behavior in Computer Based Testing

New Robust Scale Transformation Methods in the Presence of Outlying Common Items

Analytical method for detecting outlier evaluators

Test Fairness: Examining Differential Functioning of the Reading Comprehension Section of the GSEEE in China

Comparison of anchor-based methods for estimating thresholds of meaningful within-patient change using simulated PROMIS PF 20a data under various joint distribution characteristic conditions

To Weight or Not to Weight? Balancing Influence of Initial Items in Adaptive Testing

On A Robust Test for Setar-Type Nonlinearity in Time Series Analysis

Higher-Order Asymptotics and Its Application to Testing the Equality of the Examinee Ability Over Two Sets of Items

A New Outlier Detection Method Considering Outliers As Model Errors

Fairness-aware Outlier Ensemble

Applying Unidimensional and Multidimensional Item Response Theory Models in Testlet-Based Reading Assessment

A Comparative Review of Methods for Comparing Means Using Partially Paired Data

Exploring the Impact of Outlier Variability on Anomaly Detection Evaluation Metrics

Comparative Study of Neighbor-based Methods for Local Outlier Detection

Hypothesis Testing for Detecting Outlier Evaluators