Abstract: Introduction. The National Health and Nutrition Examination Survey (NHANES) is a periodic survey conducted by the National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention. NHANES is designed to provide national estimates of the health and nutritional status of the civilian noninstitutionalized population. Sociodemographic and medical history information is obtained through household interviews, while physical measurements, physiological tests, and biochemical measurements are collected through standardized physical examinations in mobile examination centers (MECs). The ongoing Third NHANES, or NHANES III, is the seventh in an extensive series of periodic health and nutrition surveys that NCHS has conducted since 1960. The current NHANES III, with a sample of approximately 40,000 persons 2 months of age and older, has been divided into two 3-year national samples. Phase 1 was conducted from October 1988 to October 1991, while Phase 2 will continue until October 1994. NHANES III is based on a complex, multistage area probability sample design and includes an oversample of children under 5 years of age, older Americans aged 60+ years, and both black and Mexican-American persons. Details of the sample design of NHANES III have been previously published (1). NHANES III, like most sample surveys, experiences both total (unit) nonresponse and item nonresponse. The missing data problem for NHANES III is somewhat unusual because sample persons can refuse to participate at three different stages of the data collection. Unit nonresponse rates for NHANES III Phase 1 ranged from 0% for the screening interview (with about 7% of the screening data obtained from neighbors) to 14% for the household interview to 22% for the physical examination. It is common survey practice to compensate for unit nonresponse through weighting class adjustments (2-5).
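A weighting class adjustment of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the NHANES III procedure: the class labels, weights, and field names (`cls`, `weight`, `responded`) are hypothetical. Within each class, the weights of respondents are inflated so that they also represent the nonrespondents in that class.

```python
from collections import defaultdict


def weighting_class_adjust(units):
    """Inflate respondent weights within each weighting class.

    units: list of dicts with hypothetical keys 'cls', 'weight', 'responded'.
    Returns a new list containing only respondents, each with 'adj_weight'.
    """
    total = defaultdict(float)  # sum of weights of all sampled units per class
    resp = defaultdict(float)   # sum of weights of respondents per class
    for u in units:
        total[u["cls"]] += u["weight"]
        if u["responded"]:
            resp[u["cls"]] += u["weight"]

    adjusted = []
    for u in units:
        if u["responded"]:
            # Adjustment factor: weighted sample total / weighted respondent total.
            factor = total[u["cls"]] / resp[u["cls"]]
            adjusted.append({**u, "adj_weight": u["weight"] * factor})
    return adjusted


# Made-up example data, for illustration only.
sample = [
    {"id": 1, "cls": "A", "weight": 100.0, "responded": True},
    {"id": 2, "cls": "A", "weight": 100.0, "responded": False},
    {"id": 3, "cls": "B", "weight": 50.0, "responded": True},
    {"id": 4, "cls": "B", "weight": 150.0, "responded": True},
]
adjusted = weighting_class_adjust(sample)
```

In this sketch the adjusted respondent weights sum to the original weighted sample total. A class with no respondents contributes nothing here; in practice such a class would have to be collapsed with a similar one.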
The adjustments to reduce potential nonresponse bias for NHANES III Phase 1 have been previously described (6). In addition to unit nonresponse, various levels of item nonresponse occur in NHANES III. In Phase 1, item nonresponse of 1-5% occurred for the household interview questions. In addition, some components of the physical examination were not successfully completed for all sample persons. Furthermore, some examination components include a number of individual measurements (e.g., body measurements), some of which may be missing. Item nonresponse rates for the individual components ranged from 5-8%. Generally, item nonresponse is handled by some type of imputation. Imputation methods fill in missing items with values from similar units in the dataset or with predicted values obtained from a model, thus making it possible to analyze the data as if they were complete. Some common methods of imputation used in surveys include deductive imputation, mean imputation, hot deck imputation, cold deck imputation, regression imputation, stochastic regression, multiple imputation, and composite imputation methods (7). Each of these imputation methods has relative advantages and disadvantages. The method of choice for a survey may depend upon particular circumstances, including the type of survey data and the availability of computer hardware and software. In addition to allowing complete-data methods of analysis, multiple imputation allows one to assess the impact of missing data uncertainty on the variances and to revise estimates of variance to reflect the additional uncertainty (8). In previous NHANES surveys, imputation for item nonresponse was done on an ad hoc basis. The purpose of this paper is to describe research conducted to compare alternative missing data adjustment methods for selected survey components in NHANES III Phase 1 based on single and multiple imputation methodology.
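As one illustration of filling in missing items "with values from similar units," a random hot deck within imputation classes can be sketched as below. The record layout, the `age_group` class variable, and the `height_cm` item are hypothetical and chosen only for the example; the source does not specify how the imputation classes were formed.

```python
import random


def hot_deck_impute(records, item, class_key, rng):
    """Random hot deck: replace a missing item (None) with the value of a
    randomly chosen donor from the same imputation class.

    Returns new records; the input list is left unchanged. Records whose
    class has no donor keep their missing value.
    """
    # Build the donor pool for each class from observed values.
    donors = {}
    for r in records:
        if r[item] is not None:
            donors.setdefault(r[class_key], []).append(r[item])

    completed = []
    for r in records:
        value = r[item]
        if value is None:
            pool = donors.get(r[class_key])
            if pool:
                value = rng.choice(pool)
        completed.append({**r, item: value})
    return completed


# Made-up records, for illustration only.
records = [
    {"id": 1, "age_group": "child", "height_cm": 110.0},
    {"id": 2, "age_group": "child", "height_cm": None},
    {"id": 3, "age_group": "child", "height_cm": 118.0},
    {"id": 4, "age_group": "adult", "height_cm": 170.0},
    {"id": 5, "age_group": "adult", "height_cm": None},
]
completed = hot_deck_impute(records, "height_cm", "age_group", random.Random(0))
```

Because the imputed value is drawn from observed donors rather than set to the class mean, a hot deck tends to preserve the spread of the item better than mean imputation, although a single imputation of either kind still treats the filled-in values as if they had been observed.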
The information contained in this paper, in part, is based on a special project carried out during 1992 and contained in a final report by Datametrics Research, Inc. (9).
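The way multiple imputation revises variance estimates to reflect missing data uncertainty can be sketched with the standard combining rules usually attributed to Rubin. The source does not state which formulas were used, so this is an assumption: with m completed datasets, the point estimate is the average of the m estimates, and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The function name and the input values are hypothetical.

```python
from statistics import mean, variance


def combine_multiple_imputations(estimates, variances):
    """Combine results from m completed datasets.

    estimates: point estimates, one per imputed dataset.
    variances: complete-data variance estimates, one per imputed dataset.
    Returns (combined estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)      # combined point estimate
    u_bar = mean(variances)      # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Made-up inputs from m = 3 imputations, for illustration only.
q_bar, total_var = combine_multiple_imputations([10.0, 12.0, 14.0], [4.0, 4.0, 4.0])
```

The between-imputation term is what a single imputation cannot supply: if the m estimates agree closely, the total variance is close to the complete-data variance, and it grows as the imputations disagree.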

nhanesA: achieving transparency and reproducibility in NHANES research

Update on NHANES Dietary Data: Focus on Collection, Release, Analytical Considerations, and Uses to Inform Public Policy

Workshop summary: building an NHANES for the future

Organizing and Analyzing the Activity Data in NHANES

NHANES-GCP: Leveraging the Google Cloud Platform and BigQuery ML for reproducible machine learning with data from the National Health and Nutrition Examination Survey

The NHANES III Multiple Imputation Project

Data Resource Profile: The Korea National Health and Nutrition Examination Survey (KNHANES)

Reproducible Research: A Retrospective

Exploratory Electronic Health Record Analysis with Ehrapy

U.S. weight trends: a longitudinal analysis of an NIH-partnered dataset

Transparency and reproducibility in the Adolescent Brain Cognitive Development (ABCD) study

Korea National Health and Nutrition Examination Survey, 20th anniversary: accomplishments and future directions

Adjustment for Biased Sampling Using NHANES Derived Propensity Weights

A comparison of imputation techniques in the third national health and nutrition examination survey

Democratizing Native Hawaiian and Pacific Islander Data: Examining Community Accessibility of Data for Health and the Social Drivers of Health

Characterizing nutrient patterns of food items in adolescent diet using data from a novel citizen science project and the US National Health and Nutrition Examination Survey (NHANES)

Transparency in epidemiological analyses of cohort data - A case study of the Norwegian Mother, Father, and Child cohort study (MoBa)

The Design and Implementation of the 2016 National Survey of Children’s Health

Harnessing the Potential of the American Community Survey: Delving into Methods of Data Delivery

rEHR: An R package for manipulating and analysing Electronic Health Record data