Performance Evaluation of the Verily Numetric Watch sleep suite for digital sleep assessment against in-lab polysomnography

Benjamin W Nelson, Sohrab Saeb, Poulami Barman, Nishant Verma, Hannah Allen, Massimiliano de Zambotti, Fiona C. Baker, Nicole Arra, Niranjan Sridhar, Shannon Sullivan, Scooter Plowman, Erin Rainaldi, Ritu Kapur, Sooyoon Shin
DOI: https://doi.org/10.1101/2024.09.10.24313425
2024-09-11
Abstract: The goal was to evaluate the performance of a multi-sensor wrist-worn wearable device for generating 12 sleep measures in a diverse cohort. Our study technology was the sleep suite of the Verily Numetric Watch (VNW), using polysomnography (PSG) as reference during 1-night simultaneous recording in a sample of N=41 (18 male, age range: 18-78 years). We performed epoch-by-epoch comparisons for all measures. The key analyses were: core accuracy metrics for sleep vs wake classification; bias for continuous measures (Bland-Altman); Cohen's kappa and accuracy for sleep stage classifications; and mean count difference and linearly weighted Cohen's kappa for the count metric. In addition, we performed subgroup analyses by sex, age, skin tone, body mass index, and arm hair density. Sensitivity and specificity (95% CI) of sleep versus wake classification were 0.97 (0.96, 0.98) and 0.66 (0.61, 0.71), respectively. Mean total sleep time bias was 14.55 minutes (1.61, 27.16); wake after sleep onset, -11.77 minutes (-23.89, 1.09); sleep efficiency, 3.15% (0.68, 5.57); sleep onset latency, -3.24 minutes (-9.38, 3.57); light-sleep duration, 3.78 minutes (-7.04, 15.06); deep-sleep duration, 3.91 minutes (-4.59, 12.60); and rapid eye movement-sleep duration, 6.94 minutes (0.57, 13.04). The median difference in number of awakenings was 0.00 (0.00, 1.00), and the overall accuracy of sleep stage classification was 0.78 (0.51, 0.88). Most measures showed statistically significant proportional biases and/or heteroscedasticity. Subgroup results appeared largely consistent with the overall group, although small samples preclude strong conclusions. These results support the use of VNWs for classifying sleep versus wake and sleep stages, and for deriving related overnight sleep measures.
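For readers unfamiliar with the agreement statistics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, Bland-Altman bias, and unweighted Cohen's kappa are commonly computed. It is not the authors' analysis code. The function names, the toy data, the coding of sleep as 1 and wake as 0, the device-minus-reference sign convention, and the 1.96 SD limits of agreement are all assumptions made for illustration; the abstract does not state how its confidence intervals were derived, and none are computed here.

```python
from statistics import mean, stdev


def sensitivity_specificity(reference, predicted):
    """Epoch-level sleep/wake agreement; sleep (1) is the positive class (assumed coding)."""
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # sleep scored as sleep
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # sleep scored as wake
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # wake scored as wake
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # wake scored as sleep
    return tp / (tp + fn), tn / (tn + fp)


def bland_altman(device, reference):
    """Mean bias (device minus reference, an assumed convention) and 1.96 SD limits."""
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    expected = sum((rater_a.count(k) / n) * (rater_b.count(k) / n) for k in labels)
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: ten epochs scored by the reference and by a device.
psg_epochs = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0]
device_epochs = [1, 1, 1, 1, 1, 0, 1, 1, 0, 0]
sens, spec = sensitivity_specificity(psg_epochs, device_epochs)
kappa = cohens_kappa(psg_epochs, device_epochs)

# Hypothetical total sleep time (minutes) for three participants.
bias, lower, upper = bland_altman([400, 420, 385], [390, 400, 385])
```

In this toy example the device misses one wake epoch, so sensitivity stays at 1.0 while specificity drops to 0.75, the same direction of imbalance the abstract reports (high sensitivity, lower specificity). A positive Bland-Altman bias under the assumed convention means the device overestimates the measure relative to the reference.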