Phenotypes and Subphenotypes of Patients with COVID-19

Xiaofeng Wang, Lara Jehi, Xinge Ji, Peter J. Mazzone
DOI: https://doi.org/10.1016/j.chest.2021.01.057
IF: 9.6
2021-01-01
Chest
Abstract:
Background: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
Research Question: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes?
Study Design and Methods: We included adult patients (≥ 18 years) positive for laboratory-confirmed SARS-CoV-2 infection from a prospective COVID-19 registry database in the Cleveland Clinic Health System in Ohio and Florida. The patients were split into training and testing sets. Using latent class analysis (LCA), we first identified phenotypic clusters of patients with COVID-19 based on demographics, comorbidities, and presenting symptoms. We then identified subphenotypes of hospitalized patients with additional blood biomarker data measured on hospital admission. The associations of phenotypes/subphenotypes and clinical outcomes were investigated. Multivariable prediction models were established to predict assignment to the LCA-defined phenotypes and subphenotypes and then evaluated on an independent testing set.
Results: We analyzed data for 20,572 patients. Seven phenotypes were identified on the basis of different profiles of presenting COVID-19 symptoms and existing comorbidities, including the following groups: young, no symptoms; young, symptoms; middle-aged, no symptoms; middle-aged, symptoms; middle-aged, comorbidities; old, no symptoms; and old, symptoms. The rates of inpatient hospitalization for the phenotypes were significantly different (P < .001). Five subphenotypes were identified for the subgroup of hospitalized patients, including the following subgroups: young, elevated WBC and platelet counts; middle-aged, lymphopenic with elevated C-reactive protein; middle-aged, hyperinflammatory; old, leukopenic with comorbidities; and old, hyperinflammatory with kidney dysfunction.
The hospital mortality and the times from hospitalization to ICU transfer or death were significantly different (P < .001). The models for predicting the LCA-defined phenotypes and subphenotypes showed high discrimination (concordance index, 0.92 and 0.91).
Interpretation: Hypothesis-free LCA-defined phenotypes and subphenotypes of patients with COVID-19 can be identified. These may help clinical investigators conduct stratified analyses in clinical trials and assist basic science researchers in characterizing the pathobiology of the spectrum of COVID-19 presentations.

The clinical spectrum of COVID-19 is broad, ranging from asymptomatic infection to severe pneumonia with respiratory failure. Many studies have reported on the clinical characteristics of COVID-19 [1: Grasselli G. Zangrillo A. Zanella A. et al. COVID-19 Lombardy ICU Network. Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy region, Italy. JAMA. 2020; 323: 1574-1581; 2: Richardson S. Hirsch J.S. Narasimhan M. et al. Presenting characteristics, comorbidities, and outcomes among 5700 patients hospitalized with COVID-19 in the New York City area. JAMA. 2020; 323: 2052-2059; 3: Wang D. Hu B. Hu C. et al. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA. 2020; 323: 1061-1069; 4: Zhou F. Yu T. Du R. et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020; 395: 1054-1062; 5: Guan W.-J. Ni Z.-Y. Hu Y. et al. China Medical Treatment Expert Group for Covid-19. Clinical characteristics of coronavirus disease 2019 in China. N Engl J Med. 2020; 382: 1708-1720]. The largest cohort study reported to date included 72,314 cases from the Chinese Center for Disease Control and Prevention [6: Wu Z. McGoogan J.M. Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: summary of a report of 72314 cases from the Chinese Center for Disease Control and Prevention. JAMA. 2020; 323: 1239-1242]. In this study, 81% of patients had mild to moderate disease (mild symptoms up to mild pneumonia), 14% had severe disease (dyspnea, hypoxia, or > 50% lung involvement on imaging), and 5% had critical disease (respiratory failure, shock, or multiorgan system dysfunction). These findings drew early broad brushstrokes of the clinical disease spectrum. A more complex picture later emerged, with symptoms extending beyond the respiratory system, and unpredictable clinical deterioration in patients initially thought to have mild disease. This uncovered a need for a more sophisticated classification mirroring the heterogeneous clinical progression.

In parallel, several efforts have evaluated laboratory-based disease biomarkers. Guan et al [5] found that elevated serum alanine aminotransferase and aspartate aminotransferase levels, lactate dehydrogenase, C-reactive protein (CRP), and ferritin levels may be associated with greater illness severity manifested by ARDS and acute kidney injury, resulting in a higher mortality rate. The clinical and biological heterogeneity of COVID-19 [7: Rello J. Storti E. Belliato M. Serrano R. Clinical phenotypes of SARS-CoV-2: implications for clinicians and researchers. Eur Respir J. 2020; 55: 2001028; 8: Gattinoni L. Chiumello D. Caironi P. et al. COVID-19 pneumonia: different respiratory treatments for different phenotypes? Intensive Care Med. 2020; 46: 1099-1102] has led to an incomplete categorization of disease phenotypes. In contrast, research in cancer, ARDS, and asthma has been able to identify disease phenotypes with important therapeutic implications [10: Sakr L. Small D. Kasymjanova G. Suissa S. Ernst P. Phenotypic heterogeneity of potentially curable non-small-cell lung cancer: cohort study with cluster analysis. J Thorac Oncol. 2015; 10: 754-761; 11: Calfee C.S. Delucchi K. Parsons P.E. et al. Subphenotypes in acute respiratory distress syndrome: latent class analysis of data from two randomised controlled trials. Lancet Respir Med. 2014; 2: 611-620; 9: Depner M. Fuchs O. Genuneit J. et al. Clinical and epidemiologic phenotypes of childhood asthma. Am J Respir Crit Care Med. 2014; 189: 129-138]. At present, a variety of therapeutic options for COVID-19 are under investigation.
Recognizing different phenotypes and subphenotypes of COVID-19 might help guide basic, clinical, and translational research efforts. Such an understanding could help clinicians and researchers stratify patients for clinical trials and customize therapy.

Latent class analysis (LCA), a subset of structural equation modeling, is a well-validated statistical technique for identifying unmeasured class membership among subjects, using categorical and/or continuous observed variables [12: Hagenaars J.A. McCutcheon A.L. Applied Latent Class Analysis. Cambridge University Press, Cambridge 2002]. It uses mixture modeling to find the best-fitting model under the assumption that individuals can be divided into subgroups based on an unobservable construct. The subgroups are called latent classes. LCA is an unsupervised analysis in that it asks whether there are subgroups of individuals defined by a combination of variables, without mandating consideration of an outcome. LCA has been successfully used in respiratory and critical care medicine, for example, in the identification of phenotypes of childhood asthma [9] and subphenotypes of ARDS [11].

Despite widespread recognition of the heterogeneity within COVID-19, little work has rigorously examined whether subphenotypes exist and, if so, what they are. This article aims to take advantage of the wealth of clinical and blood biomarker data available from the prospective COVID-19 Registry of the Cleveland Clinic Health System [13: Jehi L. Ji X. Milinovich A. et al. Individualizing risk prediction for positive COVID-19 testing: results from 11,672 patients. Chest. 2020; 158: 1364-1375] by using latent class modeling approaches to identify phenotypes and subphenotypes of patients with COVID-19 and to test their association with clinical outcomes.

A prospective COVID-19 registry database was set up in March 2020 to align data collection for research with clinical care of all patients who are tested for COVID-19 in the Cleveland Clinic Health System. Data capture was facilitated by creating standardized clinical templates that are implemented across the health care system [13]. Study data were collected and managed using Research Electronic Data Capture (REDCap; Vanderbilt University) tools hosted at the Cleveland Clinic. Registry variables were chosen to reflect the available literature on COVID-19 disease characterization, progression, and treatment. The study was approved by the Cleveland Clinic Institutional Review Board. The requirement for written informed consent was waived.

We included adult patients (age, ≥ 18 years) with confirmed SARS-CoV-2 infection in the Cleveland Clinic Health System in the United States between March 12 and October 31, 2020. COVID-19 was confirmed by reverse transcription-polymerase chain reaction for SARS-CoV-2. The testing protocols were previously described [13]. Patient demographics, comorbidities, presenting symptoms, and medications were retrieved and analyzed.
Data on comorbidities including cancer, COPD/emphysema, asthma, diabetes mellitus, hypertension, coronary artery disease, transplantation, multiple sclerosis, inflammatory bowel disease, and immunosuppressive disease were extracted from the electronic health record (Epic; Epic Systems Corporation). Immunosuppressive disease was defined on the basis of the Agency for Healthcare Research and Quality patient safety indicator (Appendix I: Immunocompromised State Diagnosis and Procedure Codes [14: Agency for Healthcare Research and Quality (AHRQ). AHRQ Quality Indicators: Patient Safety Indicators (PSI) Appendix I: Immunocompromised State Diagnosis and Procedure Codes. https://qualityindicators.ahrq.gov/ICD10/default.aspx. Date accessed: February 5, 2021]). For hospitalized patients, routine blood examination results from hospital admission were extracted, including CBC count, basic metabolic panel, coagulation profile, and renal and liver function tests. We excluded blood biomarkers if they were missing for ≥ 30% of subjects. We finally selected 17 candidate blood biomarkers based on the current literature [1-6]. They represented a diverse range of biological processes, including absolute lymphocyte count, absolute neutrophil count, albumin, alanine aminotransferase, alkaline phosphatase, BUN, chloride, CRP, creatinine, ferritin, hematocrit, hemoglobin, platelets, potassium, RBC distribution width, total bilirubin, and WBC count.

For all patients who tested positive for SARS-CoV-2, the primary outcome measured was the need for inpatient hospitalization. For the subgroup cohort of hospitalized patients, the primary outcome measured was a composite of transfer to an ICU or in-hospital mortality (World Health Organization [WHO] COVID-19 Ordinal Scale for Clinical Improvement, scores 6-8) vs no transfer to ICU and alive (WHO Ordinal Scale for Clinical Improvement, scores 3-5) during hospitalization. The secondary outcome for hospitalized patients was in-hospital mortality only. Patients still hospitalized at the time of analysis were censored. The outcomes for all patients were followed up until December 5, 2020.

The study patients were split into training and testing sets based on the time their information was entered into the database.
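As a minimal sketch of a time-based split (not the authors' code), the Python below partitions records by entry date. The record layout and the field name `entered` are hypothetical; the cutoff dates are the ones the study states for its training and testing samples.

```python
from datetime import date

# Dates stated in the study; the record layout below is hypothetical.
TRAIN_BEFORE = date(2020, 8, 28)   # training: entered before this date
TEST_START = date(2020, 8, 29)     # testing: entered from this date ...
TEST_END = date(2020, 10, 31)      # ... through this date (inclusive)

def split_by_entry_date(records):
    """Return (train, test) lists according to each record's entry date."""
    train = [r for r in records if r["entered"] < TRAIN_BEFORE]
    test = [r for r in records if TEST_START <= r["entered"] <= TEST_END]
    return train, test

# Toy records for illustration only (not registry data).
records = [
    {"id": 1, "entered": date(2020, 3, 12)},
    {"id": 2, "entered": date(2020, 8, 27)},
    {"id": 3, "entered": date(2020, 8, 29)},
    {"id": 4, "entered": date(2020, 10, 31)},
]
train, test = split_by_entry_date(records)
print([r["id"] for r in train], [r["id"] for r in test])
```

Splitting by time rather than at random means the testing set resembles later, unseen patients, which is a stricter check on a prediction model than a random holdout.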
Patients whose data were collected before August 28, 2020, were considered the training sample, whereas patients whose data were collected between August 29 and October 31, 2020, were considered the testing sample.

The study variables were described using sample median with interquartile range or number with proportion. Categorical variables were compared using the Pearson χ2 test or Fisher exact test, whereas continuous variables were compared using the Mann-Whitney U test.

We developed two latent class models in this study: the first one was for all patients who tested positive; the second one was for hospitalized patients only. The candidate class-defining variables in the first model included demographics, comorbidities, and presenting symptoms. BMI was excluded because it was missing for 47% of the cohort. The candidate class-defining variables in the second model included five demographic/clinical variables (age, sex, the presence of cancer, the presence of COPD/emphysema, and the number of other comorbidities), and the 17 blood biomarker measures at hospital admission (described in the section "Data Extraction"). The comorbidities selected were based on available literature showing that patients infected with SARS-CoV-2 who had cancer or COPD had poor outcomes with a high occurrence of clinically severe events and mortality [15: Zhang L. Zhu F. Xie L. et al. Clinical characteristics of COVID-19-infected cancer patients: a retrospective case study in three hospitals within Wuhan, China. Ann Oncol. 2020; 31: 894-901; 16: Lippi G. Henry B.M. Chronic obstructive pulmonary disease is associated with severe coronavirus disease 2019 (COVID-19). Respir Med. 2020; 167: 105941].

Missing value imputation for biomarkers was compared using median imputation and multivariate imputation by chained equations, and median imputation was used in the analysis. Log-transformation was performed on continuous variables, if needed, to reduce or remove the skewness of the original data. Standardization was then applied to put all continuous variables on the same scale. Latent class variable selection was performed using the method of Fop et al [17: Fop M. Smart K.M. Murphy T.B. Variable selection for latent class analysis with application to low back pain diagnosis. Ann Appl Stat. 2017; 11: 2080-2110], because removing unnecessary variables and parameters can improve classification performance and the precision of parameter estimates. Parameters of the latent class models were estimated using expectation-maximization methods [18: Visser I. Speekenbrink M. depmixS4: an R package for hidden Markov models. J Stat Softw. 2010; 36: 1-21]. Latent classes were determined without consideration of clinical outcomes. The best-fitting models were determined on the basis of the Bayesian information criterion (BIC). Individuals were then assigned to the class for which they had the highest posterior probability of belonging. Kruskal-Wallis, Pearson χ2, and Fisher exact tests, as appropriate, were performed to compare the clinical, laboratory, and outcome variables between the resulting classes. Kaplan-Meier survival analysis was conducted to describe survival characteristics among the different subclasses in hospitalized patients. Finally, we built multinomial logistic regression models by the bias correction method [19: Bolck A. Croon M. Hagenaars J. Estimating latent structure models with categorical variables: one-step versus three-step estimators. Pol Anal. 2004; 12: 3-27; 20: Vermunt J.K. Latent class modeling with covariates: two improved three-step approaches. Pol Anal. 2010; 18: 450-469] to predict latent class membership. Bootstrap internal validation was conducted to assess the discriminative ability of the models. All analyses were performed with the R software program (version 3.6.3; R Foundation for Statistical Computing) and SAS software (version 9.4; SAS Institute). The level of statistical significance was set at P < .05 (two-tailed).

There were 285,783 patients who presented with symptoms of respiratory tract infection or had close contact with someone with confirmed COVID-19 from March 12 to October 31, 2020, and who underwent SARS-CoV-2 testing. Of the 21,978 patients who tested positive for SARS-CoV-2, 20,572 adult patients served as the study population (Fig 1). The patients were split into training and testing sets, with 11,818 patients included in the model training set and 8,754 patients serving as the testing cohort. A total of 3,548 patients (2,655 and 893 patients in the training and testing sets, respectively) were admitted to a hospital in the Cleveland Clinic Health System and served as the subpopulation in the subgroup analysis. Baseline demographic and clinical characteristics of the training and testing cohorts in the study are presented in e-Tables 1 through 4.

The best-fitting model, as indicated by the BIC, was a seven-class model using one demographic variable, six symptom variables, and eight comorbidity variables: the demographic variable was age; the symptom variables were the presence of cough, fever, fatigue, sputum production, shortness of breath, and diarrhea; the comorbidity variables were the presence of asthma, COPD/emphysema, diabetes, hypertension, coronary artery disease, heart failure, cancer, and immunosuppressive disease.
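To make the model-selection step concrete, here is a minimal sketch, assuming binary indicators only (for example, symptom present or absent), of fitting latent class models by expectation-maximization and comparing class counts by BIC. It runs on simulated data and is not the study's implementation: the study also used age and continuous biomarkers and ran its analyses in R and SAS. The function names `e_step` and `fit_lca`, the simulated prevalences (0.85 and 0.05), and the iteration count are all assumptions made for illustration.

```python
import math
import random

def e_step(X, weights, theta):
    """Posterior class probabilities per subject and the total log-likelihood."""
    post, loglik = [], 0.0
    for x in X:
        logp = [
            math.log(w) + sum(math.log(t if xi else 1.0 - t) for xi, t in zip(x, th))
            for w, th in zip(weights, theta)
        ]
        m = max(logp)
        norm = m + math.log(sum(math.exp(v - m) for v in logp))  # log-sum-exp
        loglik += norm
        post.append([math.exp(v - norm) for v in logp])
    return post, loglik

def fit_lca(X, k, n_iter=100, seed=0):
    """Fit a k-class latent class model to binary data by EM; return fit and BIC."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [1.0 / k] * k
    # Random starting item probabilities, kept away from 0 and 1.
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    for _ in range(n_iter):
        post, _ = e_step(X, weights, theta)
        # M-step: update class sizes and item probabilities (clamped for stability).
        for c in range(k):
            nc = max(sum(p[c] for p in post), 1e-9)
            weights[c] = nc / n
            for j in range(d):
                num = sum(p[c] * x[j] for p, x in zip(post, X))
                theta[c][j] = min(max(num / nc, 1e-6), 1.0 - 1e-6)
    post, loglik = e_step(X, weights, theta)  # final posteriors at fitted values
    n_params = (k - 1) + k * d                # class weights + item probabilities
    bic = -2.0 * loglik + n_params * math.log(n)
    return {"weights": weights, "theta": theta, "post": post, "bic": bic}

# Simulated data: two hypothetical classes with 6 binary indicators each.
rng = random.Random(1)
truth = [i % 2 for i in range(300)]
X = [[int(rng.random() < (0.85 if z else 0.05)) for _ in range(6)] for z in truth]

fits = {k: fit_lca(X, k) for k in (1, 2, 3)}
bics = {k: f["bic"] for k, f in fits.items()}
best_k = min(bics, key=bics.get)  # lowest BIC is preferred
# Assign each subject to the class with the highest posterior probability.
assigned = [max(range(2), key=lambda c: p[c]) for p in fits[2]["post"]]
print(bics, best_k)
```

On well-separated simulated data like this, the two-class fit should have a much lower BIC than the one-class fit. Class labels are arbitrary, so agreement with the simulated truth has to be checked up to relabeling.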
Table 1 displays the group comparison results to help us understand the clinical characteristics that distinguished each phenotype.

Table 1. Comparison of Phenotypes of Adult Patients With Confirmed COVID-19 Identified by Latent Class Analysis

Variable | Overall (N = 11,818) | Class 1 (n = 3,803) | Class 2 (n = 1,594) | Class 3 (n = 988) | Class 4 (n = 2,503) | Class 5 (n = 1,310) | Class 6 (n = 926) | Class 7 (n = 694) | P Value
Demographics
Age, median (IQR), y | 50 (33-65) | 33 (26-44) | 34 (26-44) | 56 (45-66) | 61 (52-71) | 59 (50-67) | 74 (66-83) | 76 (66-86) | < .001
Sex, No. (%) | | | | | | | | | < .001
  Female | 6,500 (55) | 2,202 (58) | 887 (56) | 498 (50) | 1,402 (56) | 713 (54) | 454 (49) | 344 (50) |
  Male | 5,318 (45) | 1,601 (42) | 707 (44) | 490 (50) | 1,101 (44) | 597 (46) | 472 (51) | 350 (50) |
Race, No. (%) | | | | | | | | | < .001
  White | 6,484 (55) | 1,919 (50) | 808 (51) | 569 (58) | 1,470 (59) | 684 (52) | 587 (63) | 447 (64) |
  Black | 3,583 (30) | 1,156 (30) | 468 (29) | 251 (25) | 745 (30) | 470 (36) | 282 (30) | 211 (30) |
  Other | 1,751 (15) | 728 (19) | 318 (20) | 168 (17) | 288 (12) | 156 (12) | 57 (6.2) | 36 (5.2) |
Ethnicity, No. (%) | | | | | | | | | < .001
  Hispanic | 1,440 (12) | 574 (15) | 236 (15) | 139 (14) | 278 (11) | 144 (11) | 44 (4.8) | 25 (3.6) |
  Non-Hispanic | 9,231 (78) | 2,727 (72) | 1,137 (71) | 721 (73) | 2,056 (82) | 1,082 (83) | 858 (93) | 650 (94) |
  Unknown | 1,147 (9.7) | 502 (13) | 221 (14) | 128 (13) | 169 (6.8) | 84 (6.4) | 24 (2.6) | 19 (2.7) |
Smoking, No. (%) | | | | | | | | | < .001
  Current smoker | 874 (7.4) | 277 (7.3) | 121 (7.6) | 60 (6.1) | 185 (7.4) | 103 (7.9) | 70 (7.6) | 58 (8.4) |
  Former smoker | 2,834 (24) | 414 (11) | 283 (18) | 309 (31) | 624 (25) | 405 (31) | 466 (50) | 333 (48) |
  Nonsmoker | 6,139 (52) | 2,063 (54) | 1,004 (63) | 444 (45) | 1,267 (51) | 739 (56) | 348 (38) | 274 (39) |
  Unknown | 1,971 (17) | 1,049 (28) | 186 (12) | 175 (18) | 427 (17) | 63 (4.8) | 42 (4.5) | 29 (4.2) |
Presenting symptom, No. (%)
  Cough | 4,164 (35) | 139 (3.7) | 1,357 (85) | 958 (97) | 0 (0) | 1,077 (82) | 28 (3.0) | 605 (87) | < .001
  Fever | 3,286 (28) | 67 (1.8) | 1,032 (65) | 904 (91) | 1 (< 0.1) | 801 (61) | 33 (3.6) | 448 (65) | < .001
  Fatigue | 3,334 (28) | 43 (1.1) | 1,015 (64) | 988 (100) | 14 (0.6) | 653 (50) | 26 (2.8) | 595 (86) | < .001
  Sputum production | 2,381 (20) | 0 (0) | 653 (41) | 894 (90) | 0 (0) | 351 (27) | 0 (0) | 483 (70) | < .001
  Flu-like symptoms | 3,918 (33) | 202 (5.3) | 1,276 (80) | 951 (96) | 22 (0.9) | 907 (69) | 18 (1.9) | 542 (78) | < .001
  Shortness of breath | 2,682 (23) | 11 (0.3) | 715 (45) | 835 (85) | 0 (0) | 494 (38) | 49 (5.3) | 578 (83) | < .001
  Diarrhea | 2,178 (18) | 20 (0.5) | 573 (36) | 876 (89) | 2 (< 0.1) | 265 (20) | 15 (1.6) | 427 (62) | < .001
  Vomiting | 1,546 (13) | 34 (0.9) | 340 (21) | 689 (70) | 0 (0) | 168 (13) | 7 (0.8) | 308 (44) | < .001
Comorbidities, No. (%)
  Asthma | 1,727 (15) | 471 (12) | 225 (14) | 119 (12) | 305 (12) | 219 (17) | 241 (26) | 147 (21) | < .001
  COPD/emphysema | 725 (6.1) | 12 (0.3) | 0 (0) | 32 (3.2) | 100 (4.0) | 45 (3.4) | 310 (33) | 226 (33) | < .001
  Diabetes | 2,108 (18) | 44 (1.2) | 18 (1.1) | 138 (14) | 570 (23) | 452 (35) | 527 (57) | 359 (52) | < .001
  Hypertension | 4,638 (39) | 73 (1.9) | 1 (< 0.1) | 384 (39) | 1,607 (64) | 1,014 (77) | 898 (97) | 661 (95) | < .001
  Coronary artery disease | 1,120 (9.5) | 0 (0) | 0 (0) | 13 (1.3) | 116 (4.6) | 82 (6.3) | 543 (59) | 366 (53) | < .001
  Heart failure | 879 (7.4) | 10 (0.3) | 0 (0) | 1 (0.1) | 32 (1.3) | 36 (2.7) | 458 (49) | 342 (49) | < .001
  Cancer | 1,208 (10) | 35 (0.9) | 12 (0.8) | 78 (7.9) | 377 (15) | 192 (15) | 297 (32) | 217 (31) | < .001
  Transplant history | 90 (0.8) | 2 (< 0.1) | 3 (0.2) | 5 (0.5) | 17 (0.7) | 11 (0.8) | 32 (3.5) | 20 (2.9) | < .001
  Multiple sclerosis | 78 (0.7) | 15 (0.4) | 8 (0.5) | 7 (0.7) | 20 (0.8) | 15 (1.1) | 6 (0.6) | 7 (1.0) | .075
  Connective tissue disease | 540 (4.6) | 30 (0.8) | 45 (2.8) | 55 (5.6) | 75 (3.0) | 139 (11) | 87 (9.4) | 109 (16) | < .001
  Inflammatory bowel disease | 243 (2.1) | 53 (1.4) | 35 (2.2) | 35 (3.5) | 27 (1.1) | 39 (3.0) | 21 (2.3) | 33 (4.8) | < .001
  Immunosuppressive disease | 1,208 (10) | 83 (2.2) | 36 (2.3) | 61 (6.2) | 221 (8.8) | 151 (12) | 420 (45) | 236 (34) | < .001
Outcome, No. (%)
  Hospitalization | 2,655 (22) | 375 (9.9) | 156 (9.8) | 221 (22) | 612 (24) | 434 (33) | 448 (48) | 409 (59) | < .001

P values are based on Kruskal-Wallis test, Pearson χ2 test, or Fisher exact test as appropriate. IQR = interquartile range.
Phenotype class 1 (young, no symptoms; 32% of the sample) and phenotype class 2 (young, symptoms; 14% of the sample) contained mostly young adults, with median ages 33 and 34 years, respectively. Patients in class 1 had very few COVID-19 symptoms and comorbidities. Patients in class 2 also had few comorbidities but did have typical COVID-19 symptoms, such as cough (85%) and fever (65%).

Phenotype class 3 (middle-aged, symptoms; 8% of the sample), phenotype class 4 (middle-aged, no symptoms; 21% of the sample), and phenotype class 5 (middle-aged, comorbidities; 11% of the sample) were populated mainly by middle-aged individuals, with median ages 56, 61, and 59 years, respectively. Patients in class 3 had almost all of the common COVID-19 symptoms, whereas patients in class 4 had nearly no COVID-19 symptoms. In contrast, in class 5 some symptoms were common (cough, 82%; fever, 61%; fatigue, 50%), but some were not (sputum production, 27%; shortness of breath, 38%; diarrhea, 20%; vomiting, 13%). Comorbidities were more common in the three middle-aged groups than in the young adult classes. The frequencies of hypertension, diabetes, and cancer in class 4 and class 5 were much higher than in class 3.

Phenotype class 6 (old, no symptoms; 8% of the sample) and phenotype class 7 (old, symptoms; 6% of the sample) were populated with individuals with median ages 74 and 76 years, respectively. Patients in class 6 had few COVID-19 symptoms, whereas those in class 7 had almost all of the common symptoms. Both classes had high frequencies of comorbidities.

The χ2 analysis showed a significant association between the LCA-defined phenotypes and inpatient hospitalization (P < .001). The rates of inpatient hospitalization for the seven phenotypes were 9.9%, 9.8%, 22%, 24%, 33%, 48%, and 59%, respectively.
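The hospitalization rates and the direction of the χ2 result can be checked from the counts in Table 1. The sketch below is an illustration, not the authors' analysis code: it recomputes each class's rate and a Pearson χ2 statistic for the 2 x 7 table of hospitalized vs not hospitalized by class. The helper name `pearson_chi2` is an assumption, and the statistic is compared with a standard tabulated critical value instead of computing an exact P value.

```python
# Class sizes and hospitalized counts transcribed from Table 1 (training set).
class_sizes = [3803, 1594, 988, 2503, 1310, 926, 694]
hospitalized = [375, 156, 221, 612, 434, 448, 409]

# Hospitalization rate per class, in percent.
rates = [100.0 * h / n for h, n in zip(hospitalized, class_sizes)]

def pearson_chi2(successes, totals):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    total_n = sum(totals)
    p_overall = sum(successes) / total_n
    stat = 0.0
    for s, n in zip(successes, totals):
        # Two cells per class: hospitalized and not hospitalized.
        for observed, expected in ((s, n * p_overall), (n - s, n * (1.0 - p_overall))):
            stat += (observed - expected) ** 2 / expected
    return stat, len(totals) - 1

chi2_stat, dof = pearson_chi2(hospitalized, class_sizes)

# The chi-square critical value for 6 degrees of freedom at alpha = .001 is about 22.46.
CRITICAL_001_DF6 = 22.46
print([round(r, 1) for r in rates])
print(dof, chi2_stat > CRITICAL_001_DF6)
```

A statistic above the critical value is consistent with the reported P < .001. The rates rise steadily from the young classes to the old, symptomatic class.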
The best-fitting model, as indicated by the BIC, was a five-class model using four clinical variables and seven blood biomarkers. The clinical variables included age, the presence of cancer, the presence of COPD/emphysema, and the number of other comorbidities, and the blood biomarkers included WBC count, lymphocyte count, CRP, creatinine, albumin, platelet count, and hemoglobin. Table 2 displays the class comparison results to illustrate the clinical and biological characteristics that distinguished each subphenotype. Figure 2 shows the latent profile plots of the subphenotypes identified.

Table 2. Comparison of Subphenotypes for Hospitalized Patients With COVID-19, Identified by Latent Class Analysis

Variable | Overall (N = 2,655) | No. of Patients With Missing Data | Subclass 1 (n = 363) | Subclass 2 (n = 568) | Subclass 3 (n = 524) | Subclass 4 (n = 672) | Subclass 5 (n = 528) | P Value
Demographics
Age, median (IQR), y | 63 (51-75) | 0 | 40 (28-53) | 54 (42-62) | 62 (53-71) | 75 (66-84) | 71 (62-80) | < .001
Sex, No. (%) | | 0 | | | | | | < .001
  Female | 1,320 (50) | | 223 (61) | 271 (48) | 244 (47) | 336 (50) | 246 (47) |
  Male | 1,335 (50) | | 140 (39) | 297 (52) | 280 (53) | 336 (50) | 282 (53) |
Race, No. (%) | | 0 | | | | | | < .001
  White | 1,368 (52) | | 141 (39) | 293 (52) | 277 (53) | 372 (55) | 285 (54) |
  Black | 1,083 (41) | | 188 (52) | 208 (37) | 200 (38) | 266 (40) | 221 (42) |
  Other | 204 (7.7) | | 34 (9.4) | 67 (12) | 47 (9.0) | 34 (5.1) | 22 (4.2) |
Ethnicity, No. (%) | | 0 | | | | | | < .001
  Hispanic | 226 (8.5) | | 34 (9.4) | 76 (13) | 58 (11) | 35 (5.2) | 23 (4.4) |
  Non-Hispanic | 2,388 (90) | | 321 (88) | 486 (86) | 451 (86) | 631 (94) | 499 (95) |
  Unknown | 41 (1.5) | | 8 (2.2) | 6 (1.1) | 15 (2.9) | 6 (0.9) | 6 (1.1) |
Smoking, No. (%) | | 0 | | | | | | < .001
  Current smoker | 192 (7.2) | | 47 (13) | 36 (6.3) | 23 (4.4) | 38 (5.7) | 48 (9.1) |
  Former smoker | 878 (33) | | 73 (20) | 127 (22) | 152 (29) | 328 (49) | 198 (38) |
  Nonsmoker | 1,2