Informatics-driven unsupervised learning of comorbidity clusters for COVID-19 reinfection risk: A finite mixture modeling approach
Department
Neurology
Document Type
Article
Publication Title
Informatics in Medicine Unlocked
Abstract
Purpose: This study applied an informatics-focused, unsupervised learning framework (finite mixture modeling) to determine whether distinct clusters of coexisting conditions among patients with coronavirus disease 2019 (COVID-19) are associated with multiple (reinfection) versus single infections.
Methods: We analyzed 42,974 patient records containing COVID-19 diagnoses using a machine learning classification algorithm to identify comorbidity profiles. Of nearly 850 recorded conditions, the 29 that occurred in at least 5 % of the sample were retained. We then compared patients with single versus multiple COVID-19 diagnoses within each profile.
Results: Three comorbidity profiles emerged. The first profile (Minimal Comorbidity) was the largest (67 % of sample) and was characterized by few additional conditions. Patients classified into this profile were also 20–30 years younger, on average, than members of the other profiles. The second (Elevated Select Comorbidity) profile comprised 24 % of the sample and was characterized by moderate-risk factors such as hypertension, hyperlipidemia, and acute respiratory failure. The third (High Comorbidity Burden) profile comprised 9 % of the sample and was characterized by conditions related to the cardiovascular, renal, endocrine, and respiratory systems. Among the high-burden group, 30 % experienced reinfection, versus only 9 % in the minimal group. Overall, patients with more extensive cardiometabolic or pulmonary conditions were more likely to experience repeated infection.
Conclusions: By identifying and characterizing comorbidity clusters, this informatics-based approach offers deeper insight into COVID-19 reinfection dynamics. The findings may support targeted prevention, data-driven resource allocation, and precision medicine strategies by highlighting subgroups at elevated risk. Moreover, the unsupervised modeling framework is potentially adaptable to other multifactorial conditions, underscoring its broader utility in medical informatics.
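Illustrative sketch: The workflow summarized in the abstract (retain conditions above a prevalence threshold, fit a finite mixture model to the binary comorbidity indicators, then compare reinfection rates across profiles) could be outlined roughly as follows. This is not the authors' implementation. It assumes a latent class model (a mixture of independent Bernoulli indicators) fitted by expectation-maximization, which is one common form of finite mixture model for binary data. All function names, parameters, and the smoothing constant are hypothetical, and the choice of the number of profiles is not shown.

```python
import math
import random


def filter_by_prevalence(records, min_prevalence=0.05):
    """Return conditions present in at least min_prevalence of records.

    records: list of per-patient condition lists (hypothetical input format).
    """
    n = len(records)
    counts = {}
    for conditions in records:
        for c in set(conditions):  # count each condition once per patient
            counts[c] = counts.get(c, 0) + 1
    return sorted(c for c, k in counts.items() if k / n >= min_prevalence)


def fit_bernoulli_mixture(X, n_classes, n_iter=100, seed=0, eps=1e-6):
    """Fit a mixture of independent Bernoullis (latent class model) by EM.

    X: list of rows of 0/1 indicators. Returns (weights, probs, posteriors),
    where posteriors come from the final E-step.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)]
             for _ in range(n_classes)]
    post = []
    for _ in range(n_iter):
        # E-step: posterior class membership for each patient (log-space).
        post = []
        for x in X:
            logp = []
            for k in range(n_classes):
                lp = math.log(weights[k])
                for j in range(d):
                    p = probs[k][j]
                    lp += math.log(p if x[j] else 1.0 - p)
                logp.append(lp)
            m = max(logp)
            unnorm = [math.exp(v - m) for v in logp]
            total = sum(unnorm)
            post.append([u / total for u in unnorm])
        # M-step: update class weights and per-class condition probabilities.
        # eps is an assumed smoothing term that keeps estimates off 0 and 1.
        for k in range(n_classes):
            nk = sum(r[k] for r in post)
            weights[k] = (nk + eps) / (n + n_classes * eps)
            for j in range(d):
                hits = sum(post[i][k] * X[i][j] for i in range(n))
                probs[k][j] = (hits + eps) / (nk + 2 * eps)
    return weights, probs, post


def reinfection_rate_by_profile(assignments, n_diagnoses):
    """Share of patients with more than one COVID-19 diagnosis per profile."""
    totals, multi = {}, {}
    for k, nd in zip(assignments, n_diagnoses):
        totals[k] = totals.get(k, 0) + 1
        multi[k] = multi.get(k, 0) + (1 if nd > 1 else 0)
    return {k: multi[k] / totals[k] for k in totals}
```

In such a sketch, each patient would typically be assigned to the profile with the highest posterior probability before the single-versus-multiple comparison. In practice the number of profiles would be chosen with fit criteria rather than fixed in advance.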
First Page
101649
DOI
10.1016/j.imu.2025.101649
Volume
55
Publication Date
5-2025
Recommended Citation
Morgan, G. B., Stamatis, A., Yager, C. C., & Boolani, A. (2025). Informatics-driven unsupervised learning of comorbidity clusters for COVID-19 reinfection risk: A finite mixture modeling approach. Informatics in Medicine Unlocked, 55, 101649. https://doi.org/10.1016/j.imu.2025.101649