A machine learning approach identifies 5-ASA and ulcerative colitis as being linked with higher COVID-19 mortality in patients with IBD

Sci Rep. 2021 Aug 13;11(1):16522. doi: 10.1038/s41598-021-95919-2.

Satyaki Roy 1, Shehzad Z Sheikh 2, Terrence S Furey 3


Author information

  • 1 Department of Genetics, University of North Carolina, Chapel Hill, USA.
  • 2 Departments of Medicine and Genetics, Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, USA. shehzad_sheikh@med.unc.edu.
  • 3 Departments of Genetics and Biology, Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, USA. tsfurey@email.unc.edu.


Inflammatory bowel diseases (IBD), namely Crohn's disease (CD) and ulcerative colitis (UC), are chronic inflammatory conditions of the gastrointestinal tract. The conditions themselves and their treatments, such as immunosuppressants, may increase the risk of viral and bacterial infection and lead to more severe infection outcomes. How clinical and demographic factors affect the prognosis of COVID-19 among IBD patients is still a significant area of investigation. The lack of available data on a large set of COVID-19 infected IBD patients has hindered progress. To circumvent this lack of large patient data, we present a random sampling approach to generate clinical COVID-19 outcomes (outpatient management, hospitalized and recovered, and hospitalized and deceased) for 20,000 IBD patients modeled on reported summary statistics obtained from the Surveillance Epidemiology of Coronavirus Under Research Exclusion (SECURE-IBD), an international database to monitor and report on outcomes of COVID-19 occurring in IBD patients. We apply machine learning approaches to perform a comprehensive analysis of the primary and secondary covariates to predict COVID-19 outcome in IBD patients. Our analysis reveals that age, medication usage and the number of comorbidities are the primary covariates, while IBD severity, smoking history, gender and IBD subtype (CD or UC) are key secondary features. In particular, elderly male patients with ulcerative colitis and several preexisting conditions who smoke comprise a highly vulnerable IBD population. Moreover, treatment with 5-ASAs (sulfasalazine/mesalamine) shows a high association with COVID-19/IBD mortality. Supervised machine learning that considers age, number of comorbidities and medication usage can predict COVID-19/IBD outcomes with approximately 70% accuracy. We explore the challenge of drawing demographic inferences from existing COVID-19/IBD data.
Overall, fewer IBD case reports come from US states with poor health rankings, which hinders these analyses. Generating patient characteristics from known summary statistics increases the power to detect IBD factors leading to variable COVID-19 outcomes. COVID-19 in IBD patients is under-reported from US states with poor health rankings, underscoring the perils of using the repository to derive demographic information.
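
The random sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the proportions below are invented placeholders rather than SECURE-IBD figures, the names `SUMMARY`, `OUTCOME_BY_AGE`, `draw` and `generate_patients` are hypothetical, and the sketch assumes each covariate is sampled independently with the outcome conditioned on age group alone.

```python
import random
from collections import Counter

# Hypothetical marginal proportions, invented for illustration only.
# They are NOT the SECURE-IBD summary statistics.
SUMMARY = {
    "age_group": {"<40": 0.5, "40-59": 0.3, "60+": 0.2},
    "ibd_subtype": {"CD": 0.55, "UC": 0.45},
    "on_5asa": {True: 0.3, False: 0.7},
}

# Hypothetical outcome proportions conditional on age group (also invented).
OUTCOME_BY_AGE = {
    "<40": {"outpatient": 0.90, "hospitalized_recovered": 0.09,
            "hospitalized_deceased": 0.01},
    "40-59": {"outpatient": 0.80, "hospitalized_recovered": 0.17,
              "hospitalized_deceased": 0.03},
    "60+": {"outpatient": 0.60, "hospitalized_recovered": 0.30,
            "hospitalized_deceased": 0.10},
}


def draw(dist, rng):
    """Draw one category from a {category: probability} mapping."""
    categories = list(dist)
    weights = list(dist.values())
    return rng.choices(categories, weights=weights, k=1)[0]


def generate_patients(n, seed=0):
    """Generate n synthetic patient records from the summary proportions."""
    rng = random.Random(seed)  # seeded so the sample is reproducible
    patients = []
    for _ in range(n):
        # Assumption: covariates are sampled independently of each other.
        patient = {name: draw(dist, rng) for name, dist in SUMMARY.items()}
        # Assumption: the outcome depends on age group only.
        patient["outcome"] = draw(OUTCOME_BY_AGE[patient["age_group"]], rng)
        patients.append(patient)
    return patients


patients = generate_patients(20000)
outcome_counts = Counter(p["outcome"] for p in patients)
print(outcome_counts)
```

Records generated this way could then be passed to a supervised classifier. A realistic generator would also need to respect the joint structure between covariates, which summary tables only partly expose.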
