Biomedical Statistics and Informatics

| Peer-Reviewed |

Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning

Received: Oct. 10, 2019    Accepted: Nov. 18, 2019    Published: Nov. 22, 2019

Abstract

Intensive Care Unit (ICU) readmission is associated with longer hospitalization, higher mortality, and other adverse outcomes. Early recognition of patients at risk of ICU readmission can help prevent deterioration and lower treatment costs. With the growing abundance of Electronic Health Records (EHRs), it has become common to build clinical decision tools by applying machine learning techniques to large-scale healthcare data. To this end, we designed data-driven predictive models to estimate the risk of ICU readmission. The discharge summary of each hospital admission was represented using natural language processing algorithms, and the Unified Medical Language System (UMLS) was used to standardize inconsistent terminology across discharge summaries. Five machine learning classifiers, including naïve Bayes, support vector machine, logistic regression, and gradient boosting decision tree, were combined with two feature representations, Bag-of-Words and Bag-of-CUIs (UMLS Concept Unique Identifiers), to construct predictive configurations. The best configuration yielded a competitive AUC of 0.748. High-contribution words and medical terms were further investigated to confirm that they were clinically meaningful. A comparative study of the two feature representations is also discussed. Our work suggests that natural language processing can automatically extract meaningful information from discharge summaries and warn clinicians of unplanned 30-day readmission upon discharge.
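
The pipeline outlined in the abstract (text features, a classifier, AUC evaluation) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the toy sentences and labels are invented, `CONCEPT_TABLE` is a hypothetical stand-in for a UMLS concept mapper with placeholder identifiers rather than real CUIs, and the add-one-smoothed naïve Bayes scorer is a generic textbook version, not the authors' implementation.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Count lower-cased, whitespace-separated tokens."""
    return Counter(text.lower().split())


# Hypothetical phrase -> concept table. The IDs are placeholders, not real UMLS CUIs.
CONCEPT_TABLE = {
    "renal failure": "CUI_RENAL_FAILURE",
    "kidney failure": "CUI_RENAL_FAILURE",
    "shortness of breath": "CUI_DYSPNEA",
    "dyspnea": "CUI_DYSPNEA",
}


def bag_of_cuis(text):
    """Count concept IDs for every table phrase that occurs in the text."""
    lowered = text.lower()
    counts = Counter()
    for phrase, cui in CONCEPT_TABLE.items():
        n = lowered.count(phrase)
        if n:
            counts[cui] += n
    return counts


def train_naive_bayes(docs, labels):
    """Fit multinomial naive Bayes with add-one smoothing.

    Assumes both classes (0 and 1) occur in `labels`.
    Returns a function mapping a feature Counter to log-odds of class 1.
    """
    vocab = {tok for doc in docs for tok in doc}
    totals = {0: Counter(), 1: Counter()}
    for doc, y in zip(docs, labels):
        totals[y].update(doc)
    n_pos = sum(labels)
    log_prior_ratio = math.log(n_pos / (len(labels) - n_pos))
    denom = {y: sum(totals[y].values()) + len(vocab) for y in (0, 1)}

    def score(doc):
        s = log_prior_ratio
        for tok, count in doc.items():
            if tok in vocab:  # tokens unseen in training are ignored
                p1 = (totals[1][tok] + 1) / denom[1]
                p0 = (totals[0][tok] + 1) / denom[0]
                s += count * (math.log(p1) - math.log(p0))
        return s

    return score


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = readmitted within 30 days, 0 = not readmitted.
TRAIN = [
    ("sepsis renal failure dialysis", 1),
    ("respiratory failure sepsis", 1),
    ("elective surgery stable home", 0),
    ("stable recovery home", 0),
]
TEST = [
    ("renal failure sepsis", 1),
    ("home recovery", 0),
    ("respiratory failure dialysis", 1),
    ("elective surgery stable", 0),
]

score = train_naive_bayes([bag_of_words(t) for t, _ in TRAIN], [y for _, y in TRAIN])
test_scores = [score(bag_of_words(t)) for t, _ in TEST]
test_auc = auc(test_scores, [y for _, y in TEST])
```

The Bag-of-CUIs variant differs only in the feature function: synonymous phrases such as "kidney failure" and "renal failure" collapse onto one concept feature, which is the kind of standardization the abstract attributes to UMLS.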

DOI 10.11648/j.bsi.20190403.11
Published in Biomedical Statistics and Informatics (Volume 4, Issue 3, September 2019)
Page(s) 22-26
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

ICU Readmission, Machine Learning, Natural Language Processing, Unified Medical Language System (UMLS)

Cite This Article
  • APA Style

    Zhiheng Li, Xinyue Xing, Bingzhang Lu, Ying Zhao, Zhixiang Li. (2019). Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning. Biomedical Statistics and Informatics, 4(3), 22-26. https://doi.org/10.11648/j.bsi.20190403.11

    ACS Style

    Zhiheng Li; Xinyue Xing; Bingzhang Lu; Ying Zhao; Zhixiang Li. Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning. Biomed. Stat. Inform. 2019, 4(3), 22-26. doi: 10.11648/j.bsi.20190403.11

    AMA Style

    Zhiheng Li, Xinyue Xing, Bingzhang Lu, Ying Zhao, Zhixiang Li. Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning. Biomed Stat Inform. 2019;4(3):22-26. doi: 10.11648/j.bsi.20190403.11

  • @article{10.11648/j.bsi.20190403.11,
      author = {Zhiheng Li and Xinyue Xing and Bingzhang Lu and Ying Zhao and Zhixiang Li},
      title = {Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning},
      journal = {Biomedical Statistics and Informatics},
      volume = {4},
      number = {3},
      pages = {22-26},
      doi = {10.11648/j.bsi.20190403.11},
      url = {https://doi.org/10.11648/j.bsi.20190403.11},
      eprint = {https://download.sciencepg.com/pdf/10.11648.j.bsi.20190403.11},
     year = {2019}
    }
    

  • TY  - JOUR
    T1  - Early Prediction of 30-Day ICU Re-admissions Using Natural Language Processing and Machine Learning
    AU  - Zhiheng Li
    AU  - Xinyue Xing
    AU  - Bingzhang Lu
    AU  - Ying Zhao
    AU  - Zhixiang Li
    Y1  - 2019/11/22
    PY  - 2019
    N1  - https://doi.org/10.11648/j.bsi.20190403.11
    DO  - 10.11648/j.bsi.20190403.11
    T2  - Biomedical Statistics and Informatics
    JF  - Biomedical Statistics and Informatics
    JO  - Biomedical Statistics and Informatics
    SP  - 22
    EP  - 26
    PB  - Science Publishing Group
    SN  - 2578-8728
    UR  - https://doi.org/10.11648/j.bsi.20190403.11
    VL  - 4
    IS  - 3
    ER  - 

Author Information
  • High School Division, Northeast Yucai Foreign Language School, Shenyang, China

  • High School Division, Northeast Yucai Foreign Language School, Shenyang, China

  • Junior High Division, Northeast Yucai School, Shenyang, China

  • Department of Engineering Science and Applied Math, Northwestern University, Evanston, USA

  • Department of Biomedical Engineering, Shenyang Pharmaceutical University, Shenyang, China
