UMMS Affiliation

Department of Medicine, Division of Cardiovascular Medicine

Publication Date

2021-07-02

Document Type

Article

Disciplines

Artificial Intelligence and Robotics | Data Science | Health Information Technology

Abstract

BACKGROUND: Accurate detection of bleeding events from electronic health records (EHRs) is crucial for identifying and characterizing a range of common and serious medical problems. To extract such information from EHRs, it is essential to identify the relations between bleeding events and related clinical entities (eg, bleeding anatomic sites and lab tests). With the advent of natural language processing (NLP) and deep learning (DL)-based techniques, many studies have examined their applicability to clinical tasks. However, no prior work has used DL to extract relations between bleeding events and relevant entities.

OBJECTIVE: In this study, we aimed to evaluate multiple DL systems on a novel EHR data set for bleeding event-related relation classification.

METHODS: Experts first annotated a new data set of 1046 deidentified EHR notes for bleeding events and their attributes. On this data set, we evaluated three state-of-the-art DL architectures for the bleeding event relation classification task, namely, convolutional neural network (CNN), attention-guided graph convolutional network (AGGCN), and Bidirectional Encoder Representations from Transformers (BERT). We used three BERT-based models, namely, BERT pretrained on biomedical data (BioBERT), BioBERT pretrained on clinical text (Bio+Clinical BERT), and BioBERT pretrained on EHR notes (EhrBERT).

RESULTS: Our experiments showed that the BERT-based models significantly outperformed the CNN and AGGCN models. Specifically, BioBERT achieved a macro F1 score of 0.842, outperforming the AGGCN model (macro F1 score, 0.828) by 1.4 percentage points (P < .001) and the CNN model (macro F1 score, 0.763) by 7.9 percentage points (P < .001).
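
For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so rare relation types count as much as frequent ones. The sketch below is a minimal illustration of that definition only; the function name and the relation labels in the usage note are hypothetical and are not taken from the study's label set or evaluation code.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Return the unweighted mean of per-class F1 scores.

    gold and pred are equal-length sequences of class labels.
    The labels used with this function are illustrative assumptions.
    """
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # missed the true class g

    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

As a usage example with made-up labels, `macro_f1(["site", "site", "lab", "none"], ["site", "lab", "lab", "none"])` averages the three per-class scores 2/3, 2/3, and 1.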

CONCLUSIONS: In this comprehensive study, we explored and compared different DL systems to classify relations between bleeding events and other medical concepts. On our corpus, BERT-based models outperformed other DL models for identifying the relations of bleeding-related entities. In addition to pretrained contextualized word representation, BERT-based models benefited from the use of target entity representation over traditional sequence representation.
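
The abstract does not say how the target entity representation is built. The sketch below shows one common approach, assumed here purely for illustration: average the contextual token vectors inside each entity span and concatenate the two averages, in contrast to a sequence representation that uses only the first ([CLS]-style) token vector. All function names, span conventions, and vectors are hypothetical; this is not the authors' implementation.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sequence_representation(token_vectors):
    """Sequence-level representation: the first token's vector only.

    Assumes position 0 holds a [CLS]-style summary token.
    """
    return token_vectors[0]


def entity_representation(token_vectors, span_a, span_b):
    """Entity-level representation: mean-pool each span, then concatenate.

    span_a and span_b are (start, end) token indices, end exclusive.
    Mean pooling is an assumption; other pooling choices are possible.
    """
    pooled_a = mean_vector(token_vectors[span_a[0]:span_a[1]])
    pooled_b = mean_vector(token_vectors[span_b[0]:span_b[1]])
    return pooled_a + pooled_b  # list concatenation
```

In this sketch the entity representation has twice the dimensionality of a single token vector, and a classifier layer would take it as input in place of the first-token vector.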

Keywords

BERT, CNN, GCN, bleeding, electronic health records, relation classification

Rights and Permissions

Copyright ©Avijit Mitra, Bhanu Pratap Singh Rawat, David D McManus, Hong Yu. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 02.07.2021. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.

DOI of Published Version

10.2196/27527

Source

Mitra A, Rawat BPS, McManus DD, Yu H. Relation Classification for Bleeding Events From Electronic Health Records Using Deep Learning Systems: An Empirical Study. JMIR Med Inform. 2021 Jul 2;9(7):e27527. doi: 10.2196/27527. PMID: 34255697; PMCID: PMC8285744.

Journal/Book/Conference Title

JMIR Medical Informatics

PubMed ID

34255697

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
