Abstract
Natural-language text is a valuable resource containing domain-specific information relevant to many applications. One example is electronic health records (eHRs), which offer comprehensive insights into patients' health histories and enable knowledge extraction for clinical diagnosis and treatment. In this paper, we study multi-label text classification (MLTC) of eHR data and introduce two novel MLTC methods based on a threshold-learned convolutional neural network (CNN). We conduct comprehensive comparisons with other multi-label models and with binary relevance (BR). Importantly, we optimize the architecture not only of the multi-label classifiers but also of the baseline BR model. Our findings indicate that the adaptive-threshold CNN (AT-CNN) and the implicit-threshold CNN (IT-CNN) provide a favorable approximation of a binary CNN (B-CNN) with the added benefit of improved runtime efficiency. Runtime efficiency is crucial when the number of classes grows, because the runtime of classifiers based on one-vs-rest mappings becomes increasingly prohibitive in such configurations.
| Original language | English |
|---|---|
| Pages (from-to) | 93402-93419 |
| Number of pages | 18 |
| Journal | IEEE Access |
| Volume | 11 |
| Publication status | Published - 2023 |
| Externally published | Yes |
Keywords
- Data science
- Deep learning
- Multi-label classification
- Natural language processing
Fingerprint
Dive into the research topics of 'Threshold-Learned CNN for Multi-Label Text Classification of Electronic Health Records'. Together they form a unique fingerprint.