
Threshold-Learned CNN for Multi-Label Text Classification of Electronic Health Records

  • Tampere University

Research output: Contribution to journal › Article › peer-review

Abstract

Natural-language text is a valuable resource containing domain-specific information that is useful across many applications. One example is electronic health records (eHR), which offer comprehensive insights into patients' health histories and enable knowledge extraction for clinical diagnosis and treatment. In this paper, we study multi-label text classification (MLTC) of eHR data by introducing two novel MLTC methods based on a threshold-learned convolutional neural network (CNN). We conduct comprehensive comparisons with other multi-label models and with binary relevance (BR). Importantly, we optimize not only the architecture of the multi-label classifiers but also that of the baseline BR model. Our findings indicate that the adaptive-threshold CNN (AT-CNN) and the implicit-threshold CNN (IT-CNN) provide a favorable approximation of a binary CNN (B-CNN) with the added benefit of improved runtime efficiency. This efficiency is crucial as the number of classes grows, because the runtime of classifiers based on one-vs-rest mappings becomes increasingly prohibitive in such configurations.
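
The general idea of per-label thresholding in multi-label classification can be illustrated with a small sketch. This is not the AT-CNN or IT-CNN method, whose details the abstract does not give. It assumes that a single multi-label model has already produced one score per label, and it picks each label's threshold by a simple grid search that maximizes F1 on validation data. The function names (`fit_thresholds`, `predict`), the grid, and the toy scores are all hypothetical.

```python
from typing import List, Optional, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for one label; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def fit_thresholds(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    grid: Optional[Sequence[float]] = None,
) -> List[float]:
    """Choose one threshold per label by grid search on validation F1.

    Assumption: this grid search stands in for whatever threshold-learning
    procedure a real model would use.
    """
    if grid is None:
        grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    thresholds = []
    for j in range(len(scores[0])):
        col_scores = [row[j] for row in scores]
        col_labels = [row[j] for row in labels]
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            preds = [1 if s >= t else 0 for s in col_scores]
            f1 = f1_score(col_labels, preds)
            if f1 > best_f1:  # keep the first threshold that reaches the best F1
                best_t, best_f1 = t, f1
        thresholds.append(best_t)
    return thresholds


def predict(
    scores: Sequence[Sequence[float]], thresholds: Sequence[float]
) -> List[List[int]]:
    """Apply label-specific thresholds to a matrix of per-label scores."""
    return [[1 if s >= t else 0 for s, t in zip(row, thresholds)] for row in scores]


# Toy validation data (hypothetical): the second label is systematically scored low.
val_scores = [[0.9, 0.3], [0.8, 0.25], [0.2, 0.1], [0.1, 0.05]]
val_labels = [[1, 1], [1, 1], [0, 0], [0, 0]]

learned = fit_thresholds(val_scores, val_labels)
fixed_preds = predict(val_scores, [0.5, 0.5])
learned_preds = predict(val_scores, learned)
```

In this toy setup a fixed cut-off of 0.5 misses every positive of the second label, while the label-specific thresholds recover them. Because the thresholds are applied to the outputs of one shared model, prediction needs a single forward pass regardless of the number of labels, whereas a one-vs-rest scheme such as BR evaluates one binary classifier per label.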

Original language: English
Pages (from-to): 93402-93419
Number of pages: 18
Journal: IEEE Access
Volume: 11
DOIs
Publication status: Published - 2023
Externally published: Yes

Keywords

  • Data science
  • Deep learning
  • Multi-label classification
  • Natural language processing
