
Comparing biological information contained in mRNA and non-coding RNAs for classification of lung cancer patients

  • Johannes Smolander
  • Alexey Stupnikov
  • Galina Glazko
  • Matthias Dehmer
  • Frank Emmert-Streib*
  • *Corresponding author for this work
  • Tampere University
  • University of Turku
  • Johns Hopkins University
  • University of Arkansas for Medical Sciences
  • Upper Austria University of Applied Sciences
  • Private University for Health Sciences, Medical Informatics and Technology
  • Nankai University
  • Institute of Biosciences and Medical Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Deciphering the meaning of human DNA is an outstanding goal that would revolutionize medicine and the way we treat diseases. In recent years, non-coding RNAs have attracted much attention and have been shown to be functional in part. Yet the importance of these RNAs, especially for higher biological functions, remains under investigation. Methods: In this paper, we analyze RNA-seq data, including non-coding and protein-coding RNAs, from patients with lung adenocarcinoma, a histologic subtype of non-small-cell lung cancer, using deep learning neural networks and other state-of-the-art classification methods. The purpose of our paper is threefold. First, we compare the classification performance of different versions of deep belief networks with that of SVMs, decision trees and random forests. Second, we compare the classification capabilities of protein-coding and non-coding RNAs. Third, we study the influence of feature selection on classification performance. Results: First, we find that deep belief networks perform at least competitively with other state-of-the-art classifiers. Second, data from non-coding RNAs perform better than data from coding RNAs across a number of different classification methods. This demonstrates that the predictive information captured by non-coding RNAs is equivalent to that of protein-coding RNAs, which are conventionally used in computational diagnostics tasks. Third, we find that feature selection in general has a negative effect on classification performance, meaning that unfiltered data with all features give the best classification results. Conclusions: Our study is the first to use ncRNAs beyond miRNAs for the computational classification of cancer and to perform a direct comparison of the classification capabilities of protein-coding and non-coding RNAs.

Original language: English
Article number: 1176
Journal: BMC Cancer
Volume: 19
Issue number: 1
Publication status: Published - 3 Dec 2019
Externally published: Yes

Keywords

  • Classification
  • Deep belief network
  • Deep learning
  • Lung cancer
  • Machine learning
  • Non-coding RNA
