
Efficient and interpretable DNA/RNA representation using Komlós–Hadamard transforms

  • Texas A&M University
  • University of Illinois at Urbana-Champaign
  • Weill Cornell Medicine-Qatar

Research output: Contribution to journal › Article › peer-review

Abstract

This study introduces a novel encoding scheme for DNA/RNA sequences that integrates Komlós and Hadamard transforms. Unlike traditional One-Hot encoding, this approach offers a more informative representation of omics data while significantly reducing computational complexity. Note, however, that the Komlós transform component provides fewer features and does not use sparse codes. By leveraging the inherent properties of these transforms, our method effectively captures complex patterns within the data, leading to improved model accuracy and reduced training times. When combined with an image transformation, the encoding scheme is particularly efficient, achieving superior performance across various predictive tasks with significantly lower computational resource demands than One-Hot encoding. Our findings suggest that this encoding scheme, particularly when integrated with Hilbert curve mapping or sequence-to-image analysis, holds significant promise for advancing DNA/RNA data analysis by offering a more efficient and effective approach to feature representation.
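The abstract does not spell out how the encoding is constructed, so the following is only a rough sketch of the general idea of replacing One-Hot vectors with Hadamard-based codes. It assumes, purely for illustration, that each nucleotide is assigned one row of the 4 x 4 Sylvester Hadamard matrix; the assignment order (A, C, G, T to rows 0 to 3) and the function names are hypothetical and are not taken from the paper. The Komlós component is not sketched here because the abstract gives no details about it.

```python
# Illustrative sketch only: NOT the paper's method. The base-to-row
# assignment and all names below are assumptions made for this example.
NUCLEOTIDES = "ACGT"


def sylvester_hadamard(n):
    """Return the n x n Sylvester Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Sylvester doubling step: H_2m = [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def one_hot(seq):
    """Baseline One-Hot encoding: one 0/1 indicator vector per base."""
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


def hadamard_encode(seq):
    """Map each base to a row of H_4 (hypothetical assignment A, C, G, T -> rows 0..3).

    Raises ValueError for symbols outside ACGT (for example N or U).
    """
    h4 = sylvester_hadamard(4)
    return [list(h4[NUCLEOTIDES.index(base)]) for base in seq.upper()]


if __name__ == "__main__":
    for base, oh, hd in zip("ACGT", one_hot("ACGT"), hadamard_encode("ACGT")):
        print(base, oh, hd)
```

Under this assumption, the Hadamard code of a base equals its One-Hot vector multiplied by the Hadamard matrix, so the two encodings carry the same information per base; the codes for distinct bases remain mutually orthogonal but are dense (entries of +1 and -1) rather than sparse.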

Original language: English
Article number: 108
Journal: BMC Bioinformatics
Volume: 27
Issue number: 1
Early online date: Apr 2026
Publication status: E-pub ahead of print - Apr 2026

Keywords

  • Classification
  • DNA
  • Enhancers
  • FFT
  • Komlós conjecture
  • Machine learning
  • Omics
  • One-Hot encoding
