Are Large Language Models the New Interface for Data Pipelines?

  • Sylvio Barbon*
  • Paolo Ceravolo
  • Sven Groppe
  • Mustafa Jarrar
  • Samira Maghool
  • Florence Sèdes
  • Soror Sahri
  • Maurice Van Keulen

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

13 Citations (Scopus)

Abstract

A Language Model is a term that encompasses various types of models designed to understand and generate human communication. Large Language Models (LLMs) have gained significant attention due to their ability to process text with human-like fluency and coherence, making them valuable for a wide range of data-related tasks fashioned as pipelines. The capabilities of LLMs in natural language understanding and generation, combined with their scalability, versatility, and state-of-the-art performance, enable innovative applications across various AI-related fields, including eXplainable Artificial Intelligence (XAI), Automated Machine Learning (AutoML), and Knowledge Graphs (KG). Furthermore, we believe these models can extract valuable insights and make data-driven decisions at scale, a practice commonly referred to as Big Data Analytics (BDA). In this position paper, we provide some discussions in the direction of unlocking synergies among these technologies, which can lead to more powerful and intelligent AI solutions, driving improvements in data pipelines across a wide range of applications and domains integrating humans, computers, and knowledge.

Original language: English
Title of host publication: Proceedings of the International Workshop on Big Data in Emergent Distributed Environments, BIDEDE 2024, in conjunction with the 2024 ACM SIGMOD/PODS Conference
Editors: Philippe Cudre-Mauroux, Andrea Ko, Robert Wrembel
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400706790
DOIs
Publication status: Published - 9 Jun 2024
Externally published: Yes
Event: 2024 International Workshop on Big Data in Emergent Distributed Environments, BIDEDE 2024, in conjunction with the 2024 ACM SIGMOD/PODS Conference - Santiago, Chile
Duration: 9 Jun 2024 to 9 Jun 2024

Publication series

Name: Proceedings of the International Workshop on Big Data in Emergent Distributed Environments, BIDEDE 2024, in conjunction with the 2024 ACM SIGMOD/PODS Conference

Conference

Conference: 2024 International Workshop on Big Data in Emergent Distributed Environments, BIDEDE 2024, in conjunction with the 2024 ACM SIGMOD/PODS Conference
Country/Territory: Chile
City: Santiago
Period: 9/06/24 to 9/06/24

Keywords

  • Automated Machine Learning
  • Big Data Analytics
  • Human-Computer Interaction
  • Knowledge Graphs
  • Natural Language Understanding
  • eXplainable Artificial Intelligence
