
FormaT5: Abstention and Examples for Conditional Table Formatting with Natural Language

  • Mukul Singh
  • José Cambronero
  • Sumit Gulwani
  • Vu Le
  • Carina Negreanu
  • Elnaz Nouri
  • Mohammad Raza*
  • Gust Verbruggen
  • *Corresponding author for this work
  • Microsoft USA
  • Microsoft New Haven

Research output: Contribution to journal › Article › peer-review

Abstract

Formatting is an important property in tables for visualization, presentation, and analysis. Spreadsheet software allows users to automatically format their tables by writing data-dependent conditional formatting (CF) rules. Writing such rules is often challenging for users as it requires understanding and implementing the underlying logic. We present FormaT5, a transformer-based model that can generate a CF rule given the target table and a natural language description of the desired formatting logic. We find that user descriptions for these tasks are often under-specified or ambiguous, making it harder for code generation systems to accurately learn the desired rule in a single step. To tackle this problem of under-specification and minimise argument errors, FormaT5 learns to predict placeholders through an abstention objective. These placeholders can then be filled by a second model or, when examples of rows that should be formatted are available, by a programming-by-example system. To evaluate FormaT5 on diverse and real scenarios, we create an extensive benchmark of 1053 CF tasks, containing real-world descriptions collected from four different sources. We release our benchmarks to encourage research in this area. Abstention and filling allow FormaT5 to outperform 8 different neural approaches on our benchmarks, both with and without examples. Our results illustrate the value of building domain-specific learning systems.
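
As a rough illustration of the abstain-then-fill idea described in the abstract, the toy sketch below takes a rule that contains a placeholder and fills it from examples of rows that should be formatted. It is not the paper's system: the rule syntax `GreaterThan(...)`, the placeholder token `<?>`, and the helper names `fill_threshold` and `fill_rule` are all hypothetical, and the simple threshold search merely stands in for a real programming-by-example component.

```python
from typing import Optional, Sequence

# Hypothetical marker for an argument the rule generator abstained on.
PLACEHOLDER = "<?>"


def fill_threshold(values: Sequence[float],
                   formatted: Sequence[bool]) -> Optional[float]:
    """Return a threshold t such that `value > t` is true exactly for the
    rows marked as formatted, or None if no such threshold exists."""
    pos = [v for v, f in zip(values, formatted) if f]
    neg = [v for v, f in zip(values, formatted) if not f]
    if not pos or not neg:
        # Without both kinds of examples the threshold is unconstrained.
        return None
    if max(neg) < min(pos):
        # Every formatted value lies strictly above every unformatted one.
        return max(neg)
    return None


def fill_rule(template: str,
              values: Sequence[float],
              formatted: Sequence[bool]) -> Optional[str]:
    """Replace the placeholder in a rule template using the examples."""
    threshold = fill_threshold(values, formatted)
    if threshold is None:
        return None
    return template.replace(PLACEHOLDER, str(threshold))


if __name__ == "__main__":
    # The description was ambiguous about the cutoff, so the rule
    # keeps a placeholder instead of guessing a concrete argument.
    template = f"GreaterThan({PLACEHOLDER})"
    column = [12.0, 45.0, 7.0, 60.0]
    should_format = [False, True, False, True]
    print(fill_rule(template, column, should_format))
```

The point of the design is that the first stage never has to guess an argument the description does not pin down; the examples resolve it afterwards, and `fill_rule` returns `None` when they are inconsistent with the rule shape.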

Original language: English
Pages (from-to): 497-510
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 17
Issue number: 3
DOIs
Publication status: Published - Nov 2023
Externally published: Yes
Event: 50th International Conference on Very Large Data Bases, VLDB 2024 - Guangzhou, China
Duration: 24 Aug 2024 to 29 Aug 2024
