Exploring the capabilities of large language models in oral and maxillofacial surgery

Sulaiman Khan*, Shahira Padinharepattel Mohamed, Md Rafiul Biswas, Zubair Shah

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Oral and Maxillofacial Surgery (OMFS) is a surgical specialty that serves as a bridge between medicine and dentistry, focusing on the diagnosis and treatment of diseases affecting the mouth, jaw, face, and neck. Large Language Models (LLMs), which first appeared in 2019, are trained on extensive text collections and can process natural language with high quality. Although OMFS is a hands-on surgical specialty, LLMs have been increasingly used for patient education, research, and training purposes. This study aimed to explore the capabilities of LLMs in the field of OMFS by investigating the most recent literature. Seven online repositories, namely PubMed, Scopus, Association for Computing Machinery (ACM), IEEE, Embase, Cumulative Index to Nursing and Allied Health Literature (CINAHL), and Google Scholar, were selected to retrieve relevant articles. Adhering to the PRISMA-ScR guidelines, we conducted a systematic search across these repositories to select articles that incorporated LLMs into OMFS. The forward and backward reference lists of the included articles were checked to retrieve missing articles. After the final screening process, a total of 20 studies were selected for this review. The selected studies demonstrated applications of LLMs in OMFS such as patient education, clinical decision support, and guidance for specific procedures. The study results showed variability in LLM response accuracy and lower accuracy in citation generation, whereas open-ended questions achieved higher accuracy rates. Advanced versions of LLMs, such as ChatGPT-4, showed improved accuracy and reliability compared with older GPT versions. However, some studies reported that LLM responses lacked complete details and exhibited only moderate accuracy. This variability in performance emphasizes the need for continuous refinement of LLMs and highlights the importance of human oversight in clinical applications. Further refinement, extensive research, and verification by experts are still needed.

Original language: English
Article number: 00202940251344491
Journal: Measurement and Control (United Kingdom)
Early online date: Jun 2025
Publication status: Published - 26 Jun 2025

Keywords

  • Bard
  • ChatGPT
  • Head and neck surgery
  • Large language model
  • LLM
  • Maxillofacial surgery
  • Oral surgery

