IMMUNOGENOMIC LANDSCAPE OF THE QATARI POPULATION

  • Zainab Jan

Student thesis: Doctoral Dissertation

Abstract

Human leukocyte antigens (HLA) play a central role in immune response and adverse drug reactions (ADRs). Although numerous studies have investigated HLA-associated genetic variation across diverse global populations, there remains a significant gap in research focusing specifically on the Qatari population. This study aims to explore the genetic variation within the HLA region of the Qatari population to better understand its role in complex traits, drug response variability, and immune-related mechanisms. We called HLA alleles from whole-genome sequencing (WGS) data, aligned to the IMGT/HLA v3.49 reference, using the N-1 algorithm developed by the QGP HLA Consortium, which aggregates outputs from HISAT-Genotype, HLA-HD, and T1K to ensure high-quality and concordant HLA allele calls. We then performed a region-wide association analysis (RWAS) using Regenie, assessing the association of HLA variants with binary and quantitative traits using logistic and linear regression models, respectively. The association analysis covered 165 phenotypic traits (51 binary and 114 quantitative) across a total of 14,387 whole-genome sequences. Overall, 25 traits were significantly (p < 5×10⁻⁸) associated with variants in the HLA region. Of these, 22 were novel, including the association of HLA-A*68:01:01 with mean cell hemoglobin, HLA-DRB5*02:02:01 with glycated hemoglobin (HbA1c), and HLA-B*39 with prothrombin. By conducting a comprehensive analysis of the entire HLA region, we identified distinct clusters of genetic correlations between different traits. Building on these findings, we performed an in-depth pharmacogenomic analysis to uncover associations between HLA variants and drug response, and identified relatively high frequencies of several important alleles with a high level of evidence in PharmGKB, such as HLA-B*58:01 (3.08%) and HLA-C*06:02 (20.63%).
Lastly, we developed a machine learning-based predictive model, VoteHLA, to identify MHC-binding peptides using peptide sequence-based features. It outperformed existing models, improving accuracy, specificity, sensitivity, Matthews correlation coefficient (MCC), and area under the ROC curve (AUC) by 6.4%, 5.83%, 8.87%, 13.12%, and 3.9%, respectively, compared with NetMHCIpan4.0. Our integrated approach offers novel insights into the immunogenomic landscape of the Qatari population and provides a machine learning method for predicting HLA-binding peptides, with potential applications in precision medicine and immunotherapy.
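For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and MCC are conventionally computed for a binary binder/non-binder classifier. It is not the VoteHLA evaluation code: the function name `binary_metrics`, the 0/1 label encoding, and the toy labels are illustrative assumptions, and AUC is omitted because it requires prediction scores rather than hard labels.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute standard binary-classification metrics from 0/1 labels.

    Hypothetical helper for illustration only; 1 is assumed to mean
    "binder" and 0 "non-binder".
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: fraction of true binders that were predicted as binders.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true non-binders predicted as non-binders.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Toy labels (made up for illustration, not data from the thesis).
example_true = [1, 1, 1, 0, 0, 0, 1, 0]
example_pred = [1, 1, 0, 0, 0, 1, 1, 0]
example_metrics = binary_metrics(example_true, example_pred)
```

MCC is often reported alongside accuracy because it uses all four confusion-matrix counts and stays informative when binders and non-binders are imbalanced.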
Date of Award: 2025
Original language: American English
Awarding Institution
  • HBKU College of Health & Life Sciences

Keywords

  • HLA Imputation
  • Human Leukocyte Antigen
  • Pharmacogenomics
  • Phenotypes
  • Precision Medicine
  • Region-wide association analysis
