Human leukocyte antigens (HLA) play a central role in immune response and adverse
drug reactions (ADRs). Although numerous studies have investigated HLA-associated
genetic variation across diverse global populations, there remains a significant gap in
research focusing specifically on the Qatari population. This study explores the
genetic variation within the HLA region of the Qatari population to better understand
its role in complex traits, drug response variability, and immune-related mechanisms.
We called HLA alleles from whole-genome sequencing (WGS) data, aligned to the
IMGT/HLA v3.49 reference, using the N-1 algorithm developed by the QGP HLA
Consortium, which aggregates outputs from HISAT-Genotype, HLA-HD, and T1K to
ensure high-quality and concordant HLA allele calls. Moreover, we
performed a region-wide association analysis (RWAS) using Regenie, where the
association of HLA variants with binary and quantitative traits was assessed using
logistic and linear regression models, respectively. The association analysis covered
165 phenotypic traits (51 binary and 114 quantitative) across a total of 14,387
whole-genome sequences. Overall, 25 traits were significantly (p < 5×10⁻⁸)
associated with variants in the HLA region. Of these, 22
were novel, including the associations of HLA-A*68:01:01 with mean cell hemoglobin,
HLA-DRB5*02:02:01 with glycated hemoglobin (HbA1c), and HLA-B*39 with
prothrombin. By conducting a comprehensive analysis of the entire HLA region, we
identified distinct clusters of genetic correlations between different traits. Building on
these findings, we performed an in-depth pharmacogenomic analysis to uncover
associations between HLA variants and drug response, and found relatively high
frequencies of several important alleles with a high level of evidence in PharmGKB,
such as HLA-B*58:01 (3.08%) and HLA-C*06:02 (20.63%). Lastly, we developed a machine
learning-based predictive model, VoteHLA, to identify MHC-binding peptides using
peptide sequence-based features. Compared to NetMHCIpan4.0, it improved
prediction by 6.4%, 5.83%, 8.87%, 13.12%, and 3.9% in accuracy, specificity,
sensitivity, Matthews correlation coefficient (MCC), and area under the ROC curve
(AUC), respectively, outperforming the existing models. Our integrated approach
offers novel insights into the immunogenomic landscape of the Qatari population and
provides a machine learning method for predicting HLA-binding peptides, with potential
applications in precision medicine and immunotherapy.
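
The threshold-based metrics reported above for VoteHLA (accuracy, specificity, sensitivity, and MCC) can all be derived from a binary confusion matrix of binder versus non-binder predictions. The following is a minimal illustrative sketch of those standard definitions, not the evaluation code used in the thesis; the function name `binary_metrics` and the example counts are hypothetical. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
from math import sqrt


def _safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: true binders predicted as binders / non-binders.
    tn/fp: true non-binders predicted as non-binders / binders.
    """
    accuracy = _safe_div(tp + tn, tp + tn + fp + fn)
    sensitivity = _safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = _safe_div(tn, tn + fp)   # true negative rate
    # Matthews correlation coefficient: ranges from -1 to +1, 0 is chance level.
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not results from the thesis).
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it is less sensitive to class imbalance than accuracy, which is one reason it is commonly reported alongside sensitivity and specificity for peptide-binding predictors.
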
| Date of Award | 2025 |
|---|---|
| Original language | American English |
| Awarding Institution | HBKU College of Health & Life Sciences |
- HLA Imputation
- Human Leukocyte Antigen
- Pharmacogenomics
- Phenotypes
- Precision Medicine
- Region-wide association analysis
IMMUNOGENOMIC LANDSCAPE OF THE QATARI POPULATION
Jan, Z. (Author). 2025
Student thesis: Doctoral Dissertation