Personal profile
Biography
Hamdy Mubarak received a BSc in Computer Science from the Faculty of Engineering, Alexandria University, Egypt, in 1992 (Distinction with Degree of Honor). He was the Arabic NLP R&D manager at Sakhr Software until 2013, working on morphological, syntactic, and semantic disambiguation of Arabic and building commercial NLP applications for WHO (Canada), governments, and major banks in the MENA region. In the 1990s, he led the team that built the first morphological analyzer, diacritizer, spelling and grammar checker, and information extraction system for Arabic documents. He also participated in rule-based MT, OCR, TTS, and search engine projects, all of which won multiple regional and international awards (NIST USA and Canada, Gitex UAE, Kuwait, etc.).
He joined QCRI in 2014, where he is currently a Principal Software Engineer. He has participated in building state-of-the-art tools for processing the standard, classical, and dialectal varieties of Arabic (farasa.qcri.org), as well as the QATS Speech Transcription and Translation (qats.qcri.org and st.qcri.org), IYAS Question Answering, and Fake News Detection (tanbih.org) projects. He also leads the efforts in detecting offensive language, hate speech, spam, and adult content, among others (asad.qcri.org). He participated in building the Arabic Large Language Model (Fanar.qa), which achieved the best results for language capabilities and cultural alignment on standard benchmarks. He also participated in building the speech recognition system for Modern Standard Arabic (MSA) and its dialects (Kanari.ai), and in building text-to-speech systems for MSA and dialects.
Since joining QCRI, he has co-authored about 140 papers in computational linguistics conferences (ACL, NAACL, EMNLP, SemEval, CoNLL, EACL, LREC, etc.), speech conferences (IEEE SLT, ASRU, IWSLT), journals (IPM, NLE), and social computing venues (ICWSM, SocInfo, WebSci). He has co-published several books and holds a patent on using machine translation technologies to build a state-of-the-art Arabic diacritizer.
He also worked as a Software Engineering Manager at Ellipsis Digital Systems, a US-based company in the field of digital communications, between 2001 and 2003.
Education
Hamdy Mubarak received a BSc in Computer Science from the Faculty of Engineering, Alexandria University, Egypt, in 1992 (Distinction with Degree of Honor).
Collaborations and top research areas from the last five years
- AraSafe: Benchmarking Safety in Arabic LLMs
  Mubarak, H., Mohamed, A. & Hawasly, M., 2025, EMNLP 2025 - 2025 Conference on Empirical Methods in Natural Language Processing, Findings of EMNLP 2025. Christodoulopoulos, C., Chakraborty, T., Rose, C. & Peng, V. (eds.). Association for Computational Linguistics (ACL), p. 9976-9992, 17 p.
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
- Who should set the Standards? Analysing Censored Arabic Content on Facebook during the Palestine-Israel Conflict
  Magdy, W., Mubarak, H. & Salminen, J., 26 Apr 2025, CHI 2025 - Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery, 178. (Conference on Human Factors in Computing Systems - Proceedings).
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
  Open Access
- Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic
  El Kheir, Y., Mubarak, H., Ali, A. & Chowdhury, S. A., 5 Aug 2024, Long Papers. Ku, L.-W., Martins, A. F. T. & Srikumar, V. (eds.). Association for Computational Linguistics (ACL), p. 13172-13184, 13 p. (Proceedings of the Annual Meeting of the Association for Computational Linguistics; vol. 1).
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
  Open Access. Citations (Scopus): 2
- Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification
  Fadeeva, E., Rubashevskii, A., Shelmanov, A., Petrakov, S., Li, H., Mubarak, H., Tsymbalov, E., Kuzmin, G., Panchenko, A., Baldwin, T., Nakov, P. & Panov, M., 6 Jun 2024, The 62nd Annual Meeting of the Association for Computational Linguistics: Findings of the Association for Computational Linguistics, ACL 2024. Ku, L.-W., Martins, A. & Srikumar, V. (eds.). Association for Computational Linguistics (ACL), p. 9367-9385, 19 p. (Proceedings of the Annual Meeting of the Association for Computational Linguistics).
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
  Open Access. Citations (Scopus): 20
- Halwasa: Quantify and Analyze Hallucinations in Large Language Models: Arabic as a Case Study
  Mubarak, H., Al-Khalifa, H. & Alkhalefah, K., May 2024, 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings. Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S. & Xue, N. (eds.). European Language Resources Association (ELRA), p. 8008-8015, 8 p.
  Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
  Citations (Scopus): 3