PhD in Computer Vision · MBZUAI

Sahal Shaji Mullappilly

Advancing multilingual multimodal medical foundation models for accessible, clinically grounded, and generalizable healthcare AI.

Open-source impact

Citations: 1,000+
GitHub stars: 2,000+
Model & dataset downloads: ~1M
US patents granted / pending: 1 + 2

Research Publications

EMNLP 2025 Findings · Meta Llama Impact Award · LinkedIn feature by Yann LeCun

BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities

A bilingual Arabic–English medical multimodal model that unifies text and image understanding for diverse clinical tasks — multi-turn conversation, VQA, report generation and summarisation. Trained on 1.6M bilingual healthcare samples, it sets state of the art on both medical language and multimodal benchmarks.

Sahal Shaji Mullappilly^*, Mohammed Irfan Kurpath^*, Sara Pieri, Saeed Yahya Alseiari, Shanavas Cholakkal, Khaled Aldahmani, Fahad Khan, Rao Anwer, Salman Khan, Timothy Baldwin, Hisham Cholakkal

* equal contribution

Paper · Meta blog · LinkedIn feature

CVPR 2024 · Hugging Face featured

GLaMM: Pixel Grounding Large Multimodal Model

A multimodal model that generates natural-language responses grounded with pixel-level segmentation masks, enabling far more precise visual grounding than standard chat models. Introduces a large grounding dataset and evaluation for grounded conversation, with strong results on segmentation, captioning and conversational grounding.

Hanoona Rasheed^*, Muhammad Maaz^*, Sahal Shaji Mullappilly, Abdelrahman Shaker, Salman Khan, Hisham Cholakkal, Rao M Anwer, Eric Xing, Ming-Hsuan Yang, Fahad S Khan

* equal contribution

arXiv · HF paper · GitHub

arXiv 2026

MediX-R1: Open-Ended Medical Reinforcement Learning

An open-ended reinforcement learning framework for medical multimodal models, moving beyond brittle multiple-choice training toward clinically grounded free-form answers. Composite reward design and an LLM-as-judge evaluation framework improve open-ended medical reasoning across both text and image-text benchmarks using only ~51K instruction samples.

Sahal Shaji Mullappilly^*, Mohammed Irfan Kurpath^*, Omair Mohamed, Mohamed Zidan, Fahad Khan, Salman Khan, Rao Anwer, Hisham Cholakkal

* equal contribution

arXiv · Project · GitHub · Coverage

Research Highlights

01
First / co-author publications at leading AI venues including EMNLP, ACL, AAAI, CVPR, and NeurIPS, with contributions across multilingual medical LLMs, multimodal medical AI, open-ended reasoning, visual grounding, and general-purpose multimodal systems.
02
Meta Llama Impact Innovation Award for BiMediX2, an Arabic-English medical multimodal model covering diverse clinical image modalities. This work was accepted to EMNLP 2025 Findings and was featured on LinkedIn by Yann LeCun.
03
Recipient of the MBZUAI CV Department Emerging Impact Award 2026.
04
Co-author on MAviS, recognised with the EMNLP 2025 Senior Area Chair Highlight Award.
05
Research supported by the NVIDIA Academic Grant and the MBZUAI–IIT Delhi Research Grant, with related work showcased at GITEX 2024, Machines Can See 2025, and the AI4SD side event of the 79th UN General Assembly.
06
Strong academic and open-source impact, with 1,000+ citations, 2,000+ GitHub stars, and nearly 1M downloads of publicly released models and datasets.
07
Recipient of a granted US patent for open-world semi-supervised object detection, with two additional US patent filings pending on bilingual medical LLMs and pixel-grounded multimodal models.
08
JAIS Climate, showcased at COP28, extended earlier work on Arabic Mini-ClimateGPT, published at EMNLP 2023 Findings.

Patents

US Patents

One granted US patent and two pending filings spanning open-world detection, bilingual medical LLMs, and pixel-grounded multimodal models.

Professional Experience

Where I've worked

AI Architect
Aspire Zone · Intaleq
Jun 2025
Machine Learning Intern
Microsoft, Dubai Internet City
May 2019
Machine Learning Engineer
General Department of Artificial Intelligence, Dubai Police HQ
Sep 2022

Education

Aug 2023 — Present
PhD in Computer Vision
Mohamed bin Zayed University of Artificial Intelligence
UAE Golden Visa
Sep 2022
MSc in Computer Vision
Mohamed bin Zayed University of Artificial Intelligence
GPA 3.84 / 4.0 · UAE Golden Visa
May 2021
BTech in Computer Science Engineering
National Institute of Technology, Puducherry
GPA 9.04 / 10 · First Class with Distinction
Mar 2017
Senior Secondary (Grade 12)
Indian High School, Dubai
94.3%

Projects

Things I've built.

Side projects from undergrad and beyond — shipped products, hardware experiments, and the occasional drone.

DriveSafe

Real-time driving threat detection across mobile and watch.

Cross-platform Flutter app — Android, iOS and watchOS.
Continuously monitors road conditions and surfaces threats to the driver.
Quantised SSD MobileNet v2 for object detection.
Metal GPU delegate + Neural Processing cores — ~30 ms average inference.

DocApp

Zero-cost clinic digitalisation system on smartphones.

Cross-platform Flutter app with Firestore backend.
Appointment booking and live token streaming to client devices.
Patient database with an analytics dashboard for the clinic.

Interdisciplinary

Arduino-based plant hydration system
Electric bike (college Techfest)
Mangoo (food delivery app)
Package delivery drone (IIT Bombay e-Yantra)

Get in touch

Let's talk research, collaboration, or interesting ideas.

Academic email

sahal.mullappilly@mbzuai.ac.ae

Personal email

sahalshajim@gmail.com

linkedin.com/in/sahalshajim

Google Scholar

scholar.google.com

Abu Dhabi, UAE · +971 56 914 9169 · +91 8606 350 169

US Patents

System and method of open-world semi-supervised satellite object detection

Bilingual medical mixture of experts large language model

System and method of pixel grounding large multimodal model
