I build production-grade AI systems that convert unstructured clinical notes into clinically accurate, analytics-ready structured data using LLMs, with reproducible pipelines and scalable team workflows.
RPI • IBM • Norstella
Automated the selection of relevant clinical-note sections, along with review, research, prompting, and post-processing, using agents and skills, removing heterogeneity across deliverables while ensuring quality at scale.
Built a configurable, multi-threaded/multi-process ingestion pipeline across Redshift and OpenSearch, enabling teams to define cohorts dynamically and execute high-volume cohort selection reliably.
Combined structured data with unstructured clinical notes to engineer features, deploying ensembles of 3 to 5 ML models to pool insights and deliver high-, medium-, and low-confidence predictions.
Authored contributions spanning real-world healthcare data analysis, synthetic data generation, fairness evaluation, and production-grade software applications.
My Ph.D. thesis, "Synthetic Data Generation and Evaluation for Fairness", was completed at Rensselaer Polytechnic Institute under the guidance of my advisor and mentor, Dr. Kristin P. Bennett.
I have also collaborated with experts across academia and industry, including Dr. Isabelle Guyon (ChaLearn, Google), Dr. John S. Erickson (RPI), Dr. Ioana Baldini (IBM), Dr. Dennis Wei (IBM), Dr. Jiaming Zeng (formerly IBM, now AKASA), Dr. Yooyoung Park (formerly IBM, now Moderna), and Thilanka Munasinghe (RPI).