Senior Data Scientist

Bridging Healthcare Data with Actionable Insights

16+ years leading predictive modeling, causal inference, and NLP initiatives at CDC. Deploying end-to-end ML pipelines in Azure & Databricks.

HeeKyoung Chun

About Me

Senior Data Scientist with a Doctor of Science degree and 16+ years transforming complex healthcare data into actionable insights at CDC.

Bridging technical analysis and executive decision-making by translating statistical results into recommendations leadership can act on.

ML Pipelines, Causal Inference, Healthcare NLP, Communication

Technical Skills

Languages

Python, PySpark, SQL, R, SAS

NLP & AI/ML

Hugging Face, Spark NLP, PyTorch, scikit-learn

Cloud

Azure ML, Databricks, Data Lake, Snowflake

MLOps

CI/CD, Versioning, Monitoring, MLflow

Analytics

Causal Inference, Predictive Modeling, Experiment Design

Certifications

Azure AI-900, Azure DP-900, Azure AZ-900

Featured Projects

COVID-19 response, NLP analytics, predictive modeling

🦠

COVID-19 Surveillance

Predictive models and NLP for CDC surveillance data

📊

NVDRS NLP Pipeline

Entity extraction using Spark NLP

📈

Medicaid Impact

Propensity score models for policy analysis

☁️

Azure Data Pipeline

CI/CD for reproducible ML workflows

Experience

2022–2025

Statistician / Data Scientist

CDC (DNI Contractor)

Lead predictive and causal models. Advanced NLP analyses.

2019–2022

Data Analyst

CDC (Goldbelt, TJFACT)

NVDRS and COVID-19 surveillance analytics.

2008–2017

Economist / Health Scientist

CDC, NIOSH

Cost-effectiveness modeling for prevention programs.

2014–2020

Adjunct Faculty

Georgia Southern University

Biostatistics and epidemiological modeling research.

Publications

Chun H, et al. (2025) "Evaluation of the Reliability and Validity of the Perceptions of Skills Enhanced Through School Health Education (PSE-SHE) Measure" – Journal of School Health

doi.org/10.1111/josh.70038

Chun H, et al. (2020) "Maternal exposure to air pollution and risk of autism in children: A systematic review and meta-analysis" – Environmental Pollution

doi.org/10.1016/j.envpol.2019.113307