Senior Data Scientist
16+ years leading predictive modeling, causal inference, and NLP initiatives at CDC. Deploying end-to-end ML pipelines in Azure & Databricks.
Senior Data Scientist with a Doctor of Science degree and 16+ years transforming complex healthcare data into actionable insights at CDC.
Bridging technical analysis and executive decision-making, translating statistical results into recommendations leadership can act on.
COVID-19 response, NLP analytics, predictive modeling
Predictive models and NLP for CDC surveillance data
Entity extraction using Spark NLP
Propensity score models for policy analysis
CI/CD for reproducible ML workflows
2022–2025
CDC (DNI Contractor)
Lead predictive and causal models. Advanced NLP analyses.
2019–2022
CDC (Goldbelt, TJFACT)
NVDRS and COVID-19 surveillance analytics.
2008–2017
CDC, NIOSH
Cost-effectiveness modeling for prevention programs.
2014–2020
Georgia Southern University
Biostatistics and epidemiological modeling research.
"Evaluation of the Reliability and Validity of the Perceptions of Skills Enhanced Through School Health Education (PSE-SHE) Measure," Journal of School Health
doi.org/10.1111/josh.70038
"Maternal exposure to air pollution and risk of Autism in children: A systematic review and meta-analyses," Environmental Pollution
doi.org/10.1016/j.envpol.2019.113307
Send me a message or reach out via social links.
Senior Data Scientist | Healthcare | NLP | MLOps
hkchun1@yahoo.com
16+ years leading predictive and causal modeling in healthcare. Expert in Azure, Databricks, NLP, and MLOps.
Statistician / Data Scientist, CDC (2022–2025)
Data Analyst, CDC (2019–2022)
Economist / Health Scientist, CDC (2008–2017)
Sc.D., Work Environment Policy, UMass Lowell
M.A., Economics, Boston University
Python, PySpark, SQL, R, SAS, HuggingFace, Spark NLP, Azure ML, Databricks, MLflow
Azure AI-900, DP-900, AZ-900