Data Science Intern
About this role
PRN (as needed) 15-20 hours/week
Monday-Friday variable days schedule
Must be able to commit to staying in the position for at least one year
Hybrid remote, 1-2 days/week onsite
Summary:
The Data Science Intern supports the Rebecca D. Considine Research Institute by applying machine learning and natural language processing (NLP) techniques to clinical and research data. This role provides hands-on experience working with both structured and unstructured healthcare data to support predictive modeling, clinical research, and data-driven decision making. The intern works under mentorship within a multidisciplinary research environment and contributes to projects with direct relevance to patient care and clinical research outcomes.
Responsibilities:
1. Develop and validate predictive models using structured clinical data such as EHR-derived variables, diagnostic codes, laboratory values, and demographic information.
2. Apply appropriate machine learning methodologies and evaluate model performance using clinically meaningful metrics.
3. Use natural language processing techniques to extract structured information from unstructured clinical text, including clinical notes and operative reports.
4. Collaborate with clinical investigators and research teams to ensure analytical approaches are clinically grounded and aligned with research goals.
5. Document data preparation, modeling workflows, code, and results using reproducible and transparent practices.
6. Present methods and findings to multidisciplinary teams and contribute to research discussions, abstracts, and manuscripts as appropriate.
Other information:
Technical Expertise
1. Proficiency in R or Python for data analysis and machine learning applications.
2. Working knowledge of machine learning fundamentals, including model selection, validation, and performance evaluation.
3. Experience working with tabular datasets for supervised learning tasks.
4. Familiarity with version control and reproducible research workflows.
5. Experience with NLP techniques, clinical language models, or healthcare data sources such as EHR systems and clinical coding standards preferred.
Education and Experience
1. Education: Currently enrolled in a graduate degree program (MS or PhD) in Data Science, Computer Science, Biomedical Informatics, Statistics, or a closely related field. Must have completed at least one year of higher education coursework prior to the internship. Must be returning to school for at least one additional semester following the internship.
2. Licensure: None
3. Certification: None
4. Years of relevant experience: Prior experience working with healthcare data, machine learning, or research datasets is preferred.
5. Years of supervisory experience: None
On Call
FTE: 0.001000
Status: Fixed Hybrid