Prolaio
Chicago, IL

Data Science (Biomarkers) Intern

Hybrid | $50/hr | Posted 6 days ago | LinkedIn

About this role

What Will You Do?

The Overview

As a Data Science Intern on the Biomarkers Team, you will develop and validate advanced machine learning pipelines focused on long-term time series analysis. Your primary objective is to leverage foundational time series models and transfer learning (for example, adapting models trained for ECG-based cardiac arrhythmia detection) to extract novel physiological insights from wearable and clinical sensor data. This role is essential for advancing our understanding of patient health by translating raw signal data into validated digital biomarkers.

The Specifics

  • Model Development & Transfer Learning: Test and fine-tune existing deep learning architectures for time series data, specifically using transfer learning techniques to adapt pre-trained ECG-based cardiac models to new physiological signal tasks.
  • Signal Processing Pipeline: Build and optimize Python-based signal processing workflows to handle noisy, real-world sensor data, including filtering, feature extraction, and artifact removal.
  • Foundational Model Implementation: Research and implement emerging foundational time series models to evaluate their zero-shot or few-shot performance on proprietary longitudinal cardiac datasets.
  • Validation & Benchmarking: Design rigorous validation frameworks that compare model outputs against clinician-verified "ground truth" and report metrics such as Mean Absolute Error (MAE) and the Intraclass Correlation Coefficient (ICC).
  • Codebase Delivery: Maintain a clean, documented, and reproducible code repository that transforms raw high-frequency signals into structured, analysis-ready biomarker datasets.
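
For candidates unfamiliar with the two validation metrics named above, the following is a minimal sketch of how they might be computed. The posting does not say which ICC form the team uses; this example assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement). The function names and the sample values are hypothetical and for illustration only.

```python
from statistics import fmean


def mean_absolute_error(pred, truth):
    """Average absolute difference between model outputs and reference values."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    return fmean(abs(p - t) for p, t in zip(pred, truth))


def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater.

    Assumed form: two-way random effects, absolute agreement, single measurement.
    Requires at least two subjects and two raters.
    """
    n = len(ratings)       # number of subjects (e.g. recordings)
    k = len(ratings[0])    # number of raters (e.g. model and clinician)
    grand = fmean(v for row in ratings for v in row)
    row_means = [fmean(row) for row in ratings]
    col_means = [fmean(col) for col in zip(*ratings)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    # Mean squares for subjects, raters, and residual error
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical per-recording values: model estimate vs. clinician reference
    model_values = [62.0, 75.5, 88.0, 70.0, 95.5]
    clinician_values = [60.0, 77.0, 85.0, 71.0, 97.0]

    print("MAE:", mean_absolute_error(model_values, clinician_values))
    print("ICC(2,1):", icc_2_1([list(p) for p in zip(model_values, clinician_values)]))
```

MAE reports the typical size of the error in the signal's own units, while ICC reflects how consistently the model and the clinician reference agree across subjects, so the two are usually read together.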

Who Are You?

  • Academic Background: Currently enrolled in a Master’s or PhD program in Data Science, Electrical Engineering, Biomedical Engineering, Computer Science, or a related quantitative field.
  • Technical Proficiency: Strong proficiency in Python and deep learning frameworks (e.g., PyTorch or TensorFlow) with a specific focus on time series or signal data.
  • Signal Processing Expertise: Familiarity with digital signal processing (DSP) techniques, such as Fourier transforms, wavelet analysis, and windowing methods for physiological data.
  • Machine Learning Knowledge: Solid understanding of modern ML architectures (CNNs, RNNs, Transformers) and experience with Transfer Learning or fine-tuning large-scale models.
  • Data Handling: Experience working with real-world sensor data, handling irregular sampling rates, and managing large-scale longitudinal datasets.