
Research Scientist Intern (TikTok-Privacy Innovation Lab-Multimodal Generative Model) - 2026 Start (PhD)
At TikTok, we treat privacy as our top priority in product design and implementation. Privacy is not just about regulatory compliance; it is also a more trusted way to enable technology innovation by respecting users' privacy choices.
About the Team
The Privacy Innovation (PI) Lab was established to explore the next frontier of privacy technology and theory in the digital world. We provide key insights and technical solutions for privacy-related innovation across all of TikTok's products, and we collaborate with technical and academic communities worldwide to build an open ecosystem that promotes a privacy-friendly digital experience.
We are looking for talented individuals to join us for an internship in 2026. PhD internships at our company give students the opportunity to contribute actively to our products, our research, and the organization's future plans and emerging technologies. Our dynamic internship experience blends hands-on learning, enriching community-building and development events, and collaboration with industry experts.
Applications will be reviewed on a rolling basis. We encourage you to apply early. Please state your availability clearly in your resume (Start date, End date).
About the Role
We are building next-generation generative foundation models, with a strong focus on diffusion-based and unified generation-understanding architectures, deployed in privacy-sensitive production environments.
This role sits at the intersection of:
- Large-scale model training systems
- GPU-first architecture and kernel-level optimization
- Diffusion / DiT / unified multimodal foundation models
- Privacy-preserving and compliant training pipelines
You will work on end-to-end training architecture design, from model-parallel execution and GPU efficiency to robust, fault-tolerant, privacy-aware training infrastructure.
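As a rough illustration of what model-parallel, memory-efficient training can look like, the sketch below wraps a DiT-style model with PyTorch FSDP and bf16 mixed precision. This is a minimal sketch under stated assumptions, not the team's actual training stack; `build_dit_model` is a hypothetical model constructor introduced only for this example.

```python
# Illustrative sketch only: sharded training setup with PyTorch FSDP.
# `build_dit_model` is a hypothetical constructor for a large DiT-style model.
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import MixedPrecision

def setup_sharded_model(build_dit_model):
    # One process per GPU; launchers such as torchrun set the rank/world-size env vars.
    dist.init_process_group("nccl")
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
    model = build_dit_model().cuda()
    # Shard parameters, gradients, and optimizer state across ranks;
    # compute in bf16 while reducing gradients in fp32 for numerical stability.
    return FSDP(
        model,
        mixed_precision=MixedPrecision(
            param_dtype=torch.bfloat16,
            reduce_dtype=torch.float32,
        ),
    )
```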
Responsibilities:
You will participate in the architecture design and deep optimization of next-generation text-to-image and text-to-video models, including but not limited to:
- Develop a deep understanding of and optimize DiT + Flow Matching / Rectified Flow–based generative models (a minimal training-step sketch follows this list)
- Lead or contribute to the design and implementation of:
  - Diffusion Transformer (DiT / MM-DiT) architecture improvements
  - Unified text-to-image / text-to-video model designs
  - Latent space, tokenization, and conditioning mechanisms
- Perform joint algorithmic and system-level optimization, targeting:
  - Training stability and convergence speed
  - Memory and compute efficiency
  - Generation quality and consistency
- Address challenges in long-sequence, high-resolution, and video generation, including:
  - Efficient attention and temporal modeling strategies
  - Long-context and long-latent modeling
- Collaborate closely with systems and kernel engineers to map model designs to efficient implementations
- Reproduce, analyze, and advance state-of-the-art generative models (beyond simple replication)
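To make the flow-matching objective concrete, here is a minimal sketch of a rectified-flow training step: sample Gaussian noise and a uniform time, interpolate linearly between the noise and the data latents, and regress the model's predicted velocity onto the constant velocity of that straight path. The `model(xt, t, cond)` signature is a hypothetical DiT-style velocity predictor assumed for this example, not code from the team.

```python
# Minimal sketch of one rectified-flow (flow-matching) training step.
# `model` is a hypothetical DiT-style velocity predictor: model(x_t, t, cond) -> v.
import torch
import torch.nn.functional as F

def rectified_flow_loss(model, x1, cond):
    """Regress the predicted velocity onto the straight-line path from noise x0 to data x1."""
    x0 = torch.randn_like(x1)                       # noise endpoint
    t = torch.rand(x1.shape[0], device=x1.device)   # uniform time in [0, 1]
    t_ = t.view(-1, *([1] * (x1.dim() - 1)))        # broadcast t over latent dims
    xt = (1.0 - t_) * x0 + t_ * x1                  # linear interpolation between noise and data
    v_target = x1 - x0                              # constant velocity along the straight path
    v_pred = model(xt, t, cond)                     # model predicts the velocity field
    return F.mse_loss(v_pred, v_target)
```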