NIO
San Jose, US

LLM Algorithmic Optimization Engineer - Intern

Onsite | $38 - $46/hr | Posted Mar 21, 2026

Job details

Location
San Jose, US
Work type
Onsite
Compensation
$38 - $46/hr
Posted
Mar 21, 2026
Apply on
nio.wd3.myworkdayjobs.com

About this role

Roles and Responsibilities:

  • Conduct research and apply cutting-edge technologies to optimize Large Language Models (LLMs) and multimodal models. Explore and implement core algorithmic optimizations for highly efficient LLM inference and deployment across distributed and heterogeneous hardware environments.
  • Focus on model optimization from a systems perspective, ensuring efficient deployment in the vehicle’s digital cockpit and advanced driving (AD) domain.
  • Collaborate with cross-functional teams to ensure the integration of optimized models into real-world automotive applications.
  • Contribute to the entire pipeline, from research, development, and testing through to deployment on hardware, including GPUs and other distributed systems.

Qualifications:

  • Currently pursuing or have completed a PhD or Master’s degree in Computer Science, Computer Engineering, Applied Mathematics, Communications, Electronics, or a related field, with relevant research projects and publications.
  • Strong understanding of GPU/NPU architecture and optimization techniques to identify and address bottlenecks.
  • Proficiency in LLM and VLM architectures and algorithms, and familiarity with transformer-based NLP, audio, and CV algorithms and technologies.
  • Proficiency in Python and experience with AI-related training and inference tools such as PyTorch.
  • Proficiency in C/C++ programming and familiarity with at least one commonly used LLM inference engine.
  • Hands-on experience with model deployment tooling such as Open Neural Network Exchange (ONNX).
  • Familiarity with debugging code in distributed computing environments.
  • Experience in LLM inference optimization on resource-constrained edge devices is a plus.

Preferred Qualifications:

  • PhD in computer science, artificial intelligence, or a related field, or a Master’s degree plus 3 years of relevant industry experience.
  • Experience with inference optimization techniques for deep learning models or libraries on hardware architectures.
  • Familiarity with microkernel architecture, the Linux kernel, hypervisors, middleware, and application frameworks.
  • A strong publication record, including high-impact, innovative papers, is preferred.
