Arcline
San Francisco

Agentic AI R&D Intern — Summer 2026

Hybrid · $45 – $60/hr · Visa sponsorship · Posted 6 days ago · LinkedIn

About this role

Arcline builds AI-powered data tools that help K-12 school districts turn fragmented student data into clear, actionable decisions. We work with superintendents and district leaders across Alabama, California, Kentucky, Texas, Wisconsin, and more — replacing months of manual reporting with instant, AI-driven answers.

We're a small, AI-native team that ships fast and builds with real users. Our interns ship real features to real classrooms.

When an educator asks "which Title I schools have declining math scores and rising chronic absenteeism?" — that question can't be answered with a single database call. It requires decomposing the question into sub-tasks, routing each to the right data source, reasoning about how the pieces fit together, and synthesizing a cited answer. That's an agent orchestration problem, and it's the core of what you'll work on.
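To make the orchestration pattern concrete, here is a minimal sketch of the decompose → route → synthesize loop described above. Every name in it (the sub-task planner, the stubbed data sources, the school names) is invented for illustration; a production system would use an LLM planner and real district databases rather than keyword matching and hard-coded stubs.

```python
from dataclasses import dataclass

@dataclass
class SubTask:
    question: str
    source: str  # which data source should answer this sub-question

def decompose(query: str) -> list[SubTask]:
    """Split a compound educator query into routable sub-tasks.
    A real system would use an LLM planner; this stub keys off keywords."""
    tasks = []
    if "math scores" in query:
        tasks.append(SubTask("Which schools show declining math scores?", "assessments"))
    if "absenteeism" in query:
        tasks.append(SubTask("Which schools show rising chronic absenteeism?", "attendance"))
    return tasks

# Stubbed data sources standing in for real district databases.
SOURCES = {
    "assessments": lambda q: {"schools": {"Lincoln Elementary", "Roosevelt Middle"}},
    "attendance": lambda q: {"schools": {"Roosevelt Middle", "Washington High"}},
}

def orchestrate(query: str) -> dict:
    """Run each sub-task against its source, intersect the results,
    and record which sources contributed (the 'citations')."""
    results = [(t, SOURCES[t.source](t.question)) for t in decompose(query)]
    schools = set.intersection(*(r["schools"] for _, r in results))
    return {"schools": sorted(schools),
            "citations": [t.source for t, _ in results]}

answer = orchestrate(
    "which Title I schools have declining math scores and rising chronic absenteeism?"
)
# Only the school present in both sources survives the intersection,
# which is the "synthesis" step in miniature.
```

The point of the sketch is the shape, not the stubs: one component plans sub-tasks, another routes each to the right source, and a final step combines partial results into a single cited answer.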

This is an R&D role. You'll prototype and evaluate multi-agent architectures that push beyond our current single-pass RAG pipeline — designing systems where specialized agents collaborate to handle the hardest queries our districts throw at us. You'll also explore proactive agent workflows: systems that autonomously surface funding gaps, compliance risks, and at-risk students before anyone thinks to ask.

You'll have mentorship from the founding team and room to learn — we're looking for curiosity and initiative, not a finished researcher.

Day to day

  • Prototype multi-agent orchestration systems that decompose complex educator queries into coordinated sub-tasks across multiple data sources
  • Implement tool-use and function-calling patterns that let agents query databases, generate reports, and cross-reference student records
  • Build evaluation harnesses for agent reliability — testing that multi-step agent chains produce correct, cited answers
  • Experiment with agent memory, state management, and context-passing strategies for multi-turn workflows
  • Benchmark orchestration approaches (e.g., ReAct, plan-and-execute, hierarchical agents) against real district query workloads and share findings with the team
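The tool-use and function-calling bullet above can be sketched in a few lines. This is a hypothetical illustration, not Arcline's implementation: the tool registry, the stub `query_database` function, and the fake model response are all invented. It shows only the common shape that most function-calling APIs share, where the model emits a tool name plus JSON arguments and the harness dispatches the call.

```python
import json

def query_database(table: str, metric: str) -> list[dict]:
    """Stub standing in for a real district database query."""
    return [{"school": "Roosevelt Middle", metric: 0.82}]

# Registry mapping tool names (as the model would emit them) to callables.
TOOLS = {"query_database": query_database}

def execute_tool_call(call_json: str):
    """Parse a model-emitted tool call and dispatch it through the registry.
    Most function-calling APIs return roughly this shape: a tool name plus
    a JSON object of arguments."""
    call = json.loads(call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A (fabricated) structured call, standing in for an LLM API response:
fake_call = json.dumps({
    "name": "query_database",
    "arguments": {"table": "attendance", "metric": "absentee_rate"},
})
rows = execute_tool_call(fake_call)
```

In a real agent loop, the result rows would be fed back to the model as a tool message so it can decide the next step, which is where the multi-step evaluation harnesses mentioned above come in.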

Requirements

  • Currently pursuing a B.S./B.A. or M.S./Ph.D. in Computer Science, Machine Learning, AI, or a related field
  • Strong Python skills and hands-on experience building with LLM APIs (OpenAI, Anthropic)
  • Interest in and familiarity with agentic AI patterns — tool use, planning, multi-step reasoning, or multi-agent workflows
  • A research mindset — you're comfortable running experiments, tracking metrics, and writing up what worked and what didn't
  • Able to commit to a 10-week internship from June 1 to August 10, 2026. Based in San Francisco; remote positions are also available, and relocation assistance is provided for on-site roles

Bonus qualifications

  • Hands-on experience with agent frameworks — LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, or custom architectures
  • Projects or research involving multi-agent systems, agent planning, or autonomous AI workflows
  • Familiarity with RAG systems, vector databases (pgvector, Pinecone), or retrieval-augmented agent architectures
  • Experience with structured output, function calling, or constrained generation from LLMs
  • Experience with agent evaluation and benchmarking — measuring reliability, tool-use accuracy, and reasoning quality
  • Coursework or projects in NLP, information retrieval, or reinforcement learning
  • Interest in education, public sector tech, or applying AI research to real-world product problems
  • AI-native development habits — you use LLM tools like Cursor, Claude Code, GitHub Copilot, Codex, or anything else to write, debug, and ship code faster