Frontier Data Lead - Coding

Responsibilities

UI-Environments for Computer-Use/Browser-Use agents
MCP-based Environments for general function-calling agents across various enterprise and consumer applications
The Frontier Data Lead - Coding will own the end-to-end creation of datasets, RL environments, and evals for frontier AI labs in the domain of coding agents and software engineering
This is a hands-on technical leadership role where you influence revenue directly: you will be mapped to one or more AI labs and interface directly with researchers and engineers at those labs to understand their needs and build data offerings that address them
To achieve this, you will build and manage teams of software engineers, researchers, QAs, and contractors/data-annotators from Turing's talent pool of 4M+ developers
You'll be responsible for delivering projects at frontier quality and scale, owning data quality, throughput, and timely delivery
You'll define and manage data pipelines, validation workflows, and review processes to ensure datasets meet the highest standards for realism, correctness, and diversity
You'll also develop automations, synthetic data generation systems, and internal tools to scale production efficiently
In short, you'll run your project like a startup within Turing, owning both the technical architecture and the operational execution. The goal is best-in-class datasets, environments, and evals that make the world's best coding agents and models even better at real-world coding tasks across the software development lifecycle
End-to-End Ownership: Data Quality, Process Design, and Team Building
Lead the creation of datasets, RL environments, and evals focused on Coding Agents / Software Engineering for one or more AI lab customers
Ensure that everything you ship to clients meets frontier standards for realism, correctness, diversity, and difficulty
Set up quality rubrics, automated validation scripts, and human review processes for every stage of data generation
Build and lead cross-functional teams of software engineers, researchers, QAs, and data creators drawn from Turing's 4M+ developer network
Interview, onboard, train, and mentor team members to ensure consistent output quality and technical excellence

Skills

Post-training experience on SWE tasks or experience building coding agents: We expect that you have a deep understanding of data ingredients and design principles that lead to measurable coding model improvements, either from fine-tuning models to improve SWE capabilities or building your own coding agents to improve upon SWE capabilities of the base model
Engineering Management experience: have led teams of engineers in the past, including interviewing/hiring them and setting up QA processes
Hands-on technical capability: Fluency in Python and proficiency in one or more major languages (C++, Java, Go, Rust, or JS)
Operational leadership: Proven ability to manage complex data pipelines, multi-stakeholder delivery, and concurrent high-stakes projects
Cross-functional communicator: ability to communicate clearly with researchers at frontier AI labs, subject matter experts for various domains, and diverse teams
Background in Computer Science, Machine Learning, or related technical field preferred

Benefits

💼 Amazing work culture (Super collaborative & supportive work environment; 5 days a week)
🌟 Awesome colleagues (Surround yourself with top talent from Meta, Google, LinkedIn, etc., as well as people with deep startup experience)
💰 Competitive compensation
🕒 Flexible working hours

Qualifications

Post-training experience on SWE tasks or experience building coding agents: We expect that you have a deep understanding of data ingredients and design principles that lead to measurable coding model improvements, either from fine-tuning models to improve SWE capabilities or building your own coding agents to improve upon SWE capabilities of the base model
Engineering Management experience: have led teams of engineers in the past, including interviewing/hiring them and setting up QA processes
Hands-on technical capability: Fluency in Python and proficiency in one or more major languages (C++, Java, Go, Rust, or JS)
Operational leadership: Proven ability to manage complex data pipelines, multi-stakeholder delivery, and concurrent high-stakes projects
Cross-functional communicator: ability to communicate clearly with researchers at frontier AI labs, subject matter experts for various domains, and diverse teams
Coding is the core reasoning substrate of intelligence: advancing models' ability to understand, design, and write code is effectively advancing their capacity for logic, planning, and abstract thought
Real Impact (GDP): automating software engineering unlocks one of the largest productivity frontiers in history

Company

Turing

Location

San Francisco, CA

Salary

Range: Not specified
Period: Yearly
Currency: USD

Arrangement

Onsite

Category ID

1946

Term ID

1658

Plan ID

1109

Company Website

www.turing.com

Application Type

url

Application URL

Apply Here