Staff Applied Researcher, AI Quality

GitHub | Richmond, Virginia, US | Posted 4d ago

Full-time | On-site

About the Role

Locations

In this role you can work from Remote, United States.

Overview

At GitHub, we’re building the next generation of AI‑powered developer experiences. We’re looking for a Staff Applied Researcher with deep expertise in Large Language Model (LLM) evaluation and LLM agents, strong engineering instincts, and a bias for action to help shape the future of GitHub Copilot and our AI platform. This is a high‑impact role where you will design evaluation systems that directly influence how millions of developers experience AI every day.

Responsibilities

Lead Model Quality & Evaluation

Design next‑generation evaluation frameworks for code generation, reasoning, safety, multimodal tasks, and agentic workflows.
Develop scalable automatic metrics, LLM‑judge systems, reward models, and human‑in‑the‑loop evaluation pipelines.
Establish high‑signal, repeatable methodologies that influence product decisions across GitHub AI.

Drive Applied Research & Engineering

Build and optimize evaluation tooling, datasets, benchmarking systems, and experimentation pipelines.
Create and onboard new benchmarks for the hardest tasks facing coding agents.
Collaborate closely with engineering teams to productionize research, validate improvements, and accelerate model iteration cycles.
Own end‑to‑end quality insights for the models behind GitHub Copilot and new AI features.
Work closely with product development, engineering, and design teams to integrate advanced research findings into practical applications, ensuring alignment with product goals and user needs.

Influence, Mentor & Lead

Shape GitHub’s strategy for model quality, alignment, and evaluation.
Mentor other researchers and engineers, helping elevate technical standards across the organization.
Drive clarity in ambiguous problem spaces and champion fast, high‑quality execution.

Qualifications
