Head of ML Infra
ALL SOURCEGRAPH ROLES ARE FULLY REMOTE
Who we are
Our mission at Sourcegraph is to make it so that everyone can code, not just ~0.1% of the population. Our code AI platform helps developers and companies with billions of lines of code create the software you use every day. By enabling more people to code, we believe we will create economic opportunity across the world and will drive progress that benefits everyone.
It’s an exciting time to join Sourcegraph. Our business is growing rapidly: we’ve experienced exponential growth and our $125M Series D from Andreessen Horowitz and $50M Series C from Sequoia have given us the opportunity to make big ambitious bets on our future. We have a huge market (every company that builds software) and massive opportunity (most developers haven't even heard of code AI yet, but once you've used it, you can't live without it--just like Google). By continuing to hire exceptional people, we have the opportunity to make Sourcegraph one of the biggest technology companies in the world.
🌎 Given that we are an all-remote company and hire almost anywhere in the world, we don’t have a particular time-zone preference for this role. However, you may need to be available for non-recurring urgent meetings outside of working hours.
Why this job is exciting
We are creating a machine learning team at Sourcegraph, aimed at creating the most powerful coding assistant in the world. Many companies are trying, but Sourcegraph has a unique advantage: Our rich code graph. In the world of prompting LLMs, context is key, and for creating the right context, Sourcegraph’s code data is simply the best you can get: IDE-quality, global-scale, and served lightning fast. Cody is already outperforming the pack, but we aim to take the lead in machine learning advancements on coding assistant quality. You can help us unlock Cody’s full potential, delivering a product that accelerates development in a way we only see every 10-15 years.
To head up this effort, we are looking for a seasoned and deeply technical ML-engineering leader, with a strong AI background and experience with both smaller models and the new LLM ecosystem, who can help us deliver the world’s best coding assistant and ML-powered developer tooling. And if you happen to have an entrepreneurial streak, you’re in luck: We have an enterprise distribution pipeline, so whatever you build can be deployed straight to enterprise customers with some of the largest codebases in the world, without all the go-to-market hassle you’d encounter in a startup.
Within one month, you will…
- Meet your team, which consists initially of 3 to 5 ML engineers (2 already on the team).
- Start building a trusting relationship with your direct reports and peers.
- Come up to speed on the current state of machine learning in the Cody ecosystem.
- Be set up for local development and familiar with Cody’s architecture.
- Define our short-term roadmap for ML Infrastructure on GCP.
- Ship a substantial feature, experiment, or evaluation.
Within three months, you will…
- Set up the at-scale infrastructure for running benchmarks that compare coding assistants.
- Have defined a strategy for obtaining GPUs at scale for various personas.
- Have defined a rough roadmap for how to cost-optimize our ML spend.
- Have defined our on-prem/self-hosted roadmap and recommended configurations for ML infra.
- Be up to speed and driving Sourcegraph’s ML Infra strategy.
Within six months, you will…
- Have hired a world-class team of ML engineers.
- With the help of our research team, have delivered an ML-driven quality, benchmarking, and evaluation framework for coding assistants that runs at scale.
- Have established a longer-term roadmap that keeps us aligned with expected advances in LLMs.
- Be running dozens to hundreds of experiments with prompting, embedding, fine-tuning and other techniques.
You have been working squarely in ML Infra since LLMs landed, if not longer.
- You’re deeply familiar with at least one end-to-end system for ML pipelines at scale, and you are broadly familiar with the competition in the space, what options are available, and when to use each.
- In an ideal world, you are most deeply familiar with GCP’s machine learning stack, and you have a lot of practice operationalizing PyTorch experiments on that stack. It’s also great if you have experience with Apache Spark in general.
- You should be the kind of person who lives and breathes GPUs, and you should come armed with opinions about how best to deploy and cost-optimize Cody for our various customer classes, from large enterprises to casual hackers, particularly when it comes to the Cloud-side deployments.
- In a perfect world, you would already be comfortable with options that enterprise customers might want for self-hosted ML infra, for running their own pipelines, e.g. other Cloud-hosted offerings, and/or OSS. Although we are pushing hard to have everything on GCP, the market is evolving rapidly and we could, for instance, come across customers who want to provide their own GPUs.
- Any familiarity you have with deploying enterprise SaaS is a huge bonus because it is a part of the role. However, it’s something that you can pick up if you are already familiar with Cloud options.
- Bonus if you have any background in graph theory or anything that would be relevant to our code graph, which plays a key role both in producing training data and in acting as a source of truth for verifying model outputs.
- We would love it if you are actively following developments in open-source models and training systems, and can come prepared with opinions about when and to what extent we should adopt them. More importantly, we'd like your view on how to set up infrastructure that will tell us when they are ready, by evaluating their performance on Cody tasks.
Best of all, we’d love it if you already have an opinion about Cody, have tried it, and already have a vision for how you can help us make it even better!
📊 This job is an M4. You can read more about our job leveling philosophy in our Handbook.
💸 We pay you an above-average salary because we want to hire the best people who are fully focused on helping Sourcegraph succeed, not worried about paying bills. You will have the flexibility to work and live anywhere in the world (unless specified otherwise in the job description), and we’ll never take your location or current/past salary information into account when determining your compensation. As an open and transparent company that values equitable and competitive compensation for everyone, our compensation ranges are visible to every single Sourcegraph Teammate. To determine your salary, we use a number of market and data-driven salary sources and target the high end of the range, ensuring that we’re always paying above market regardless of where you live in the world.
💰 The target compensation for this role is $243,000 USD base.
📈 In addition to our cash compensation, we offer equity (because when we succeed as a company, we want you to succeed, too) and generous perks & benefits.
Interview process [~5.5 hour total interview]
Below is the interview process you can expect for this role (you can read more about the types of interviews in our Handbook). It may look like a lot of steps, but rest assured that we move quickly and the steps are designed to help you get the information needed to determine if we’re the right fit for you… Interviewing is a two-way street, after all!
We expect the interview process to take ~5.5 hours in total.
👋 Introduction Stage - we have initial conversations to get to know you better…
- [30m] Recruiter Screen with Grace Bohl
- [45m] Technical Background with Beyang Liu
- [30m] Hiring Manager Screen with Steve Yegge
🧑‍💻 Team Interview Stage - we then delve into your experience in more depth and introduce you to members of the team…
- [60m] Resume Deep Dive with Grace Bohl
- [45m] Technical Deep Dive with Dominic Cooney
- [45m] Interview with Rok Novosel and Rafal Gajdulewicz
- [Async] Coding Exercise
🎉 Final Interview Stage - we move you to our final round, where you meet cross-functional partners and gain a better understanding of our business and values holistically…
- [30m] Values Interview
- [30m] Leadership Interview with Quinn Slack
- We check references and conduct your background check
Please note - you are welcome to request additional conversations with anyone you would like to meet, but didn’t get to meet during the interview process.
Not sure if this is you?
We want a diverse, global team, with a broad range of experience and perspectives. If this job sounds great, but you’re not sure if you qualify, apply anyway! We carefully consider every application, and will either move forward with you, find another team that might be a better fit, keep in touch for future opportunities, or thank you for your time.
Learn more about us
To create a product that serves the needs of all developers, we are building a diverse all-remote team that is distributed across the world. Sourcegraph is an equal opportunity workplace; we welcome people from all backgrounds and communities.
Learn more about what it is like to work at Sourcegraph by reading our handbook.
We want to ensure Sourcegraph is an environment that suits your working style and empowers you to do your best work, so we are eager to answer any questions that you have about us at any point in the interview process.
Sourcegraph participates in E-Verify for U.S. Employees