Machine Learning Engineer
DatologyAI
About the Company
Companies want to train their own large models on their own data. The current industry standard is to train on a random sample of that data, which is inefficient at best and actively harmful to model quality at worst. There is compelling research showing that smarter data selection can train better models faster; we know because we did much of this research. Given the high costs of training, this presents a huge market opportunity. We founded DatologyAI to translate this research into tools that enable enterprise customers to identify the right data on which to train, resulting in better models at lower cost.
Our team has pioneered deep learning data research, built startups, and created tools for enterprise ML.
We have raised over $11.5M from top-tier investors, including Amplify Partners, Radical Ventures, Conviction Capital, Jeff Dean, Yann LeCun, Geoff Hinton, and Adam D’Angelo, to help make our vision a reality.
This role is based in Redwood City, CA. We are in person 5 days per week and offer relocation assistance to new employees. We provide visa sponsorship for candidates selected for this role.
About the Role
We're looking for an experienced Machine Learning Engineer to join as a member of our core DatologyAI team. As one of our early senior hires, you will partner closely with our founders on the direction of our product and drive business-critical technical decisions.
You will contribute to developing our core product, starting with the main data curation pipeline. This pipeline is a key part of our stack: it allows us to process customer data and apply state-of-the-art research to identify the most informative data points in large-scale datasets. You will have a broad impact on our technology, our product, and our company's culture.
As a Machine Learning Engineer at DatologyAI, you will:
Architect, build, and deploy the ML systems and services that power our data curation platform
Design and implement large-scale data pipelines that curate datasets and make them ready for training cutting-edge models
Partner with researchers and engineers to bring new features and research capabilities to our customers
Ensure that our systems are reliable, secure, and worthy of our customers' trust
About You
There are a few specific things we’ll be looking for that will help you succeed in this role:
Have meaningful experience leading and building production ML systems and platforms that deliver on major product initiatives
Have a strong belief in the criticality of high-quality data and are highly motivated to work with the associated challenges
Have demonstrated experience reading, understanding, and implementing ML research papers
Are proficient in Python and in the most commonly used tools of the ML and data science ecosystem
Have experience maintaining a high quality bar for design, correctness, and testing
Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed
Own problems end-to-end and are willing to pick up whatever knowledge you're missing to get the job done
We would love it if candidates have:
Experience conducting open-ended research to improve the quality of collected data
Experience running small-scale ML experiments