Senior Software Engineer - ML
Observo.ai
Software Engineering, Data Science
India · Remote
Posted on May 23, 2025
Key Responsibilities
- Design, build, and maintain the core ML infrastructure, including training pipelines, feature stores, model registries, and model-serving systems.
- Develop tools and platforms that streamline model training, evaluation, deployment, and monitoring at scale.
- Collaborate with ML Engineers and DevOps to establish CI/CD workflows for ML models, including validation, versioning, and rollout strategies.
- Optimize performance and reliability of large-scale distributed data processing and model inference systems.
- Establish observability and tracing for ML pipelines to track data drift, model performance degradation, and training anomalies.
- Contribute to security and compliance best practices across ML workflows, including access control and auditability.
- Drive technical decisions around tooling, frameworks, and architecture for our ML platform.
- Work hands-on with LLMs, including scaling third-party hosted inference services (e.g., OpenAI, Cohere, Anthropic) and building fault-tolerant, production-grade workflows around them.
Qualifications
- 5+ years of experience in software engineering, with at least 2 years focused on ML infrastructure, MLOps, or related systems.
- Strong programming skills in Python, Go, or Java, and experience with containerization and orchestration (Docker, Kubernetes).
- Deep familiarity with ML infrastructure tools like MLflow, Kubeflow, Metaflow, TFX, SageMaker, or Vertex AI.
- Experience designing and running data pipelines using tools like Airflow, Prefect, or similar orchestration frameworks.
- Hands-on experience with cloud platforms such as AWS, GCP, or Azure, especially in the context of ML workloads.
- Solid understanding of CI/CD principles, especially as applied to machine learning workflows.
Preferred Qualifications
- Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field.
- Prior experience in observability, telemetry, or real-time data processing.
- Experience in platform engineering with a focus on ML systems.
- Experience deploying models in low-latency or high-throughput production environments.
What We Offer
- Competitive salary and benefits package
- Opportunities for career growth and advancement
- A collaborative and innovative work environment
- Competitive stock option package
- Flexibility with remote work options
How to Apply
If you're excited about building cutting-edge backend solutions and want to contribute to scaling cloud-native applications, we'd love to hear from you. Please submit your resume and a writing sample to careers@observo.ai.