Senior Site Reliability Engineer

Nayya

Software Engineering
New York, NY, USA
Posted on Jan 14, 2025

About Nayya

At Nayya, our dreams are big. At least as big as our ambition. We’re determined to connect people’s most important information, so they can thrive across their health and wealth. We harness the existing market structures, ecosystems, and economic interests to unapologetically pursue our work and bring power to the populace.

About the Role

We are looking for a passionate and driven Senior Site Reliability Engineer (SRE) to join our growing engineering team at Nayya. As a Senior SRE at a fast-paced, growth-stage startup, you will be a key player in building and scaling our platform with a focus on reliability, automation, and performance. You will contribute to both technical and strategic decisions while shaping the future of our infrastructure and operational processes. We are seeking a candidate who thrives in an environment that prioritizes impatience, excellence, resilience, and courage—someone who is excited about making an immediate impact while pushing the boundaries of what’s possible.

You will work alongside a talented team of engineers to develop innovative solutions, tackle complex reliability challenges, and maintain a strong focus on high-quality service delivery. You will have the opportunity to drive key initiatives and influence the evolution of our platform while fostering a culture of collaboration, continuous improvement, and technical excellence.

Key Responsibilities

Technical Leadership & Execution

  • Build, Design, and Scale: Architect and implement scalable, highly available systems and automation frameworks that improve reliability and reduce manual intervention.
  • Collaborate Across Teams: Partner with product, software engineering, and data teams to define and implement best practices for reliability, performance, and scalability.
  • Drive Engineering Excellence: Establish high standards for infrastructure as code, observability, and performance tuning. Advocate for best practices in system design and incident management.
  • Innovate with Agility: Adapt quickly to evolving business needs and emerging technologies, delivering incremental improvements with a focus on learning and iteration.

Mentorship & Team Development

  • Support Growth: Mentor engineers on reliability practices, tooling, and mindset. Promote a culture of ownership, learning, and continuous improvement.
  • Foster Collaboration: Lead by example in creating a collaborative, open environment where diverse perspectives are valued, and challenges are met with creativity.

Continuous Improvement & Agile Practices

  • Iterate Quickly: Promote a growth mindset by embracing iterative processes, continuously refining reliability practices and infrastructure.
  • Optimize for Speed and Stability: Balance rapid delivery with system stability and performance, ensuring reliable deployment pipelines and minimal downtime.

Skills and Qualifications

  • 5+ years of professional experience in Site Reliability Engineering, DevOps, or related roles, ideally at a fast-paced startup or growth-stage company.
  • Proven track record of building and maintaining high-performance, scalable systems.
  • Expertise in at least one modern programming language such as Python, Ruby, Go, JavaScript, or similar.
  • Extensive experience with AWS, specifically with VPC networking, Route 53, ECS, Lambda, API Gateway, and RDS (Postgres/Aurora).
  • Experience with provisioning data infrastructure (EMR, Glue, Redshift, Step Functions, Athena).
  • Knowledge of best practices for configuring CI/CD pipelines (GitHub Actions preferred).
  • Deep understanding of infrastructure as code (Terraform preferred).
  • Strong knowledge of site reliability practices, incident management, monitoring, and alerting. Familiarity with Datadog or a similar observability platform and its tooling is a plus.
  • Embodies the mindset & values of:
    • Agility: Ability to adapt to rapidly changing priorities and shifting technical landscapes.
    • Excellence: Commitment to high standards for reliability, performance, and scalability.
    • Courage: Willingness to take calculated risks and step up to complex technical challenges.

Preferred Qualifications

  • Experience in a growth-stage startup or similar high-growth company.
  • Familiarity with microservices, serverless architectures, and cloud-native technologies.
  • Experience using metrics, SLIs, and error budgets to guide reliability improvements.
  • Contributions to open-source projects or active participation in SRE communities.

The salary range for New York-based candidates for this role is $147,000 - $185,000. We use a location factor to adjust this range for candidates who are located outside the geographic region of our New York office. Placement within the salary band is determined based on experience.

#LI-HYBRID

Nayya is proud to be an Equal Employment Opportunity employer. We do not discriminate based upon race, religion, color, national origin, gender (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics.