Senior Site Reliability Engineer
Role summary
You.com is seeking a Senior Site Reliability Engineer to enhance the reliability, observability, and incident response of its production services. This role involves instrumenting services, developing SRE standards, building automation tools in Python, Bash, and Terraform, and creating actionable dashboards. The engineer will tune alerting, manage incident response, conduct postmortems, and define SLOs. A key aspect is integrating AI into SRE efforts to improve reliability and velocity. The position requires 2+ years in an SRE role, 3+ years with AWS (EKS) and CI/CD (GitHub Actions), and strong experience with Git, Python, and Bash. Experience establishing SRE practices across teams and maintaining Prometheus/Grafana monitoring is essential. The role is hybrid, based in San Francisco, with a salary range of $195,000 - $240,000 USD.
About Us
At You.com, we are building the AI Search Infrastructure that powers modern AI systems. Our goal is to create the trusted knowledge layer that agents, applications, and enterprises rely on to retrieve real-time, accurate, and citation-backed information.
Our platform combines proprietary vertical indexes with LLM-optimized retrieval systems to power AI agents, applications, and enterprise workflows. We are solving hard problems across search, large language models, and large-scale infrastructure to make AI systems more reliable, transparent, and useful.
Our team includes engineers, researchers, product builders, and operators who care about solving meaningful problems and delivering real-world impact. Whether you are improving core infrastructure, shaping product experiences, or helping bring new AI capabilities to market, your work will help define how modern AI finds and uses knowledge.
About the Role
As a Senior Site Reliability Engineer, you will own parts of the reliability, observability, and incident response posture for You.com’s production services. Your work will ensure that every user query, every API call, and every data pipeline runs with measurable, defensible uptime. When something breaks, the tools and dashboards you have built will help the team identify the issue, respond, and learn from it. You will also partner with teams to help them implement best practices, establish reliability objectives, and build reliable services with minimal friction.
Responsibilities
Qualifications
Our salary bands are structured based on a combination of geographic tiers and internal leveling. Compensation is determined by multiple factors assessed during the interview process, with the final offer reflecting these considerations.
Company Perks:
Hubs in San Francisco and New York City offering regular in-person gatherings and co-working sessions
Flexible PTO with U.S. holidays observed and a week-long shutdown in December to rest and recharge*
A competitive health insurance plan that covers 100% for the policyholder and 75% for dependents*
12 weeks of paid parental leave in the US*
401(k) program with a 3% match, vested immediately*
$500 work-from-home stipend to be used within a year of your start date*
$1,200 per year Health & Wellness Allowance to support your personal goals*
The chance to collaborate with a team at the forefront of AI research
*Certain perks and benefits are limited to full-time employees only
You.com participates in E-Verify. We will provide the Social Security Administration (SSA) and, if necessary, the Department of Homeland Security (DHS) with information from each new employee’s Form I-9 to confirm work authorization. (English/Spanish: E-Verify Participation/Right to Work) We are also an inclusive, equitable, and accessible workplace. Please let us know if you require accommodation for any portion of the recruitment and hiring process.
Beware of recruiting scams: You.com will only contact you through official @You.com email addresses and will never ask for payment or sensitive personal information during the hiring process.