Satsuma
Health Technology, Wellness, Mobile Apps, SaaS

Senior Site Reliability Engineer

United States, Onsite, Full Time, Senior, Posted 2 months ago

Role summary

Satsuma is seeking a Senior Site Reliability Engineer to manage its multi-cloud infrastructure (AWS, GCP, Azure) with a focus on reliability, scalability, and operations. The role involves building and maintaining CI/CD pipelines, observability stacks, and incident response workflows, defining SLOs/SLIs, and authoring IaC with Terraform. A key aspect is leveraging AI-assisted development for tooling and automation. The ideal candidate has 5-8 years of experience in SRE/DevOps, strong Kubernetes and observability tooling skills, and experience in high-growth SaaS environments. Familiarity with API gateways or commerce tech stacks is preferred.

About Satsuma

Satsuma is a commerce iPaaS that builds merchant-specific APIs, MCP Servers, and MCP Apps, enabling retailers to connect their full commerce stack once and deploy branded shopping experiences across every AI channel. We work with enterprise retailers and move fast. Our infra has to match.

The role

We're looking for a Senior SRE to own the reliability, scalability, and operational posture of Satsuma's multi-cloud infrastructure. You'll be the person who keeps things running, builds the systems that prevent fires, and makes on-call not terrible.

This is an infra-first role. But we're an AI-native company, and we expect you to use AI-assisted development (Claude Code) as a core part of your workflow — writing tooling, automating runbooks, building internal utilities.

What you'll do

  • Own infrastructure across AWS, GCP, and Azure environments
  • Build and maintain CI/CD pipelines, observability stacks, and incident response workflows
  • Define and enforce SLOs/SLIs; lead postmortems
  • Author and maintain IaC (Terraform preferred)
  • Write internal tooling and automation using AI-assisted development workflows
  • Partner closely with engineering on reliability reviews and architecture decisions

### Requirements

  • 5-8 years in SRE, DevOps, or infrastructure engineering
  • Hands-on experience across at least two major cloud providers
  • Strong Kubernetes, Terraform, and observability tooling skills (Datadog, Grafana, or equivalent)
  • Comfortable reading and editing code; able to ship scripts and internal tools
  • Experience with AI-assisted development (Copilot, Cursor, Claude Code)
  • On-call maturity -- you've owned incidents end-to-end and made systems better afterward
  • Prior experience at a startup or high-growth SaaS company
  • Familiarity with API gateway infrastructure or commerce tech stacks
  • Hands-on experience with MCP or agentic AI infrastructure

### Benefits

  • Unlimited PTO
  • 401(k)
  • Healthcare stipend
  • Gym stipend