Stealth Startup (Verified)
Artificial Intelligence, SaaS, Software Development

Staff Platform Engineer

United States · Onsite · Full Time · Staff · Posted 2 months ago

Key Responsibilities

  • Lead the design and operation of multi-cloud infrastructure (AWS, Azure, on-premises) and Kubernetes environments, including high availability, autoscaling, and secure networking.
  • Architect and manage distributed compute systems that support large-scale, parallel workloads with efficient job scheduling and resource management.
  • Build resilient CI/CD pipelines using GitHub Actions and drive a DevSecOps culture across teams.
  • Implement zero-trust networking and secure connectivity solutions using Tailscale.
  • Implement and maintain workspace automation (Coder Workspaces, Infrastructure as Code) to empower developers across projects.
  • Own end-to-end platform observability: performance tuning, cost optimization, alerting, and incident response using tools like Grafana and Prometheus.
  • Integrate and maintain secure authentication and identity management, leveraging OIDC, OAuth2, SSO, and RBAC.
  • Work closely with development teams to influence cloud-native architecture decisions.
  • Evaluate and adopt emerging technologies to continuously evolve platform capabilities.
  • Write, build, and push high-quality, compact, optimized container images to maximize workload performance at scale.

Tech Stack

  • Cloud Platforms: AWS (EKS, EC2, IAM, SSM, CloudWatch), Azure (AKS, AD, ARM)
  • Kubernetes: Helm, CRDs, Operators, k0s, HPA, Network Policies, Ingress Controllers
  • CI/CD & Automation: GitHub Actions, Terraform, Docker, ArgoCD
  • Networking & Security: Tailscale, WireGuard, SSO/OIDC, Vault, RBAC
  • Developer Tooling: Coder Workspaces, VSCode Remote, self-service portal frameworks
  • Distributed Systems: Argo Workflows, Apache Airflow, Dask/Spark (nice to have)
  • Observability: Prometheus, Grafana

Ideal Candidate Profile

  • 6+ years of experience building and operating cloud infrastructure in production environments.
  • Deep understanding of Kubernetes internals, custom controllers/operators, and containerized workflows.
  • Strong command of both AWS and Azure services and multi-cloud strategies.
  • Expertise in writing secure, reusable infrastructure-as-code and CI/CD pipelines.
  • Proven experience managing distributed job scheduling, autoscaling workloads, and optimizing resource allocation, priorities, and queues.
  • Strong grasp of network and authentication architectures, including VPN, mesh networking, and identity federation.

Preferred Experience

  • Experience with Tailscale, AKS, k0s, and lightweight K8s distributions.
  • Experience with Docker image building and CI/CD build-and-push automation.
  • Experience driving platform adoption and developer enablement at scale.