Senior Internal Infrastructure Engineer

San Francisco, California, United States | Remote | Full Time | $250,000–$300,000 /yr | Posted today

Location: Remote from Boston or any major California city, or Hybrid 3 days per week in Arlington, VA

Compensation: $240K–$270K base + equity (targeting $300K+ total comp).
*Flexible for exceptional candidates.*

About the Company

Our client is a profitable, fast-growing, venture-backed maritime technology company building advanced sensing and intelligence systems for commercial and defense environments.

The growing team comes from top-tier startups and technical organizations, with deep relationships across the Department of Defense and national security ecosystem.

They are building mission-critical, edge-intelligent systems that operate in real-world environments where reliability, security, and performance are non-negotiable.

Why This Role

This is not a maintenance role. This is a build-and-own-the-platform role.

You will design and operate the internal infrastructure that powers everything from cloud systems → edge deployments → AI workloads → real-time sensor pipelines.

This is a role for someone who:

- Has operated production systems at scale
- Thinks in systems, not tickets
- Cares deeply about reliability, security, and speed
- Wants to build infrastructure that actually matters

You will have real ownership over how infrastructure is designed, deployed, and scaled in a company that is rapidly growing and supporting important real-world use cases.

Role Overview

Our client is seeking a Senior Internal Infrastructure Engineer to lead the design and operation of secure, multi-environment platform infrastructure across Azure, Azure Government, and AWS.

You will work at the intersection of platform engineering, SRE, security, and distributed systems, enabling engineering teams to ship faster while maintaining strict reliability and security standards.

This role spans GitOps, infrastructure as code, Kubernetes, observability, edge networking, and support for IoT, streaming, and ML workloads.

They are looking for someone exceptional: someone who can own systems end-to-end and raise the bar across the entire platform.

What You’ll Work On

- Architecting and operating secure infrastructure across Azure, Azure Gov, and AWS
- Building GitOps pipelines and reusable infrastructure modules (OpenTofu / Terraform)
- Running and scaling Kubernetes platforms (Helm, multi-cluster environments)
- Designing observability systems (metrics, logs, traces, alerting with Grafana)
- Supporting IoT, streaming, and real-time pipelines (AWS IoT, Kinesis)
- Operating edge networking and distributed sensor deployments
- Enabling secure ML/AI workloads across cloud and edge environments
- Strengthening platform security (IAM, secrets, encryption, policy, zero trust)

How They Think About Engineering

- Build systems that produce work, not one-off fixes
- Automate everything that can be automated
- Use AI as a force multiplier across development and operations
- Create guardrails that allow engineers to move fast safely
- Build platforms that other engineers love to use

What You Bring

- 7+ years in infrastructure, platform, or SRE roles with ownership of production systems
- Deep experience in Azure, ideally in regulated or high-security environments (Azure Gov)
- Strong AWS experience, especially with IoT, Kinesis, and streaming architectures
- Expert-level Kubernetes experience (including Helm)
- Strong GitOps background with a track record of improving delivery systems
- Deep experience with infrastructure as code (Terraform / OpenTofu)
- Strong observability experience (Grafana, modern telemetry stacks)
- Experience with edge systems, distributed deployments, or remote telemetry pipelines
- Experience supporting ML/AI workloads in production environments
- Strong security depth across IAM, networking, secrets, and policy enforcement

You should be someone who sees systems clearly, identifies weaknesses quickly, and fixes them permanently.

Nice to Have

- Experience with FedRAMP, NIST, CMMC, IL4/IL5, or similar frameworks
- Experience with service meshes, policy engines (OPA/Gatekeeper), or supply chain security
- Experience with multi-cluster or hybrid cloud + edge Kubernetes environments

What Success Looks Like

- Faster delivery with fewer failures
- Secure, auditable infrastructure changes through GitOps
- Reliable operation of distributed systems across cloud and edge
- Strong, scalable foundations for IoT, streaming, and ML workloads
