Senior DevOps Engineer

Chicago, Illinois, United States | Remote | Full Time | Senior | Posted 1 month ago | Visa sponsorship available

No H1 or C2C. Must be a Permanent Resident or US Citizen.

Senior DevOps Platform Engineer

Description and Requirements

About Our Team

We are building Quantum, a next‑generation hybrid AI platform that spans Windows, Android, and cloud. As part of this vision, we are expanding the engineering organization supporting cross‑device Personal AI.

We are hiring Senior DevOps / Platform Engineers to build and operate the core automation, infrastructure, and service platforms that enable secure, reliable, and high‑velocity delivery of AI systems across device, edge, and cloud.

Depending on your experience and organizational need, you may be aligned to Platform Engineering, Observability, Operations, or Service Reliability.

Our team operates with the speed, ownership, and creativity of a startup, supported by scale, resources, and technical depth. We are building foundational systems from the ground up: intentionally, pragmatically, and with a culture of engineering excellence.

Location: Open to remote work in the US. The preferred work location is Chicago, IL.

What You Might Work On

As a Senior DevOps / Platform Engineer, you may be responsible for a subset of the following areas depending on team placement:

CI/CD, Automation & Tooling

- Designing, implementing, and improving CI/CD pipelines for AI, platform, and application teams.
- Building automation and developer tooling to improve productivity and consistency.
- Developing infrastructure‑as‑code for cloud and hybrid environments (Terraform, Bicep, etc.).

Platform & Infrastructure Engineering

- Implementing scalable, secure, and resilient infrastructure on Azure and Kubernetes.
- Building and operating hybrid systems spanning device, edge, and cloud compute.
- Enabling reliable platform services that support inference, data pipelines, and high‑performance AI workloads.

Observability & Telemetry

- Implementing and enhancing observability systems using OpenTelemetry, Grafana, Prometheus, Loki, and related tooling.
- Ensuring platform telemetry is accurate, actionable, and tied to performance and reliability outcomes.
- Building dashboards and analytics for service health and operational insight.

Deployment & Release Engineering

- Improving deployment workflows, safety, consistency, and traceability.
- Supporting progressive delivery patterns including canaries, staged rollouts, and automated rollbacks.
- Optimizing CI/CD and deployment tooling for hybrid AI services.

Collaboration & Reliability Culture

- Partnering closely with SRE, AI/ML, security, firmware, and product engineering teams.
- Contributing to system design discussions with a focus on automation, scalability, and operational best practices.
- Helping define and evolve platform engineering standards, patterns, and conventions.

Basic Qualifications

- 10+ years in DevOps, Platform Engineering, Cloud Engineering, or related fields
- Bachelor’s Degree in Computer Science, Engineering, or a related technical field
- Strong experience building and operating infrastructure in Azure, AWS, or GCP
- Proficiency with CI/CD systems, build automation, and deployment pipelines
- Experience with Infrastructure as Code (Terraform, ARM/Bicep, CloudFormation, etc.)
- Strong development or scripting skills (Python, Go, Bash, or similar)
- Hands-on experience with Docker and Kubernetes
- Understanding of observability fundamentals (metrics, logs, tracing)

Preferred Qualifications

- Deep experience with Azure cloud architecture and DevOps tooling
- Strong hands‑on work with OpenTelemetry (instrumentation, pipelines)
- Experience with Grafana, Prometheus, Loki, Tempo, or similar observability tools
- Experience supporting AI/ML workloads or GPU‑accelerated compute environments
- Familiarity with event‑driven systems and operationalizing data pipelines
- Experience contributing to or running on‑call rotations
- Passion for automation, developer experience, and infrastructure reliability at scale

What Success Looks Like

- CI/CD pipelines are fast, stable, and trusted.
- Platform infrastructure becomes more automated, observable, and scalable.
- Telemetry and dashboards provide clear visibility into system health.
- Deployments are consistent, safe, and repeatable.
- Engineering teams move faster thanks to strong platform foundations.
- The hybrid AI platform becomes increasingly reliable, efficient, and easy to operate.

Ready to apply?

You'll be redirected to SDI International's application page.
