SITE RELIABILITY ENGINEER
Job Description:
Ignite is an ISO 9001:2015 and CMMI Services Level 3 certified, Service-Disabled Veteran-Owned Small Business (SDVOSB), headquartered in Huntsville, AL. By design, Ignite is a provider of professional services to customers in educational, federal, and commercial industries and in every action seeks to be the preeminent provider within this business space. Ignite upholds our values of competency, collaboration, innovation, reliability, and results through everything we do.
Ignite is currently seeking a driven, detail-oriented Site Reliability Engineer (SRE) to ensure the reliability, performance, and operational resilience of mission-critical software systems. This role focuses on defining reliability standards from the user perspective, instrumenting systems to measure performance against those standards, and building the tooling, automation, and operational processes that make systems resilient and recoverable. The SRE will work closely with development teams to improve operational quality early in the development lifecycle, ensuring systems are designed, tested, and deployed with reliability in mind. When production issues occur, the SRE will lead incident resolution, diagnose distributed system failures, and translate operational findings into long-term reliability improvements. This position can be filled in Dayton, OH, Huntsville, AL, or St. Louis, MO, and is contingent on contract award.
Specialty Skills: (1 or more)
- Platform & Infrastructure - Kubernetes, Argo CD/GitOps, disaster recovery, capacity planning
- Observability - OTel standards, Grafana/Perses, Tempo, ClickHouse, VictoriaMetrics
- Automation & Toil Reduction - scripting, CI/CD, runbook automation, “DevOps”
- Developer Enablement - instrumentation SDKs, SRE practice onboarding
- Data & Alerting - dashboard quality, alert design, anomaly detection
Job Requirements:
- 1-3 years of experience in operations, system administration, DevOps, or software engineering
- Bachelor’s Degree in CS, Computer Engineering, or related technical field
- US citizenship, and an active Top Secret clearance or the ability to obtain one
- Systems thinking – understanding how systems fail together, blast radius, and more
- Observability Fundamentals – not just the 3 signals, but knowing why and how to use telemetry to optimize services and engineering quality of life
- Basic software engineering – building automation & non-trivial APIs, git workflows, effectively engaging in code reviews
- Linux/networking fundamentals
- Strong Communication, Collaboration, and Organizational Skills
Preferred Qualifications:
- SRE certifications from the DevOps Institute, AWS Solutions Architect, or similar
- Hands-on experience with: Python, Go, Kubernetes, Argo CD, GitLab/GitHub, Jenkins, Docker, Locust/Gatling, Prometheus, Grafana/Perses
Security Clearance Requirements:
Must have an active TS/SCI Security Clearance or the ability to obtain one.
Education Requirements:
- Bachelor’s degree in relevant discipline.
Other Requirements:
Must be a US citizen and be able to obtain and hold an active Security Clearance.