Site Reliability Engineering (SRE) Architect
Role summary
We are seeking an experienced Site Reliability Engineering (SRE) Architect for an onsite contract role in Atlanta, GA. The ideal candidate will have 15 years of overall experience, with a strong background in architectural design focused on reliability, scalability, and performance. You will apply a deep understanding of SRE principles, expertise in cloud platforms such as AWS, and proficiency in containerization (Kubernetes, Docker) and observability tools (Prometheus, Grafana, ELK). Strong programming skills in Python, Go, or Bash for automation are essential. This role requires excellent analytical, communication, and leadership skills to influence technical direction.
Site Reliability Engineering (SRE) Architect in Atlanta, GA - Onsite Contract
Required Qualifications:
15 years of overall experience required
Proven experience in an architectural role, designing solutions for reliability, scalability, and performance
Deep understanding and practical application of SRE principles (SLIs/SLOs, error budgets, toil reduction, automation, incident management, postmortems)
Expertise in cloud computing platforms (e.g., AWS) including infrastructure, networking, and security services
Strong experience with containerization and orchestration technologies (Kubernetes, Docker) and with serverless computing
Solid experience designing and implementing observability solutions (e.g., Dynatrace, Prometheus, Grafana, ELK/EFK Stack, Jaeger, OpenTelemetry)
Strong programming/scripting skills (e.g., Python, Go, Bash) for automation and tool development
Excellent analytical, problem-solving, and strategic thinking skills
Strong communication, collaboration, and leadership skills with the ability to influence technical direction across teams
Preferred Qualifications:
Experience designing and implementing chaos engineering practices and platforms
Please share the details below:
Attach your resume:
Work Authorization:
Best Hourly Rate or Employer Details: