Site Reliability Engineer

Toronto, Ontario, Canada · Onsite · Contract · Posted 2 months ago

Role summary

We are seeking a Site Reliability Engineer (SRE) with over 8 years of experience to support and scale the infrastructure for GenAI applications, including training, inference, and model serving. The role involves managing and automating cloud infrastructure and GPU clusters, defining SLOs/SLAs, implementing monitoring and incident response, and optimizing performance, scalability, and cost. Key technical skills include Kubernetes, Docker, IaC (Terraform, Helm), scripting (Python, Go, Java), monitoring tools (Prometheus, Grafana, ELK, Datadog), and a strong understanding of networking and system engineering fundamentals. Experience with AI/ML infrastructure and regulated environments is a plus.

Title: Site Reliability Engineer (SRE) – GenAI Platform

Location: Toronto, ON

Duration: Long term

We’re looking for an experienced SRE (8+ yrs) to support and scale infrastructure for GenAI applications (training, inference, model serving).

🔹 Key Skills:

• SRE / Infrastructure Ops for large-scale systems

• Kubernetes, Docker & IaC (Terraform, Helm, etc.)

• Strong scripting (Python, Go, Java)

• Monitoring tools (Prometheus, Grafana, ELK, Datadog)

• Networking + system engineering fundamentals

🔹 What You’ll Do:

• Manage and automate cloud infrastructure & GPU clusters

• Define SLOs/SLAs, monitoring, and incident response (RCA)

• Optimize performance, scalability & cost

• Drive reliability, security, and disaster recovery strategies

⭐ Nice to have: AI/ML infrastructure, regulated environments (Finance/Security)

#Hiring #SRE #Kubernetes #DevOps #GenAI #Cloud #Reliability

To apply, use Apptoza Inc.'s application page.