York Digital Consulting Inc.
Information Technology

Site Reliability Engineer

Toronto, Ontario, Canada · Onsite · Contract · Posted 2 months ago

Job Title: Site Reliability Engineer (SRE)

Role Overview:

We are seeking a skilled Site Reliability Engineer (SRE) responsible for ensuring the reliability, scalability, and performance of production systems. The ideal candidate will work closely with development and operations teams to automate processes, monitor system health, and quickly resolve production issues while continuously improving system reliability.

Key Responsibilities:

Operations & Incident Management

  • Monitor production systems and applications to ensure reliability and performance.
  • Respond to emergency incidents and perform root cause analysis.
  • Manage system changes through established change management processes.
  • Support IT infrastructure operations and ensure system stability.
  • Implement automation tools to streamline operational tasks and improve efficiency.

System Support & Collaboration

  • Work closely with development teams to support the deployment of new features.
  • Assist in stabilizing production environments and resolving escalated issues.
  • Develop and maintain SRE processes for the engineering team.
  • Provide documentation and procedures for customer support teams to help resolve technical issues.

Process Improvement

  • Conduct post-incident reviews and implement improvements to prevent recurring issues.
  • Maintain a knowledge base documenting system problems, resolutions, and best practices.
  • Continuously improve the software development lifecycle and operational processes.

Required Skills & Technologies

- Cloud Platforms: GCP and AWS
- Infrastructure as Code: Terraform
- Version Control: GitHub
- Scripting/Programming: Python
- Project & Documentation Tools: JIRA and Confluence
- Experience with automation and monitoring tools
- Strong troubleshooting and problem-solving skills

Preferred Background

- Experience as a System Administrator, DevOps Engineer, or Operations Engineer
- Strong understanding of production systems, infrastructure management, and automation
