Recutify Inc.
Human Resources Technology (HR Tech), Artificial Intelligence, Software

AI/ML Infrastructure Engineer

Ontario, Canada | Remote | Full Time | Posted 2 months ago

Role summary

We are seeking an experienced AI/ML Infrastructure Engineer to design, automate, and operate scalable cloud infrastructure on GCP and Azure. This role involves managing infrastructure as code with Terraform, building CI/CD pipelines using Azure DevOps, and provisioning/managing cloud networking. You will support data platforms and AI/ML workloads, including GPU resources, and implement security best practices. The ideal candidate has strong Linux and scripting skills, and experience with cloud networking and data platforms. Experience with Azure infrastructure, Kubernetes, and monitoring tools is preferred.

AI/ML Infrastructure Engineer

Canada (Remote)

Full-time (Permanent)

Job Description:

We are looking to hire a skilled AI/ML Infrastructure Engineer immediately.

Role Overview:

We are looking for an experienced Infrastructure Engineer to design, automate, and operate scalable cloud infrastructure supporting data platforms and AI/ML workloads across GCP and Azure. This role focuses on Infrastructure as Code, CI/CD automation, cloud networking, and enabling reliable, secure environments for data engineering and analytics teams.

Key Responsibilities:

  • Design, provision, and manage cloud infrastructure using Terraform
  • Build and maintain CI/CD pipelines using Azure DevOps
  • Provision and manage GCP infrastructure, including compute, storage, IAM, and networking
  • Support and manage Azure infrastructure (VNets, networking, compute, storage)
  • Design and implement network provisioning (VPC/VNet architecture, routing, firewalls, load balancers, private connectivity)
  • Build and operate infrastructure for data platforms (data lakes, warehouses, streaming, analytics platforms)
  • Provision and support AI/ML infrastructure, including GPU resources and AI platforms
  • Implement security best practices, IAM, encryption, and compliance controls
  • Optimize infrastructure for performance, reliability, and cost
  • Collaborate with data engineering, analytics, and ML teams
  • Document infrastructure, architecture, standards, and operational runbooks

Required Skills & Qualifications:

  • Strong experience with Terraform (Infrastructure as Code)
  • Experience with CI/CD pipelines, preferably Azure DevOps
  • Strong hands-on experience with Google Cloud Platform (GCP)
  • Solid understanding of cloud networking and network provisioning
  • Experience supporting data platforms or large-scale data workloads
  • Experience with AI/ML infrastructure
  • Strong Linux and scripting skills (Bash, Python, etc.)

Preferred / Nice to Have:

  • Hands-on experience with Azure infrastructure
  • Experience with Kubernetes (GKE / AKS)
  • Experience with data services such as BigQuery, Dataflow, Dataproc, Synapse, ADLS, Snowflake
  • Experience with monitoring and observability tools (Prometheus, Grafana, Cloud Monitoring)
  • Multi-cloud experience and relevant certifications