AI / ML Platform Engineer

Toronto, Ontario, Canada | Onsite | Contract | Posted 2 months ago

Role summary

This role focuses on building and operating the AI layer of a system, specifically managing the infrastructure for Large Language Model (LLM) workloads and inference. The engineer will be responsible for deploying LLM serving frameworks like vLLM, configuring gateways such as LiteLLM, and managing GPU node pools. Key duties include implementing and managing MLflow for model registry and versioning, optimizing inference performance, and monitoring model workloads. Experience with GPU infrastructure, CUDA, AI workload scaling, and vector retrieval systems is essential for this 12-month contract position.

Job Title: AI / ML Platform Engineer

Work Location: Canada

Contract duration: 12 months

This team operates the AI layer of the system.

They build and run the infrastructure needed for LLM workloads and inference.

Responsibilities

• deploy vLLM model serving

• configure LiteLLM gateway

• manage GPU node pools

• implement MLflow model registry

• manage model versioning

• optimize inference performance

• monitor model workloads

Skills

• LLM serving frameworks (vLLM, Triton)

• GPU infrastructure

• CUDA / inference optimization

• MLflow

• AI workload scaling

• vector retrieval systems

• model lifecycle management

Applications are handled through BURGEON IT SERVICES's application page.