AI / ML Platform Engineer
Role summary
This role builds and operates the AI layer of the system, specifically the infrastructure for large language model (LLM) workloads and inference. The engineer deploys LLM serving frameworks such as vLLM, configures gateways such as LiteLLM, and manages GPU node pools. Key duties include implementing an MLflow model registry, managing model versioning, optimizing inference performance, and monitoring model workloads. Experience with GPU infrastructure, CUDA, AI workload scaling, and vector retrieval systems is essential for this 12-month contract position.
Job Title: AI / ML Platform Engineer
Work Location: Canada
Contract duration: 12 months
This team operates the AI layer of the system.
They build and run the infrastructure needed for LLM workloads and inference.
Responsibilities
• deploy vLLM model serving
• configure LiteLLM gateway
• manage GPU node pools
• implement MLflow model registry
• manage model versioning
• optimize inference performance
• monitor model workloads
Skills
• LLM serving frameworks (vLLM, Triton)
• GPU infrastructure
• CUDA / inference optimization
• MLflow
• AI workload scaling
• vector retrieval systems
• model lifecycle management