Senior AI Platform Engineer
Title: Senior Platform Engineer
Location: Toronto, ON (Hybrid)
Contract
We are seeking a highly skilled AI Platform Engineer to join our AI Enablement team. In this role, you will be responsible for building, maintaining, and scaling our enterprise AI infrastructure, including our proprietary agent orchestration platform (NOVA), AI gateway services, and Retrieval-Augmented Generation (RAG) pipelines across multi-cloud environments.
You will work at the intersection of platform engineering, MLOps, and agentic AI, enabling teams across the organization to use current AI capabilities through robust, scalable, and secure infrastructure.
Key Responsibilities
Platform Development & Operations
• Develop, deploy, and maintain the NOVA agentic AI platform
• Manage LiteLLM as the central AI gateway
• Optimize LLM routing, cost control, load balancing, and failover
• Implement monitoring and observability using Prometheus, Grafana, OpenTelemetry
RAG Pipeline Development
• Design and optimize Retrieval-Augmented Generation (RAG) pipelines
• Maintain document ingestion, chunking, embeddings, and vector stores
• Build RAG on GCP and Azure using managed AI services and vector databases
Infrastructure & DevOps
• Deploy AI services on Kubernetes (AKS, GKE)
• Implement CI/CD with Jenkins, Opsera, GitHub Actions
• Automate infrastructure using Terraform, Helm, GitOps
• Ensure security and compliance
Agentic AI & Automation
• Develop automation tools and scripts
• Build MCP servers for tool integrations
• Enable multi-agent orchestration and autonomous workflows
• Create SDKs, APIs, and developer documentation
Required Qualifications
• 5+ years platform engineering / DevOps experience
• 2+ years AI/ML or LLM platform experience
• Strong Kubernetes, CI/CD, and cloud experience (GCP or Azure)
• Proficiency in Python and/or TypeScript
Preferred Qualifications
• Experience with LangChain, LlamaIndex, or agent frameworks
• Familiarity with LiteLLM, MCP, Backstage
• Cost optimization for LLM workloads
• Enterprise-scale AI platform experience
Technical Environment
AI Platforms: LiteLLM, LangChain, LangGraph
Cloud: GCP, Azure
Containers: Kubernetes, Docker, Helm
CI/CD: Jenkins, GitHub Actions, Opsera
Observability: Prometheus, Grafana, OpenTelemetry, Dynatrace
Languages: Python, TypeScript, Bash