ITMC Systems, Inc.
Information Technology & Services
AI DevOps Engineer
Canada · Onsite · Contract · Posted 1 month ago
The AI Platform Engineer will join the AI Enablement team, focusing on building, maintaining, and scaling enterprise AI infrastructure. This includes proprietary agent orchestration platforms (NOVA), AI gateway services, and Retrieval-Augmented Generation (RAG) pipelines across multi-cloud environments.
Key Responsibilities:
- Platform Development & Operations:
  - Develop, deploy, and maintain the NOVA agentic AI platform
  - Manage LiteLLM as the central AI gateway
  - Optimize LLM routing, cost control, load balancing, and failover
  - Implement monitoring and observability (Prometheus, Grafana, OpenTelemetry)
- RAG Pipeline Development:
  - Design and optimize RAG pipelines
  - Maintain document ingestion, chunking, embeddings, and vector stores
  - Build RAG on GCP and Azure using managed AI services and vector databases
- Infrastructure & DevOps:
  - Deploy AI services on Kubernetes (AKS, GKE)
  - Implement CI/CD with Jenkins, Opsera, GitHub Actions
  - Automate infrastructure (Terraform, Helm, GitOps)
  - Ensure security and compliance
- Agentic AI & Automation:
  - Develop automation tools and scripts
  - Build MCP servers for tool integrations
  - Enable multi-agent orchestration and autonomous workflows
  - Create SDKs, APIs, and developer documentation
Required Qualifications:
- 5+ years platform engineering/DevOps experience
- 2+ years AI/ML or LLM platform experience
- Strong Kubernetes, CI/CD, and cloud experience (GCP or Azure)
- Proficiency in Python and/or TypeScript
Preferred Qualifications:
- Experience with LangChain, LlamaIndex, or agent frameworks
- Familiarity with LiteLLM, MCP, Backstage
- Cost optimization for LLM workloads
- Enterprise-scale AI platform experience
Technical Environment:
- AI Platforms: LiteLLM, LangChain, LangGraph
- Cloud: GCP, Azure
- Containers: Kubernetes, Docker, Helm
- CI/CD: Jenkins, GitHub Actions, Opsera
- Observability: Prometheus, Grafana, OpenTelemetry, Dynatrace
- Languages: Python, TypeScript, Bash