
Rancher Platform Engineer
Role summary
We are seeking a Rancher Platform Engineer to design, deploy, and operate Rancher-managed Kubernetes clusters in on-prem and hybrid environments. This role involves architecting highly available, scalable, and secure Kubernetes platforms, implementing multi-cluster governance, and integrating with enterprise identity providers. You will manage and troubleshoot cluster components, implement observability stacks, and support application onboarding. The ideal candidate will have strong expertise in Rancher, Kubernetes internals, networking, storage, CI/CD, and GitOps workflows, with excellent troubleshooting and communication skills.
Mandatory Skills: Rancher-managed Kubernetes clusters (RKE / RKE2); Rancher UI, APIs, and automation workflows; networking; observability stacks including Prometheus, Grafana, and centralized logging (EFK/ELK); CI/CD and GitOps workflows (Helm, Jenkins, GitHub Actions, Argo CD)
Key Responsibilities
Design, deploy, and operate Rancher-managed Kubernetes clusters (RKE / RKE2) across on‑prem and hybrid environments.
Architect and maintain highly available, scalable, and secure Kubernetes platforms using Rancher best practices.
Install, configure, upgrade, patch, and decommission Kubernetes clusters using Rancher UI, APIs, and automation workflows.
Implement multi-cluster governance using Rancher Projects, Namespaces, RBAC, and global policies.
Integrate Rancher with enterprise identity providers (Active Directory / LDAP / Azure AD / SSO).
Manage and troubleshoot control plane, etcd, node, networking (CNI), and storage (CSI) issues in production clusters.
Perform root cause analysis (RCA) for cluster outages and platform incidents, and implement preventive improvements.
Implement and maintain observability stacks including Prometheus, Grafana, and centralized logging (EFK/ELK).
Support application onboarding, Helm-based deployments, and standardized platform patterns for development teams.
Collaborate closely with security, infrastructure, and application teams to enforce platform security and compliance.
Act as a Rancher SME, providing hands-on guidance, troubleshooting support, and architectural recommendations.
Create and maintain architecture diagrams, operational runbooks, standards, and platform documentation.
Support OpenShift clusters as needed, primarily from an integration and interoperability perspective.
Required Skills & Experience
- Strong hands-on expertise with Rancher managing Kubernetes clusters using RKE and RKE2 (mandatory).
- Deep understanding of Kubernetes architecture, internals, and day‑2 operations.
- Proven experience managing:
  - HA control planes and etcd
  - Node lifecycle (provisioning, scaling, replacement, decommissioning)
  - Kubernetes upgrades and patching with minimal downtime
- Experience with Kubernetes networking (Calico, Cilium, ingress controllers, load balancing).
- Experience with persistent storage and CSI drivers (NFS, cloud disks, Ceph, Longhorn, etc.).
- Hands-on experience with CI/CD and GitOps workflows (Helm, Jenkins, GitHub Actions, Argo CD).
- Experience supporting mixed Linux and Windows Kubernetes worker nodes.
- Strong troubleshooting skills with the ability to work through complex, production-impacting issues.
- Excellent communication skills, with the ability to explain technical concepts to both technical and non-technical stakeholders.
- Ability to work independently while also collaborating effectively within cross-functional teams.
- Experience with Red Hat OpenShift, particularly in environments such as banking, healthcare, and pharma.
Email: ravinder@shrivetechnologies.com