Senior SRE
Company Description
Inphra.ai is a technology-driven company that provides innovative solutions to optimize system performance, reliability, and scalability. Our mission is to empower organizations with cutting-edge tools and practices that improve infrastructure management and operational efficiency. We are a remote-first company, fostering collaboration, innovation, and flexibility among our global team of experts. At Inphra.ai, we are dedicated to developing a culture of continuous learning and problem-solving to drive impactful results.
Role Description
This is a full-time remote role for a Senior Site Reliability Engineer (SRE). The Senior SRE will be responsible for ensuring the reliability, scalability, and performance of the company's infrastructure. Day-to-day responsibilities include monitoring and maintaining system performance, optimizing system architecture, diagnosing and troubleshooting technical issues, automating operational tasks, and collaborating with cross-functional teams to enhance software development and system performance. The role will also involve designing long-term solutions to complex infrastructure challenges and contributing to incident resolution and post-mortems.
Qualifications
- Proficiency in Site Reliability Engineering principles, with experience implementing and operating reliable and scalable systems
- Strong troubleshooting and problem-solving skills to diagnose and resolve technical issues effectively
- Experience in software development with proficiency in one or more programming languages such as Python, Go, Java, or similar
- In-depth knowledge of system administration, including Linux/Unix systems, networking, and cloud platforms
- Expertise in infrastructure management, such as configuration management, Infrastructure as Code (IaC), and container orchestration (e.g., Kubernetes)
- Hands-on experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack)
- Proven ability to work independently in a remote environment while collaborating effectively across teams
- Excellent written and verbal communication skills for effective documentation and stakeholder communication
- Experience in agile development and DevOps practices is a plus
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent professional experience)