SoTalent
Human Resources, Software, Artificial Intelligence, Recruiting Technology
Site Reliability Engineer
Houston, Texas, United States · Hybrid · Full Time · Posted 1 month ago
Job Title: Site Reliability Engineer
Location: Houston, Texas, United States
Type: Full Time
Our client is seeking a skilled and proactive Site Reliability Engineer (SRE) to strengthen their platform reliability, optimize support processes, and drive continuous improvement across their technology ecosystem.
This is a hybrid position requiring an on-site presence 3 to 4 days per week. Please note that the number of on-site days may increase based on business needs.
What You’ll Do
- Develop and enhance the end‑to‑end process for managing support issues—from intake to resolution—working closely with senior team members.
- Participate in and occasionally lead strategic initiatives focused on improving the flexibility, scalability, and long‑term sustainability of core products.
- Collaborate with cross‑functional teams including Level 1 Support, Engineering, DevOps, and customer stakeholders to drive operational improvements.
- Work closely with offshore SRE teams as well as global support management.
- Conduct thorough root cause analyses for critical incidents to reduce recurring issues and strengthen system stability.
- Maintain strong knowledge of the system architecture, application ecosystem, and integrations; partner with platform teams to enhance monitoring and alerting.
- Support technology evaluations, process improvements, compliance alignment, and strategic planning under the direction of the SRE Manager.
Minimum Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field; equivalent practical experience is also welcome.
- 3+ years in a technical operations or support-focused role.
- 3+ years working with enterprise cloud environments.
- Ability to work extended or off-cycle hours and participate in a 24/7 on‑call rotation.
Preferred Qualifications
- 5+ years of technical operations or support experience.
- Experience with AWS cloud services.
- Familiarity with monitoring and APM tools such as Datadog, New Relic, Nagios, or Splunk.
- Background in agile development environments.
Similar roles
- Site Reliability Engineer · Pacer Group · Montreal, Quebec, Canada · Hybrid
- Senior Site Reliability Engineer · Basis Theory · United States · Remote
- Senior Site Reliability Engineer · Block Inc · New York, New York, United States · Remote
- Senior Site Reliability Engineer · Block Inc · Bay, California, United States · Remote
- Senior Site Reliability Engineer · Uplink · United States · Hybrid