Intercontinental Exchange
Financial Services, FinTech, Stock Exchange, Data Services

Senior Site Reliability Engineer

Jacksonville, Florida, United States | Onsite | Full Time | Senior | Posted 2 months ago | Visa sponsorship available

Role summary

The Senior Site Reliability Engineer will support ST Application services, focusing on deployment and incident management. Responsibilities include building automation for incident prevention, performance bottleneck detection, and maintenance activities. The role requires deep troubleshooting skills to enhance the availability, performance, and security of IMT Services. Key tasks involve coding and automation on Linux, Windows, and Cloud Platforms, implementing automated tests and deployments, and collaborating with Product and Support teams. The engineer will also define non-functional requirements, contribute to product development for Quality of Service, and partner with other SREs, leading by example. Expertise in monitoring, alerting, incident response, and root cause analysis is essential.

Overview:

Job Purpose

This is a new SRE position that assists with day-to-day activities supporting ST Application services, specifically deployment and incident management. The engineer will build actionable alerts and automation to prevent incidents, detect performance bottlenecks, and identify maintenance activities.

Responsibilities

  • Employ deep troubleshooting skills to improve the availability, performance, and security of IMT Services.
  • Code and automate applications on Linux, Windows, and cloud platforms
  • Implement automated tests, automated deployments, and operational tools
  • Collaborate with Product and Support teams to plan and deploy product releases
  • Work with Linux, Windows, Cloud Platforms, and Operations leaders to develop narratives and support backlog grooming, epic planning, and overall sprint planning processes
  • Work with Engineering leadership to build shared services that meet the requirements and needs of the platform and application teams
  • Ensure services are designed with 24/7 availability and operational readiness and rigor
  • Implement proactive monitoring, alerting, trend analysis, and self-healing systems
  • Define non-functional requirements as part of the product lifecycle to influence the new designs, standards, and methods for scalable, highly available distributed systems
  • Contribute to product development / engineering as needed to ensure Quality of Service of Highly Available services
  • Identify, evaluate, and execute preventive measures to minimize or avoid impact to the customer experience, so that issues are caught proactively rather than escalated by customers
  • Resolve product or service defects, and address design, infrastructure, or operational changes
  • Partner with other SREs and lead by example: be a contributor more than a delegator
  • Develop partnership-oriented relationships with business executives and functional leaders, especially as it relates to operations and technology

Knowledge and Experience

  • 7+ years of Systems/Applications automation in 24x7 Production support services environments
  • BS in Computer Science, Computer Engineering, Math, or equivalent professional experience
  • Fluency with one or more current-generation scripting languages (Python/Shell/Perl/PHP/Ruby) and/or Java or .NET development
  • 7+ years of experience managing Red Hat Enterprise Linux (required)
  • Excellent troubleshooting skills, utilizing a systematic problem-solving approach
  • Demonstrated experience in designing, analyzing, and diagnosing large-scale distributed systems, as well as Windows Server and/or Linux systems internals (system libraries, file systems, client-server protocols)
  • Experience with elastic scalability, fault tolerance, and other cloud architecture patterns
  • Experience with Continuous Integration and Continuous Delivery concepts
  • Nice to have: experience with containerization technologies such as Kubernetes
  • Proven strength in SaaS services and experience in massive-scale web operations
  • Must be able to multitask in a fast-paced environment with a focus on timeliness, documentation, and communication with peers and business users alike
  • Expertise with monitoring, alerting and incident response tools and performing root cause analysis
  • Experience with deployment automation tools like UCD
  • Experience with Azure DevOps (ADO)

Intercontinental Exchange, Inc. is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to legally protected characteristics.
