North Star Group Verified
Government Contracting, Cybersecurity, IT Services

Site Reliability Engineer @ GA - ATL - Only Locals

Atlanta, Georgia, United States · Onsite · Full Time · Posted 2 months ago · Visa sponsorship available

We are looking for a highly skilled, experienced Site Reliability Engineer (SRE) with a strong background in monitoring and alerting systems, particularly Splunk. The primary focus of this role is to reduce onscreen monitoring by ensuring that only actionable alerts are in place and by implementing proactive alerting mechanisms.

Responsibilities

  • Develop and refine alerting strategies to minimize onscreen monitoring and focus on actionable alerts.
  • Implement proactive alerting mechanisms to identify and address potential issues before they impact the system.
  • Collaborate with other IT teams to ensure reliability and performance of applications.
  • Automate repetitive tasks to improve efficiency and reduce manual intervention.
  • Conduct root cause analysis and post-mortem reviews to prevent recurrence of issues.
  • Continuously improve monitoring and alerting processes to enhance system reliability.

Requirements

  • 5+ years of core experience applying Site Reliability Engineering principles
  • Develop and implement automation tools and processes to improve efficiency, reduce downtime, and enhance system reliability.
  • Monitor and troubleshoot system issues, identifying root causes and implementing fixes.
  • Build, modify, and monitor real-time dashboards
  • Design and implement defined SLIs and SLOs
  • Assist in the triage and resolution process as needed
  • Identify use cases for toil reduction through detection and resolution
  • Propose changes to improve observability and assist engineers in implementing them
  • Strong scripting and automation skills
  • Ability to work collaboratively in a team environment.
  • Strong communication skills to effectively convey technical concepts to non-technical stakeholders.

Core SRE, Splunk, management, operational management, risk analysis, risk management, safety analysis
