REQ Solutions Verified
IT Consulting, Software Development, Digital Transformation, Cloud Services

Azure SRE Lead (Disaster Recovery/ Resiliency)

New York, New York, United States | Hybrid | Contract | Lead | Posted 2 months ago | Visa sponsorship available

Role summary

The Azure Cloud Resiliency & Disaster Recovery Lead is a 12-month contract role based in NYC, requiring 4 days onsite per week. This individual contributor position focuses on managing the organization's resiliency strategy and execution, including setting governance, driving disaster recovery (DR) and contingency capabilities, and ensuring readiness and rapid recovery. The role involves working with application developers to build DR plans, implementing regular DR exercises, and tailoring policies for Azure cloud environments with on-prem connectivity. While Azure experience is primary, experience with other cloud platforms like AWS or GCP is acceptable. The candidate must have recent experience with contingency plans and a strong background in SRE or related technical management fields.

Job Title: Azure Cloud Resiliency & Disaster Recovery Lead

Duration: 12-month initial contract with a high possibility of extension

Location: Onsite 4 days per week in the NYC office

Important Technical Notes:

  • Working with lead application developers to build out contingency plans (disaster recovery plans)
  • Implementing regular cadence for disaster recovery exercises
  • The role is tailored toward cloud-hosted environments with connectivity back to on-premises systems
  • Work is in Azure, but experience with any cloud-hosted environment (e.g., AWS, GCP) is acceptable
  • Individual contributor with no direct reports
  • An SRE candidate could fit, provided they have recent experience with contingency plans
  • Two rounds of interviews, including one in-person interview

Description:

The Technical Execution Lead will manage the resiliency strategy and its execution: setting governance, driving DR and contingency capabilities, and ensuring measurable readiness and rapid recovery.

Responsibilities:

  • Establish resiliency strategy and governance, including policy frameworks, standards, control objectives, and success metrics aligned to business impact and risk appetite.
  • Create disaster recovery roadmaps; ensure KPIs and recovery objectives (RPO/RTO) are defined, measured, and reported across platforms.
  • Oversee design, publication and ongoing maintenance of contingency planning policies and procedures tailored to an Azure public cloud environment; ensure adoption across lines of business and platforms.
  • Direct development of system-specific contingency plans, setting expectations for roles/responsibilities, recovery objectives, restoration priorities, failover patterns, and executive communication protocols.
  • Institutionalize a program for contingency training and exercises (tabletop, functional, and full-scale), including plan testing, after-action reviews, and remediation tracking with transparent ownership and target dates.
  • Set up executive-level resiliency testing reports and dashboards covering readiness posture, drill performance, plan coverage, corrective action closure, and remediation backlog.
  • Lead continuous improvement: analyze cross-platform lessons learned, prioritize enterprise-level remediation and investments, and update policies/standards accordingly.

Experience:

  • 10+ years in technical management, reliability/operations, or platform engineering, with 7+ years leading resiliency/DR/BCP at enterprise scale in cloud environments.
  • Demonstrated experience designing and delivering Azure resiliency programs across multiple platforms/business lines, including architecture standards, program governance, and executive reporting.
  • Experience with Azure Site Recovery, Azure Backup, cross-region replication, and hybrid DR strategies, as well as leading enterprise-wide DR testing & cloud failover exercises.
  • Expertise in defining and enforcing KPIs/SLAs, recovery objectives (RPO/RTO), and control adherence; experience presenting to senior leadership, risk committees, and auditors/regulators.
  • Deep familiarity with standards and frameworks (e.g., NIST SP 800-34, ISO 22301, ISO 27031) and operational best practices (e.g., ITIL), with proven ability to translate them into actionable controls and operating procedures.
  • Exceptional stakeholder management, executive communication, and influence skills; able to align diverse teams and make decisions that balance risk, cost, and client impact.

Education:

  • Bachelor’s Degree in Computer Science, Information Systems, Information Technology, Engineering, or related field required; Master’s degree strongly preferred.
  • Advanced certifications are a plus (e.g., Azure Solutions Architect Expert, CBCP, ISO 22301 Lead Implementer, ITIL v4).