3|SHARE
IT Consulting, Digital Marketing, Software Services, Adobe Solutions

Site Reliability Engineer

Santa Clara, California, United States · Hybrid · Full Time · Posted 1 month ago · Visa sponsorship available

Company Description

3|SHARE, a member of Publicis Groupe, empowers organizations to maximize their digital ecosystems through expertise in content supply chain strategy, modern CMS implementations, AI integration, and cloud operations. Headquartered in Boston, with a global presence across Latin America, North America, EMEA, and Asia, we collaborate with clients in diverse industries such as technology, pharmaceuticals, manufacturing, consumer products, and travel & hospitality. As a trusted partner to world-renowned brands, we have successfully delivered thousands of projects in our 15 years of operation. Our mission is to drive efficiency, scalability, and innovation to support the evolving needs of digital teams and help them build the future.

Role Description

This is a full-time, on-site role (open to partial remote AFTER an appropriate amount of time) for a Site Reliability Engineer with our client, a leading AI technology company based in Santa Clara, CA. The Site Reliability Engineer will be responsible for ensuring the reliability, availability, and performance of critical systems. Day-to-day tasks include maintaining and optimizing system infrastructure, troubleshooting issues, developing scalable software solutions, and performing system administration. The engineer will work closely with various teams to automate processes, enhance monitoring capabilities, and improve infrastructure reliability.

What you will be doing:

  • Rapidly debug and triage user-reported issues for the Digital Marketing Organization.
  • Onboard new applications and services onto AWS infrastructure.
  • Contribute to the overall health, performance, and uptime of our services running on Linux and Windows.
  • Implement monitors, alerts, and SOPs to ensure early detection of, and accurate response to, service-impacting issues.
  • Take ownership of automation, scripting, and tooling for new and existing scripts to help the team achieve 100% automation of daily tasks.

What we need to see:

  • MS or BS in Computer Science/Engineering or a related field or equivalent practical experience.
  • 2+ years of experience supporting technical operations in a live-site production environment, with a real passion for automation and tooling.
  • AWS Certifications preferred.
  • Experience building and running critical production services, packaged or custom (Java/PHP), on Windows or Linux.
  • Strong knowledge of Kubernetes on AWS, AWS WAF, and CI/CD pipeline buildout.
  • Strong knowledge of AWS CDN or Akamai CDN configurations.
  • Ability to contribute to the incident management process: early detection of all service-impacting issues, accurate triage, partner communication, impact containment, service restoration, and post-incident follow-up.
  • Experience with scripting and development in Python or Bash, including AWS Lambda functions, to fully automate multi-step tasks into a "one-click" rapid solution.
  • Proven strengths in problem-solving and root-cause analysis, while continuously seeking ways to improve optimization, efficiency, and the bottom line.

Ways to stand out from the crowd:

  • Previous experience with alerting and monitoring systems such as Zabbix, PagerDuty, Datadog, or Splunk, including development, support, and scripting against these systems.
  • Jenkins (Continuous Integration) setup, configuration, and deployment is a requirement.
  • JFrog Artifactory integrations with Jenkins.
  • Strong experience with the AWS cloud platform, Terraform scripting, and Lambda.

ONLY DIRECT CANDIDATES PLEASE, NO RECRUITERS OR AGENCIES
