
Principal System Engineer - (SRE-CCS)
Join AT&T and help shape the future of communications and technology that connect the world. We value innovators who seek to explore the unknown and challenge the status quo. Bring your bold ideas and fearless spirit to redefine connectivity and transform how people share stories and experiences. At AT&T, you won’t just imagine the future—you’ll build it.
Principal System Engineer - Converged Charging Systems (SRE-CCS)
NOTE: For this role you must have current Amdocs experience in Converged Charging Systems (CCS).
The selected candidate will combine traditional T2/SRE operations technical skills, to support our Converged Charging Systems (CCS) and One Mediation apps, with new, evolving Generative AI and Workflow Automation skillsets that drive operational efficiency and scalability.
- As a Tier 2/Site Reliability Engineer (SRE), you will translate core business requirements into robust, scalable, and reliable technical solutions. You’ll play a pivotal role in designing and implementing applications, platforms, and services that power critical business operations, with a strong emphasis on high availability, performance, and compliance in cloud, messaging, and data environments.
- You will bring a hybrid of traditional T2/SRE operations technical skills, to support our Converged Charging Systems (CCS) and One Mediation apps, and the new, evolving Generative AI and Workflow Automation skillsets needed to drive operational efficiency and scalability.
- Provide technical expertise and best practices for Java, Python, JavaScript, and Perl-based solutions.
- Practical understanding of AI/ML concepts and their integration in enterprise platforms.
Key Responsibilities
- The EngOps Tier 2/SRE team ensures applications and systems are highly reliable, scalable, and performant while fostering a collaborative culture between development and operations.
- Work with the T1 team as triage lead on incidents during outages or critical PagerDuty issues.
- Minimize downtime and user impact during incidents.
- Conduct detailed After Action Reviews involving all stakeholders and outline short-term and long-term resiliency options.
- Eliminate recurrence of similar issues through systemic fixes.
- Define and implement monitoring and alerting strategies tailored to the launch.
- Collaborate with Product development teams to gain deep insight into the application architecture, flows and critical dependencies.
- Monitor and evaluate key performance metrics such as latency, throughput, and error rates, and update alerts accordingly.
- Propose architectural or operational changes to prevent recurrence.
- Reduce Mean Time to Resolution (MTTR) for incidents.
Required Qualifications
MUST HAVE:
For this role you must have current Amdocs experience. You must possess a hybrid of traditional T2/SRE operations technical skills, to support our Converged Charging Systems (CCS) and One Mediation apps, and the new, evolving Generative AI and Workflow Automation skillsets needed to drive operational efficiency and scalability.
- Experience: 10+ years of hands-on experience architecting and building scalable platforms and applications in cloud/data SRE environments.
- Expert-level experience with Python, Java, JavaScript, and Perl-based solutions in an SRE role.
- Practical understanding of AI/ML concepts and their integration in enterprise platforms.
- Education: Bachelor’s degree in Computer Science, Information Systems, or a related discipline.
This position requires office presence a minimum of 5 days per week and is available only in the location(s) posted. No relocation is offered.
Supervisory: No