Lead Site Reliability Engineer - Remote
ProfitSolv is a SaaS business services provider for the legal and accounting industry. We are looking for a Lead Site Reliability Engineer to join our growing team!
We are seeking a hands-on Lead Site Reliability Engineer to oversee our AWS cloud infrastructure and Microsoft SQL Server environments. This role leads a team of two DBAs and two AWS Cloud Engineers and serves as the primary owner of platform performance, reliability, and operational excellence.
What we provide:
- Opportunity to Invest in Your Future. We offer a 401K match.
- Paid Time Off. Enjoy paid time off and paid holidays.
- Great Coverage. Take advantage of health, dental, and vision coverage, plus HSA and FSA options.
- A Great Team. Collaborate with smart, curious, hardworking individuals.
- Performance Compensation. Be rewarded for your hard work with performance-based merits.
- Remote Work. Want to work from home? No problem!
As a Lead Site Reliability Engineer, you will:
- Lead and manage a team of DBAs and AWS Cloud Engineers
- Own infrastructure performance, reliability, and system health
- Maintain and improve Apdex scores and overall platform performance
- Define and prioritize a backlog of performance, maintenance, and operational initiatives
- Oversee a distributed Microsoft .NET application running on EC2
- Manage and optimize Microsoft SQL Server environments across multiple servers
- Ensure high availability, fault tolerance, and rapid recovery capabilities
- Design and maintain disaster recovery strategies
- Proactively monitor systems using logs, alerts, and performance metrics
- Build and optimize CI/CD pipelines for reliable, automated deployments
- Oversee deployments with minimal downtime and safe rollback strategies
- Implement best practices for release management and version control
- Use New Relic to identify bottlenecks and drive performance improvements
- Deliver weekly system health and performance reports
- Manage team workload, schedules, and on-call rotations
- Other duties as assigned
This position follows established policies and procedures to keep confidential information secure.
A great fit for this position has:
- Strong hands-on experience with AWS (EC2, S3, Route 53, Load Balancing)
- Experience with Cloudflare for performance and security optimization
- Expertise in CI/CD tools (e.g., Azure DevOps, GitHub Actions, Jenkins)
- Strong knowledge of deployment strategies (blue-green, rolling, canary)
- Proficiency with New Relic or similar observability tools
- Experience supporting Microsoft .NET applications and SQL Server
- Deep understanding of performance tuning, scalability, high availability, and disaster recovery
- Proven leadership experience managing technical teams
- Excellent communication skills with both technical and non-technical stakeholders
Tech Stack
C#, .NET 8, ASP.NET Core, React/Angular/Vue, TypeScript, EF Core, gRPC, MS SQL Server, Aurora, DynamoDB, Redis, OpenSearch, AWS (ECS, Lambda, API Gateway, S3, CloudFront), Terraform/CDK, GitHub Actions, Serilog, OpenTelemetry, CloudWatch, X-Ray, AI productivity tools.
Additional Desirable Qualifications
- Ability to sit for prolonged periods at a desk and work on a computer.
- Must be able to lift up to 15 pounds at times.
- Ability to handle stress.
- Ability to meet work deadlines.
Our commitment to you: At ProfitSolv, we are committed to being a diverse and inclusive workplace as an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, age, national origin, protected veteran status, disability status, sexual orientation, gender identity or expression, marital status, genetic information, or any other characteristic protected by law. We embrace a diverse group of backgrounds and experiences to connect with clients, solve problems, and innovate.
Work location: Remote – U.S. only