Recruitment Staff at Lion Group | Tech & Non-tech | Always Hiring!
Responsibilities
- Design, build, and maintain scalable, reliable, and secure infrastructure across AWS (including Elastic Beanstalk) and Azure.
- Develop and manage CI/CD pipelines using Azure DevOps, GitHub Actions, or similar tools to ensure smooth, automated deployments.
- Operate, monitor, and troubleshoot Kubernetes clusters (EKS, AKS, or self-managed) to ensure system stability and uptime.
- Implement comprehensive observability solutions using Prometheus, Grafana, Loki, and Alertmanager.
- Automate infrastructure provisioning and configuration using Terraform, Helm, CloudFormation, and/or Ansible.
- Define, measure, and improve system reliability through SLOs, SLIs, and SLAs.
- Enhance system resilience and incident response through proactive monitoring and capacity planning.
- Manage secrets, access control, and security policies to maintain a robust and compliant infrastructure.
- Participate in on-call rotations, respond to incidents, and drive root cause analysis and post-incident reviews.
- Collaborate closely with development teams to embed reliability and scalability best practices throughout the software lifecycle.
Requirements
- 5+ years of experience in a Site Reliability, DevOps, or Cloud Engineering role.
- Strong hands-on experience with AWS (EC2, VPC, IAM, CloudWatch, Elastic Beanstalk, RDS, S3) and familiarity with Azure services.
- Proven experience deploying and managing containerized applications using Kubernetes (EKS / AKS) and Docker.
- Skilled in CI/CD pipeline development and multi-cloud workflows (Azure DevOps, GitHub Actions, etc.).
- Solid understanding of observability tools such as Prometheus, Grafana, Loki, and Alertmanager.
- Proficiency in infrastructure-as-code tools like Terraform, CloudFormation, or similar.
- Scripting skills in Bash, Python, or PowerShell.
- Strong grasp of networking, Linux systems, and cloud security best practices.
- Excellent problem-solving skills with a focus on performance, scalability, and reliability.
Seniority Level
Mid-Senior level
Employment Type
Contract
Job Function & Industries
Engineering and Information Technology
IT Services and IT Consulting