AWS Site Reliability Engineer | Codersbrain
contractual
Posted on July 21, 2025
Job Description
AWS Site Reliability Engineer
Company Overview
Not specified.
Job Summary
The AWS Site Reliability Engineer will focus on maintaining and improving the infrastructure that supports critical cloud-based applications. This role is crucial for ensuring high availability, performance, and reliability of services hosted on AWS, directly contributing to the organization's operational goals.
Responsibilities
- Design, implement, and maintain infrastructure using AWS services to support application needs.
- Monitor system performance and reliability using Datadog and CloudWatch.
- Develop and maintain CI/CD pipelines using GitLab and GitHub to streamline application deployment.
- Manage Kubernetes clusters, including deploying applications, debugging issues, and ensuring security.
- Automate infrastructure provisioning and management using Terraform and scripting languages (Bash, Python).
- Collaborate with development teams to optimize application performance and troubleshoot issues.
Qualifications
- Education: Bachelor's degree in Computer Science, Information Technology, or a related field.
- Technical Skills:
  - Hands-on experience with AWS services: EC2, EKS, SES, SQS, SNS, S3, DynamoDB, RDS/Aurora, OpenSearch, ElastiCache, Security Groups, CloudWatch.
  - High proficiency in Terraform for Infrastructure as Code (IaC).
  - Strong scripting abilities in Bash and Python.
  - Familiarity with the Go programming language.
  - Expertise in using the AWS Command Line Interface (CLI).
  - Proficiency with kubectl for Kubernetes cluster management.
  - Experience with Helm for Kubernetes package management.
Preferred Skills
- Hands-on experience with monitoring and observability tools such as Datadog.
- Familiarity with performance testing tools such as BlazeMeter.
- Understanding of on-call management systems such as PagerDuty.
- High-level knowledge of AWS networking concepts including VPCs and subnets.
Experience
- 8 to 12 years of relevant experience in site reliability engineering, cloud services, and infrastructure management.
Environment
Not specified.
Salary
Not specified.
Growth Opportunities
Not specified.
Benefits
Not specified.