Site Reliability Engineer | Codersbrain

Full-time
Posted on June 9, 2025

Job Description

Site Reliability Engineer

Company Overview

[Company name] is a leading technology firm specializing in cloud-based solutions and infrastructure management. We foster a culture of innovation and collaboration, striving to deliver high-quality services to our clients while ensuring a supportive environment for our employees.

Job Summary

The Site Reliability Engineer (SRE) plays a crucial role in maintaining and enhancing the reliability, performance, and efficiency of our systems. This position is responsible for designing and implementing scalable solutions in the AWS ecosystem, utilizing monitoring tools, and automating processes to improve operational workflows.

Responsibilities

  • Design, implement, and maintain scalable and reliable infrastructure on AWS.
  • Utilize Dynatrace for monitoring, performance tuning, and troubleshooting of applications and services.
  • Develop automation scripts to streamline deployment processes.
  • Lead chaos engineering initiatives to proactively identify weaknesses in our systems and improve resilience.
  • Collaborate with development teams to integrate reliability into the software development lifecycle.
  • Automate operational processes to enhance efficiency and reduce manual intervention.
  • Participate in on-call rotations to support incident response and resolution.

Qualifications

  • Educational Background: Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Technical Skills:
    • Strong proficiency in AWS services (EKS, EC2, DynamoDB, Lambda, etc.).
    • Experience with Dynatrace or similar monitoring tools for application performance management.
    • Familiarity with chaos engineering principles and tools.
    • Solid understanding of load testing methodologies and tools.
    • Proficiency in scripting languages and configuration management tools.
  • Soft Skills:
    • Excellent problem-solving skills.
    • Ability to work under pressure and manage multiple tasks effectively.

Preferred Skills

  • Previous experience in a Site Reliability Engineer or similar role.
  • Knowledge of container orchestration and management, particularly with Kubernetes.

Experience

  • Minimum of 5 years of relevant experience in Site Reliability Engineering or related fields.

Environment

  • This position can be based in Kolkata or Mumbai, or performed remotely. Candidates are expected to work 8-9 hours daily.

Salary

  • Salary details will be discussed during the interview process.

Growth Opportunities

  • Opportunities for career advancement within the company as the SRE team expands and new projects are initiated.
