HumanBit Logo

Senior Principal Site Reliability Engineer | Codersbrain

Full-time
Posted on September 10, 2025

Job Description

Sr Principal Site Reliability Developer

Company Overview

Not specified.

Job Summary

The Sr Principal Site Reliability Developer tackles complex challenges in cloud infrastructure services and develops automation tools to prevent those issues from recurring. The role focuses on improving the performance, availability, and scalability of Oracle products and services, in support of the organization's goal of reliable and efficient service delivery.

Responsibilities

  • Collaborate with the Site Reliability Engineering (SRE) team to ensure shared full-stack ownership of multiple services and/or technology areas.
  • Understand end-to-end configuration, technical dependencies, and behavior of production services.
  • Design and deliver mission-critical stacks, emphasizing security, resiliency, scalability, and performance.
  • Serve as an authority on end-to-end performance and operability of services.
  • Partner with development teams to determine and implement improvements in the service architecture.
  • Articulate technical characteristics of services and guide development teams in enhancing the Oracle Cloud service portfolio.
  • Demonstrate a clear understanding of automation and orchestration principles.
  • Act as the ultimate escalation point for complex or critical issues lacking established Standard Operating Procedures (SOPs).
  • Utilize knowledge of service topology and dependencies to troubleshoot problems and recommend mitigations.
  • Analyze the impact of product architecture decisions on distributed systems, showcasing professional curiosity and a desire for in-depth understanding of services and technologies.

Qualifications

  • 10+ years of experience in site reliability, cloud infrastructure, or a related field.
  • Strong proficiency in English, both written and spoken.
  • Deep understanding of automation and orchestration principles.
  • Experience in designing and developing large-scale distributed systems.
  • Knowledge of service capacity planning, demand forecasting, software performance analysis, and system tuning.
  • Ability to articulate and communicate technical characteristics effectively.
  • Professional curiosity and a proactive approach to problem-solving.

Preferred Skills

  • Experience with Oracle products and services, as well as familiarity with the Oracle Cloud service portfolio.
  • Prior experience working with complex distributed systems and performing system tuning.

Experience

  • 10+ years of relevant experience in site reliability engineering or related fields.

Environment

  • Location: Bengaluru, Karnataka, India.
  • This position is not eligible for visa or work permit sponsorship.
  • The role is an individual contributor position within the Product Development job family.

Salary

Not specified.

Growth Opportunities

Not specified.

Benefits

Not specified.
