Site Reliability Engineer | Scrabble

full-time

Posted on 09-07-2025

Job Description

Leadership – Site Reliability & Platform Architect

Company Overview

Information about the company is not provided.

Job Summary

The Site Reliability & Platform Architect will play a critical role in scaling a high-performance Software as a Service (SaaS) platform that powers logistics automation for over 500 global enterprises, including Unilever and Apple. The role leads the evolution of cloud infrastructure, DevOps maturity, and backend platform architecture to handle the platform's growing complexity and scale.

Responsibilities

Own infrastructure architecture: Design and evolve cloud-native systems focusing on scalability, availability, cost efficiency, and security.
Lead backend platform design: Collaborate with product and engineering teams to create performant, modular, and reliable backend systems.
CI/CD & deployment strategy: Build and scale deployment pipelines, using blue-green and canary deployments to keep rollouts smooth.
Orchestrate systems: Manage containerized workloads on Kubernetes (EKS/GKE) or ECS.
Observability & performance: Standardize monitoring, tracing, and logging across systems, and lead capacity planning and performance tuning.
Infrastructure as Code (IaC): Define and maintain scalable infrastructure using tools like Terraform and Helm.
Mentor & lead: Guide engineering teams in cloud architecture, system design, and operational excellence.
Champion reliability: Define Service Level Objectives (SLOs), Service Level Indicators (SLIs), incident response processes, and proactive fault mitigation.
Own system security: Establish and enforce best practices for infrastructure and application security, access control, secrets management, and compliance readiness.

Qualifications

Experience:
- 4+ years of backend or infrastructure experience in high-scale, production-grade systems.
- 3+ years of hands-on backend development experience (e.g., Ruby, Node.js, Python, or Java).
- 3+ years in system design, API development, and performance optimization.
- 1+ years in a technical leadership role focused on DevOps/SRE/platform engineering.
Technical Skills:
- Proven experience architecting and running infrastructure on AWS (preferred), Google Cloud Platform (GCP), or Azure.
- Deep understanding of cloud-native architecture, microservices, and distributed systems.
- Proficient with Docker, Kubernetes, Terraform, and observability tools (Prometheus, Grafana, ELK, OpenTelemetry).
- Strong programming/scripting skills in Python, Go, or Bash; comfortable reading production backend code (e.g., Ruby or Node.js).
- Experience with relational and NoSQL databases like PostgreSQL, MongoDB, and Redis.
Mindset: A solution-oriented mindset with the ability to balance technical depth with business context.

Preferred Skills

Experience with service mesh, multi-region high availability (HA) systems, or event-driven architectures.
Background in security, compliance, or cost optimization.

Experience

7 to 14 years of relevant experience, particularly in high-scale environments and leadership roles.

Environment

This is a full-time position based in Bangalore.

Salary

Salary details are not provided.

Growth Opportunities

Information regarding potential career advancement opportunities within the company is not provided.

Benefits

Details of offered benefits are not provided.