Sr Data Engineer | Codersbrain
full-time
Posted on April 10, 2025
Job Description
Sr Data Engineer
Job Summary
As a Sr Data Engineer on the AWS Cloud team, you will design and develop data ingestion pipelines that bring data from a variety of sources into the cloud. You will lead the delivery of data products using cloud-native strategies and best practices, drawing on more than 15 years of IT experience.
Responsibilities
- Design and develop distributed systems capable of handling petabytes of data.
- Lead the development of data lakes with data ingestion from disparate sources, including relational databases, flat files, APIs, and streaming data.
- Architect and implement robust ETL pipelines using AWS Glue, defining data extraction methods, transformation logic, and data loading procedures.
- Collaborate with the infrastructure team to provision AWS resources, including databases, services, network design, IAM roles, and AWS clusters.
- Design, orchestrate, and schedule jobs using Airflow.
- Utilize large language models (LLMs) for data classification and identification of PII data entities.
Qualifications
- 15 years of experience in the design and delivery of distributed systems.
- 10 years of experience in the development of data lakes and CI/CD pipelines (GitHub Actions, Jenkins).
- Expertise in core AWS Services such as AWS IAM, VPC, EC2, EKS/ECS, S3, RDS, and others.
- Proficiency in programming languages like Python and PySpark.
- Experience in using Infrastructure as Code (IaC) tools like Terraform.
- Ability to work with NoSQL databases like Amazon DocumentDB.
- Knowledge of AWS AI services like AWS Entity Resolution and Amazon Comprehend.
Preferred Skills
- Experience in the development of data audit, compliance, and retention standards for data governance.
- Familiarity with column-oriented data file formats like Apache Parquet and open table formats like Apache Iceberg.
- Expertise in developing Retrieval-Augmented Generation (RAG) and Agentic Workflows for LLMs.
- Ability to develop re-ranking strategies for improving LLM output quality.
Experience
- 15+ years in IT, with significant experience in distributed systems, AWS services, and data engineering.
- 2-3 years of experience working with NoSQL databases like Amazon DocumentDB.
Environment
- Location: Bangalore, Hyderabad, Pune, Indore, Noida, Mumbai, Ahmedabad, Chennai, Kolkata.
- This is a full-time position with an immediate start date.
Growth Opportunities
Potential for career advancement within the cloud and data engineering teams, expanding into roles such as Data Architect or Cloud Solutions Architect.