Data Architect | Scrabble
Posted on November 11, 2022
Job Description
<div>
<div>
<div>
<div>
<p>Primary Responsibilities</p>
<ul>
<li>
<p>● Collaborate with the Tech and Analytics teams to build and maintain the infrastructure required for optimal extraction, transformation, and loading of data from a variety of data sources</p>
</li>
<li>
<p>● Create and maintain scalable ETL pipelines that feed organization-wide data</p>
</li>
<li>
<p>● Design and maintain ideal architecture for data tables to ensure optimal querying performance in relational databases</p>
</li>
<li>
<p>● Mentor the data engineers in the team on best practices and projects.</p>
</li>
<li>
<p>● Create and maintain connectors that expose the data securely for consumption by downstream systems and services in near real-time.</p>
</li>
<li>
<p>● Help build the ML pipelines and integrate ML models in Zepto applications</p>
</li>
<li>
<p>● Build data governance and security protocols and monitor adherence</p>
</li>
</ul>
<p>What Are We Looking For?</p>
<ul>
<li>
<p>● 6-10 years of experience in data engineering: designing databases, building data pipelines, and maintaining data governance protocols on cloud platforms</p>
</li>
<li>
<p>● Hands-on experience with Python, ETL pipelines, and advanced SQL</p>
</li>
<li>
<p>● Understanding of AWS services: Redshift, Lambda, Glue, Athena, and security protocols</p>
</li>
<li>
<p>● Experience with any cloud data warehouse (Redshift, Snowflake, or BigQuery) and with data layer solutions such as Apache Hudi, Delta Lake, and Apache Iceberg</p>
</li>
<li>
<p>● Experience setting up a real-time data processing system with Apache Spark/Apache Flink and PySpark.</p>
</li>
<li>
<p>● Experience with design, test-driven development, code review, and implementing CI/CD using GitHub/GitLab/Docker</p>
</li>
<li>
<p>● Experience gathering and processing raw data at scale, including writing scripts and Spark jobs.</p>
</li>
<li>
<p>● Comfortable setting up query engines such as Presto and Trino.</p>
</li>
<li>
<p>● Strong data modelling and database design experience with Redshift or other relational databases.</p>
</li>
</ul>
</div>
</div>
</div>
</div>