DBA [Snowflake & SQL Server]
Company Overview
Not specified.
Job Summary
We are hiring a Senior Production Database Administrator (DBA) who operates at the intersection of precision and speed. The role owns and manages Snowflake, our Massively Parallel Processing (MPP) backbone for Customer360, Article360, and all other master data and business data marts. The DBA will also oversee and optimize our SQL Server farm (15–20 servers), which feeds the Operational Data Store and supports a real-time API stack built on Java and Python. The ideal candidate will focus on tuning long-running queries, preventing costly data shuffles, hardening high availability/disaster recovery (HA/DR), and ensuring smooth data flow across AWS and Microsoft ecosystems. Retail experience is a plus.
Responsibilities
- Snowflake Stewardship (MPP):
  - Operate and optimize warehouses, scaling policies, resource monitors, tasks/streams, and Snowpipe; enforce role-based access control (RBAC), masking and row access policies, object tagging, and auditing.
  - Investigate query performance issues using QUERY_HISTORY, WAREHOUSE_LOAD_HISTORY, GET_QUERY_OPERATOR_STATS, and Query Profile; implement fixes for spills, skew, poor pruning, and high network bytes.
  - Reduce costs and latency through informed clustering choices, micro-partition awareness, result cache strategy, and workload isolation (development/test/production environments).
- SQL Server Operational Data Store (15–20 servers):
  - Manage high availability/disaster recovery (Always On Availability Groups), develop backup/restore strategies (full/differential/log), and meet recovery point objective (RPO) and recovery time objective (RTO) targets.
  - Troubleshoot live issues and diagnose performance problems using Query Store, Extended Events, Dynamic Management Views (DMVs), sp_WhoIsActive, and SQL Sentry.
  - Design systems to support real-time API read patterns, including connection pooling, read replicas, hot indexes, and service level objectives (SLOs) for p95/p99 latency.
- Performance Engineering & Firefighting:
  - Triage long-running queries, address large-scale data reshuffles caused by Business Intelligence (BI) tools, and eliminate Cartesian blow-ups through early filtering.
  - Collaborate with Data Engineering to pre-aggregate or materialize heavy joins and codify repeatable fix patterns.
- Streaming & Change Data Capture:
  - Enable low-latency data flows through Change Data Capture (CDC) and ingestion to Snowflake via Snowpipe, Streams/Tasks, and Kafka/Debezium where applicable.
  - Ensure end-to-end data freshness and reconcile it with API-facing service level agreements (SLAs).
- Observability & Incident Response:
  - Build dashboards and alerts using monitoring tools such as CloudWatch, Azure Monitor, and Datadog to track queue times, spills, credit usage, CPU, I/O, waits, Availability Group health, and API dependencies.
  - Participate in on-call rotations, conduct blameless postmortems, and track remediation from start to finish.
- Security, Compliance, and Governance:
  - Enforce data security controls, including least privilege access, encryption, network policies/private links, and auditing; partner with the security team to manage SOX and PII safeguards and implement data retention strategies.
- Automation & DevOps:
  - Implement Infrastructure-as-Code for database objects and platform configurations using Terraform or CloudFormation; establish CI/CD practices for schema changes, gated releases, and drift detection.
  - Automate repetitive tasks using scripting languages such as PowerShell, Python, and Bash; create self-service runbooks for common operational tasks.
- Capacity & Cost Management:
  - Forecast growth and manage warehouse/instance sizing, track credits and licensing, and recommend right-tiering and reservation strategies.
- Retail Analytics Partnership:
  - Collaborate with BI/Analytics teams (e.g., MicroStrategy, Looker, Power BI) to write high-performing SQL that filters data efficiently and aligns with the clustering and join keys behind common retail KPIs, including traffic, conversion, basket size, promotions, and returns.
- AI-Adjacent Data Readiness (Nice to Have):
  - Support data reliability for applied AI workloads, ensuring feature refresh SLAs, inference-safe schemas, guardrails for large language model (LLM)-driven SQL generation, and baseline performance metrics.
Qualifications
Preferred Skills
- Experience in retail analytics and understanding of retail KPIs.
- Familiarity with data governance, compliance, and security frameworks.
- Knowledge of automation tools and DevOps practices.
Experience
- 7 to 10 years of relevant experience in database administration, with a focus on Snowflake and SQL Server.
Environment
Not specified (remote, in-office, or hybrid).
Salary
Not specified.
Growth Opportunities
Not specified.
Benefits
Not specified.