Sr. Data Engineer - San Francisco, United States - hims & hers
Description
We're looking for a savvy and experienced Senior Data Engineer to join the Data Platform Engineering team at Hims.
As a Senior Data Engineer, you will work with analytics engineers, product managers, and software engineers, as well as the security, DevOps, analytics, and machine learning teams, to build a data platform that backs the self-service analytics, machine learning models, and data products serving 900,000+ Hims & Hers users.
You Will:
Architect and develop data pipelines to optimize performance, quality, and scalability
Build, maintain & operate scalable, performant, and containerized infrastructure required for optimal extraction, transformation, and loading of data from various data sources
Design, develop, and own robust, scalable data processing and data integration pipelines using Python, dbt, Kafka, Airflow, PySpark, SparkSQL, and REST API endpoints to ingest data from various external data sources into the data lake
Develop testing frameworks and monitoring to improve data quality, observability, pipeline reliability, and performance
Orchestrate sophisticated data flow patterns across a variety of disparate tooling
Support analytics engineers, data analysts, and business partners in building tools and data marts that enable self-service analytics
Partner with the rest of the Data Platform team to set best practices and ensure they are followed
Partner with the analytics engineers to ensure the performance and reliability of our data sources
Partner with machine learning engineers to deploy predictive models
Partner with the legal and security teams to build frameworks and implement data compliance and security policies
Partner with DevOps to build IaC and CI/CD pipelines
Support code versioning and code deployments for data pipelines
You Have:
8+ years of professional experience designing, creating and maintaining scalable data pipelines using Python, API calls, SQL, and scripting languages
Demonstrated experience writing clean, efficient, and well-documented Python code, and a willingness to become effective in other languages as needed
Demonstrated experience writing complex, highly optimized SQL queries across large data sets
Experience with cloud technologies such as AWS and/or Google Cloud Platform
Experience with Databricks platform
Experience with IaC technologies like Terraform
Experience with data warehouses like BigQuery, Databricks, Snowflake, and Postgres
Experience building event streaming pipelines using Kafka/Confluent Kafka
Experience with modern data stack tools like Airflow/Astronomer, Databricks, dbt, Fivetran, Confluent, Tableau/Looker
Experience with containers and container orchestration tools such as Docker or Kubernetes
Experience with Machine Learning & MLOps
Experience with CI/CD (Jenkins, GitHub Actions, CircleCI)
Thorough understanding of SDLC and Agile frameworks
Project management skills and a demonstrated ability to work autonomously
Nice to Have:
Experience building data models using dbt
Experience with JavaScript and event tracking tools like GTM
Experience designing and developing systems with desired SLAs and data quality metrics
Experience with microservice architecture
Experience architecting an enterprise-grade data platform