Sr. Data Engineer - San Francisco, United States - hims & hers

    Description
    We're looking for a savvy and experienced Senior Data Engineer to join the Data Platform Engineering team at Hims & Hers.

    As a Senior Data Engineer, you will work with analytics engineers, product managers, software engineers, and the security, DevOps, analytics, and machine learning teams to build a data platform that powers self-service analytics, machine learning models, and data products serving 900,000+ Hims & Hers users.


    You Will:
    Architect and develop data pipelines optimized for performance, quality, and scalability

    Build, maintain, and operate the scalable, performant, containerized infrastructure required for optimal extraction, transformation, and loading of data from various data sources

    Design, develop, and own robust, scalable data processing and data integration pipelines using Python, dbt, Kafka, Airflow, PySpark, Spark SQL, and REST API endpoints to ingest data from external sources into the data lake

    Develop testing frameworks and monitoring to improve data quality, observability, pipeline reliability, and performance

    Orchestrate sophisticated data flow patterns across a variety of disparate tools

    Support analytics engineers, data analysts, and business partners in building tools and data marts that enable self-service analytics

    Partner with the rest of the Data Platform team to set best practices and ensure they are followed

    Partner with the analytics engineers to ensure the performance and reliability of our data sources

    Partner with machine learning engineers to deploy predictive models

    Partner with the legal and security teams to build frameworks and implement data compliance and security policies

    Partner with DevOps to build infrastructure-as-code (IaC) and CI/CD pipelines

    Support code versioning and code deployments for data pipelines

    You Have:

    8+ years of professional experience designing, creating, and maintaining scalable data pipelines using Python, API calls, SQL, and scripting languages

    Demonstrated experience writing clean, efficient, and well-documented Python code, and a willingness to become effective in other languages as needed

    Demonstrated experience writing complex, highly optimized SQL queries across large data sets

    Experience with cloud technologies such as AWS and/or Google Cloud Platform

    Experience with the Databricks platform

    Experience with IaC technologies like Terraform

    Experience with data warehouses and databases like BigQuery, Databricks, Snowflake, and Postgres

    Experience building event streaming pipelines using Kafka/Confluent Kafka

    Experience with modern data stack tools such as Airflow/Astronomer, Databricks, dbt, Fivetran, Confluent, and Tableau/Looker

    Experience with containers (Docker) and container orchestration tools (Kubernetes)

    Experience with Machine Learning & MLOps

    Experience with CI/CD tools (Jenkins, GitHub Actions, CircleCI)

    Thorough understanding of the SDLC and Agile frameworks

    Project management skills and a demonstrated ability to work autonomously


    Nice to Have:
    Experience building data models using dbt

    Experience with JavaScript and event tracking tools like Google Tag Manager (GTM)

    Experience designing and developing systems with desired SLAs and data quality metrics

    Experience with microservice architecture

    Experience architecting an enterprise-grade data platform
