Data Engineer - Pittsburgh, United States - Sheetz

    Description

    Overview:

    Responsible for developing complex, large-scale data models and pipelines that
    organize and standardize data, making it readily accessible and consumable by the business for
    reporting and data science needs. Collaborate with various areas of the business to determine
    and source the appropriate data through internal and external means. Investigate new and existing
    technologies and data sources and assess their viability within the Sheetz environment.

    Responsibilities:

    1. Co-lead the selection and build of our data and analytic tools, including the Enterprise Data Warehouse, Data Lake, Analytics platform, and Data Catalogue.
    2. Design, implement, and manage data architecture and data pipelines across multiple data sources.
    3. Collaborate closely with Business and Technical Owners to identify data sources and aggregate structured and unstructured data into dimensional data models that the business can consume.
    4. Identify potential process improvements and design and implement automated solutions.
    5. Manage the democratization of data knowledge across the organization through the communication and maintenance of master data, metadata, data management repositories, data models, and data standards.
    6. Establish standardization and educate data users on query best practices, enabling re-use and analytic efficiency.

    Qualifications:

    (Equivalent combinations of education, licenses, certifications and/or experience may be considered. Two years of experience is equivalent to one year of college/trade school)

    Education
    Bachelor's degree in Computer Science, Management Information Systems, Computer Engineering, or a related field required.
    Master's degree in Computer Science, MIS, or Computer Engineering preferred.

    Experience
    Minimum 5 years working in a Data Engineer role required.
    Minimum 5 years working with large databases and data warehouses utilizing both relational and non-relational data models required.
    End-to-end experience with the analytic lifecycle, from structured/unstructured raw data and data wrangling, through data pipeline creation, to self-service dashboards leveraged by the business, required.
    Advanced knowledge of SQL required.
    Minimum 5 years of development experience in at least one object-oriented language (Python, Perl, Java, etc.) required.
    Proficiency in a data visualization tool (Tableau) preferred.
    Strong understanding of cloud database technologies (Azure, AWS, GCP) required.
    Experience building and optimizing Big Data pipelines, architectures, and data sets required.
    Experience with data wrangling and preparation for use within data science, business intelligence, or similar analytical functions required.

    Licenses/Certifications
    None Required

    Tools & Equipment
    General Office Equipment