Data Engineer - Pittsburgh, United States - Sheetz
Description
Overview:
Responsible for developing complex, large-scale data models and pipelines that
organize and standardize data to make it readily accessible and consumable by the business for
reporting and data science needs. Collaborate with various areas of the business to determine
and source the appropriate data through internal and external means. Investigate new and existing
technologies and data sources and assess their viability within the Sheetz environment.
1. Co-lead the selection and build of our data and analytic tools, including the Enterprise Data Warehouse, Data Lake, Analytics platform, and Data Catalogue.
2. Design, implement, and manage data architecture and data pipelines across multiple data sources.
3. Collaborate closely with Business and Technical Owners to identify data sources and aggregate structured or unstructured data into dimensional data models that can be consumed by the business.
4. Identify potential process improvements and design and implement automated solutions.
5. Manage the democratization of data knowledge across the organization through the communication and maintenance of master data, metadata, data management repositories, data models, and data standards.
6. Establish standardization and educate data users on query best practices, allowing for re-use and analytic efficiency.
(Equivalent combinations of education, licenses, certifications and/or experience may be considered. Two years of experience is equivalent to one year of college/trade school)
Education
Bachelor's degree in Computer Science, Management Information Systems, Computer Engineering, or a related field required.
Master's degree in Computer Science, MIS, or Computer Engineering preferred.
Experience
Minimum 5 years of experience in a Data Engineer role required
Minimum 5 years of experience working with large databases and data warehouses utilizing both relational and non-relational data models required
End-to-end experience in the analytic lifecycle, from structured/unstructured raw data, data wrangling, and creating data pipelines to self-service dashboards leveraged by the business, required
Advanced knowledge of SQL required
Minimum 5 years of development experience in at least one object-oriented language (Python, Perl, Java, etc.) required
Proficiency in a data visualization tool (Tableau) preferred
Strong understanding of cloud computing database technologies (Azure, AWS, GCP) required
Experience building and optimizing Big Data pipelines, architectures, and data sets required
Experience with data wrangling and data preparation for use within data science, business intelligence, or similar analytical functions required
Licenses/Certifications
None Required
Tools & Equipment
General Office Equipment