No more applications are being accepted for this job
- Top skills needed
- Python, PySpark, and related frameworks and libraries (e.g., NumPy, pandas, SciPy, PyTorch, PyArrow)
- Modeling data in relational databases (e.g., PostgreSQL, MySQL) and file-based databases; ETL processes and data warehousing concepts
- AWS services such as S3, Glue, EMR, and Redshift; deploying data solutions on cloud platforms
- Design, develop, and maintain robust data pipelines using Python and PySpark to process large volumes of healthcare data efficiently in a multitenant analytics platform.
- Collaborate with cross-functional teams to understand data requirements, implement data models, and ensure data integrity throughout the pipeline.
- Optimize data workflows for performance and scalability, considering factors such as data volume, velocity, and variety.
- Implement best practices for data ingestion, transformation, and storage in AWS services such as S3, Glue, EMR, and Redshift.
- Model data in relational databases (e.g., PostgreSQL, MySQL) and file-based databases to support data processing requirements.
- Design and implement ETL processes using Python and PySpark to extract, transform, and load data from various sources into target databases.
- Troubleshoot and enhance existing ETLs and processing scripts to improve efficiency and reliability of data pipelines.
- Develop monitoring and alerting mechanisms to proactively identify and address data quality issues and performance bottlenecks.
Senior Data Engineer - Fort Lauderdale, United States - Quess
Description
5 years of experience
DUTIES AND RESPONSIBILITIES: