- Using the latest cloud storage and processing techniques, help design and implement a Data Lake architecture that allows our project to store and process terabytes of data daily. Use Parquet tables as a common storage format and enable processing with Python and Spark.
- Support performance monitoring and help resolve bottlenecks in storage and data transfer to improve scale and cost-effectiveness. Help onboard new users and document processes on the platform.
- With support, analyze data sets within the platform to identify useful insights and patterns through computational modeling with ML libraries such as PyTorch and TensorFlow. These insights will then be incorporated into our product designs, and the conceptual model code will be made ready for use in a production environment.
- Help test, QA, and document data flows and model definitions to ensure that ongoing support is available.
- We plan to use services offered by our cloud-based partners to enhance and expand our data with the help of third-party data sources and insights. To support this effort, assist in creating and documenting processes, contribute insight on offers from third-party providers, and help develop code that enables our platform to communicate with our partners through REST APIs and/or SDKs.
- Help create pipelines that pull data from various sources, such as websites and APIs, and transform it into our core storage format. Write these pipelines in Python, using tools such as Apache Spark and the Data Lake to process the data efficiently.
- Assist in maintaining pipelines, ensuring that they run efficiently and at the correct time. Perform data validation and data quality tasks with some supervision to ensure the accuracy of the data.
- Develop, with supervision, web crawlers to extract HTML from websites using tools such as Python, BeautifulSoup, Selenium, or similar. The crawlers should work on both plain HTML and AJAX-based websites. Provide ongoing support and maintenance for the crawlers, under supervision, to ensure they continue to function properly.
- A combination of relevant internship or work experience and education equivalent to 1-3 years of relevant experience, with a record of accomplishment.
- Proficiency in Python.
- Experience with distributed systems.
- Strong knowledge of data storage technologies.
- Familiarity with relational databases and Elasticsearch.
- Experience tuning data systems for performance and reliability.
- Development experience with PyTorch and TensorFlow on both CPU and GPU targets.
- Knowledge of text processing and image processing techniques.
- Experience extracting data from APIs.
- Education: Bachelor's degree or equivalent work-related experience.
- Experience with data lakes and data mesh architectures.
- Experience with web scraping.
Data Engineer - Princeton, United States - InsideHigherEd
Description
Overview
The Accelerator seeks a Data Engineer to work with team members to assist in developing, deploying, and improving data-intensive applications and processes. As part of a small cross-functional team, this individual will participate in product design and iterative development to support the mission of powering policy-relevant research by building shared infrastructure.
As someone growing in their expertise, this individual usually plans and executes tasks requiring judgment, adapting standard techniques, and sometimes creating new methods to solve problems. They have enough experience to be confident in their abilities and have completed projects. They typically work independently, receiving instructions on the expected outcomes, occasional technical guidance for uncommon issues, and approval from supervisors before starting projects. They collaborate with others to resolve important questions and coordinate work. They may use advanced techniques.
A remote work arrangement within the United States may be considered for candidates with the appropriate background and experience. University-paid business travel to Princeton, NJ may be required approximately 2-4 times per year.
The term of this appointment is 1 year, with the possibility of renewal based upon satisfactory performance and funding.
Responsibilities
Data Lake Design, Implementation and Maintenance
Data Science and Data Augmentation Analysis
Cloud-Based Data Processing
Data Ingestion Pipeline Development
Web-Based Crawler Development
Essential Qualifications:
Preferred Qualifications:
We at the School of Public and International Affairs believe that it is vital to cultivate an environment that embraces and promotes diversity, equity and inclusion - fundamental to the success of our education and research mission. This commitment to diversity informs our efforts in recruitment and hiring as we actively seek colleagues of exceptional ability who represent a broad range of viewpoints, experiences and value systems, and who share Princeton University's dedication to excellence.
Princeton University is an Equal Opportunity/Affirmative Action Employer and all qualified applicants will receive consideration for employment without regard to age, race, color, religion, sex, sexual orientation, gender identity or expression, national origin, disability status, protected veteran status, or any other characteristic protected by law.
- Standard Weekly Hours: 36.25
- Eligible for Overtime: No
- Benefits Eligible: Yes
- Probationary Period: 180 days
- Essential Services Personnel (see policy for detail): No
- Physical Capacity Exam Required: No
- Valid Driver's License Required: No
- Experience Level: Mid-Senior Level