    Data Engineer - Westport, United States - Catalytic Data Science

    2 weeks ago

    Job Description

    Data Engineer III (Large Language Models)

    About Catalytic Data Science (CDS):

    Catalytic Data Science is a groundbreaking cloud R&D platform designed to integrate volumes of scientific resources, data, and analytic tools while providing the ability to network with colleagues in one secure and scalable environment. By enabling R&D teams to work more collaboratively and improving productivity company-wide, the Catalytic platform helps teams achieve key R&D milestones faster and with greater accuracy. Our customers are passionate about making the world a better place, and we are inspired by the opportunity to help them.

    The Role

    You are a Data Engineer with experience in processing terabytes of data and working with large language models (LLMs). You have experience in creating and automating scalable, fault-tolerant, and reproducible data pipelines for natural language processing (NLP) using AWS technologies. You will design and implement data ingestion, processing, and storage solutions that can handle massive amounts of text data from various sources. You are interested in helping to create a platform built entirely on top of AWS. You are eager to join a team of Life Scientists and Software Engineers who believe the brightest minds in research should have the best tools to drive innovation.

    What You'll Do

    • Build, test, and operate automated Extract, Transform, and Load (ETL) pipelines that process terabytes of text data nightly
    • Develop service frontends around our various backend data stores (AWS Aurora, MySQL, Elasticsearch, S3)
    • Rapidly prototype, test, and deploy data pipelines for LLMs using AWS.
    • Collaborate with data scientists and NLP engineers to understand the data requirements and specifications for LLMs and related tasks such as text summarization, translation, and question answering.
    • Optimize the performance, reliability, and scalability of the data pipelines and LLMs by applying best practices and techniques such as data partitioning, caching, compression, and monitoring.
    • Ensure the quality, integrity, and security of the data by implementing data validation, cleaning, and governance policies and procedures.
    • Research and evaluate new technologies and methods for data engineering and LLMs and stay updated with the latest trends and developments in the field.
    • Participate in data architecture and engineering decisions, bringing your strong experience and knowledge to bear.

    Qualifications

    • Bachelor's degree or higher in computer science, engineering, or a related field.
    • 6+ years of software engineering experience of any kind (including data engineering), of which 3+ years are in data engineering, preferably with large-scale text data and LLMs.
    • Proficient in Python 3 or Java, preferably both.
    • Experience with data modeling, ETL, and data warehouse design and implementation.
    • Expertise with ETL schedulers such as Airflow, Prefect or similar frameworks.
    • Familiar with LLMs and NLP concepts and frameworks such as Transformers, BERT, GPT, PaLM, and LLaMA.
    • Day-to-day experience using AWS technologies such as Lambda, ECS Fargate, SQS, and SNS.
    • Experience extracting, processing, storing, and querying petabyte-scale datasets.
    • Familiarity with building and using containers.
    • Familiarity with event-based microservices.
    • Strong communication, collaboration, and problem-solving skills.

    Core Skills:

    1. ETL Processes
    2. Data Modeling and Database Design
    3. Proficiency in Large Language Models
    4. Data Pipeline Optimization
    5. Cross-functional Collaboration
    6. Problem-solving and Analytical Skills

    Nice-to-Haves

    • Prior experience with Elasticsearch (custom development and/or administration) is a huge plus
    • Knowledge of Graph databases

    What Do We Love in Team Members?

    Your specialization is less important than your ability to learn fast and adapt to shifting technologies. We're especially fond of people who:

    • Focus on customers' needs and our company's goals, not just writing code
    • Iterate until customers love what you've built
    • Self-start and initiate
    • Self-organize
    • Strive to grow personally and professionally, beyond just expanding technical abilities
    • Love to experiment with new technology and share knowledge with the team

    In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification document form upon hire.


    remote work

  • Bridgewater Associates

    Data Engineer

    1 week ago


    Bridgewater Associates Westport, United States

    About Bridgewater · Bridgewater Associates is a premier asset management firm, focused on delivering unique insight and partnership for the most sophisticated global institutional investors. · Our investment process is driven by a tireless pursuit to understand how the world's ...

  • Global Force USA

    Data Engineer

    1 week ago


    Global Force USA Norwalk, United States Full time

    Position title: Data engineer - Permanent · Location: 800 Connecticut Avenue, Norwalk, CT · Onsite: 3 days a week · Interview process: apex screening, then two video interviews to hire · Must have: 5 years experience or more in data bricks, building data bricks, and using Azure d ...


  • Crimson Enterprises Rocky Point, United States

    Are you ready to conquer the cloud world with us as a STACKITEER and shape the future of Europe? Then you've come to the right place at STACKIT. Our vision is ambitious: an independent Europe - digital, leading. As a cloud and colocation provider, we are building the secure infra ...

  • Provision People

    Senior Data Engineer

    2 weeks ago


    Provision People Norwalk, United States

    Summary: · Our award-winning client is seeking a Senior Data Engineer to join their team. In this critical role, you will play a pivotal role in shaping the future of our business intelligence capabilities. You will be responsible for the entire l ...


  • Provision People Norwalk, United States

    Summary: · Our award-winning client is seeking a Senior Data Engineer to join their team. In this critical role, you will play a pivotal role in shaping the future of our business intelligence capabilities. You will be responsible for the entire lifecycle of our data warehouse env ...


  • Money Fit by DRS Westport, United States

    Company Overview: · We are a technology-driven firm specializing in the development and implementation of quantitative trading strategies. Leveraging our proprietary platform, we provide data-driven insights to institutional investors and commercial hedgers across equities, futur ...


  • W3Global Inc Norwalk, United States

    About the job · We will consider candidates located near our Norwalk, CT or Itasca, IL offices who can work 5 days per workweek in office. · Position Overview · The Principal Data Engineer holds the primary responsibility for skillfully designing, deve ...


  • Mitsubishi HC Capital America Inc Norwalk, United States

    We will consider candidates located near our Norwalk, CT or Itasca, IL offices who can work 5 days per workweek in office. · Position Overview: · The Principal Data Engineer holds the primary responsibility for skillfully designing, developing, implement ...

  • Cannondale

    Data Engineer

    1 week ago


    Cannondale Wilton, United States

    For more than 50 years, Cannondale has been a leading innovator in the cycling world. As more riders of all ages and abilities get on the roads, trails, and streets than ever before, we're here to do the best work of our lives to push the greatest human-powered machine into the f ...

  • Ipsos

    Lead Data Engineer

    4 days ago


    Ipsos Norwalk, United States

    About the Team: · Ipsos Global Data Management Tower serves as the backbone of Ipsos' efforts to harness and integrate the power of data, globally. GDMT is tasked with the critical role of ensuring that data, one of the most valuable assets in the modern digital economy, is accu ...


  • Infinity Systems Mill Plain, United States

    The company ISR Information Products AG is currently offering this job in Berlin. If you are currently looking for a job in the IT field, you should check out the details on the company's website. There you will find detailed information about the position and the company. · What ...


  • Creative Financial Staffing Norwalk, United States

    Senior Data Engineer (On-Site) - Norwalk, CT · Why take the Senior Data Engineer position? · Opportunity to work in a stimulating, dynamic, and team-oriented environment · Chance to grow your skills and career in Data Engineering specifically in ETL, Data Lakes and Azure Synapse ...

  • Bridgewater Associates

    Data Associate

    3 weeks ago


    Bridgewater Associates Westport, United States

    **About Bridgewater**: · Bridgewater Associates is a premier asset management firm, focused on delivering unique insight and partnership for the most sophisticated global institutional investors. · Our investment process is driven by a tireless pursuit to understand how the world ...


  • LogicSource Westport, United States

    Overview: · **LogicSource Overview**: · LogicSource is an innovative leader in sourcing and procurement services, we help companies increase profits through better buying. LogicSource is owned and operated by an experienced group of global business veterans who came together unde ...


  • Eclipse Innovations Westport, United States

    Your tasks · Project management for procurement processes of welding, riveting, and other manufacturing equipment and fixtures · Industrialization of customer projects while adhering to cost, schedule, and quality goals · Regular maintenance of relevant project data to ensure glo ...

  • NovaTech Industries

    Civil Engineer

    2 days ago


    NovaTech Industries Westport, United States

    The Federal Highway Research Institute (BASt) is currently seeking a Civil Engineer (m/f/d) (Master/University Diploma) for the department "Fundamental Issues of Structural Maintenance" in the field of "Digital Technologies". This position is limited to 4 years and is part of the ...


  • Infinity Ventures Westport, United States

    We are agap2, a company for operational consulting in the fields of science and engineering. We are a strong team of engineers, pharmacists, and scientists. We help our renowned clients successfully carry out complex projects - from commissioning of plants to process optimization ...

  • Pryon

    Head of Operations

    1 week ago


    Pryon Westport, United States

    **In This Role, You Will**: · - Own and oversee all company day to day operations and act as a driving force for the company's expansion and growth · - Function as a strategic business partner to the President, identifying and assessing growth opportunities as well as near-term a ...


  • ASML Wilton, CT, United States Full time

    ASML is the world's leading provider of lithography systems for the semiconductor industry, manufacturing complex machines that are critical to the production of integrated circuits or microchips. This position is for a First Line Support Production Engineer to analyze production ...


  • ASML Wilton, CT, United States Full time

    ASML US brings together the most creative minds in science and technology to develop lithography machines that are key to producing faster, cheaper, more energy-efficient microchips. We design, develop, integrate, market and service these advanced machines, which enable our custo ...