Associate Director of Translational and Quantitative Sciences Data Engineering - Princeton, United States - Akkodis

    Description
    Akkodis is seeking an Associate Director of Translational and Quantitative Sciences Data Engineering for a full-time position with a client located in Princeton, NJ.

    Salary Range:
    $220,000 to $270,000/year. The salary may be negotiable based on experience, education, geographic location, and other factors.

    Title: Associate Director of Translational and Quantitative Sciences Data Engineering
    Location: Princeton, NJ (hybrid model requiring onsite presence 60% of the time in Princeton, NJ)

    Job Description:

    The successful candidate will contribute to the mission of the global data engineering function and be responsible for many aspects of data, including the creation of data-as-a-product, architecture, access, classification, standards, integration, and pipelines.

    Although your role will involve a diverse set of data-related responsibilities, your key focus will be on the enablement of data for the Translational and Quantitative Sciences functions, including Data Science, Translational Medicine, Precision Medicine, and Translational Research.

    You will balance subject matter expertise in life science data, terminology, and processes with the technical expertise needed for hands-on implementation.

    You will be expected to lead a data engineering group that creates workflows to standardize and automate data, connect systems, enable data tracking, and implement triggers and data cataloging.

    With your experience in the research domain, you will bring knowledge of diverse assay types such as IHC, flow cytometry, cytokine data, and genomics, as well as systems such as LIMS and ELN.

    Your ultimate goal will be to place data at the fingertips of stakeholders and enable science to go faster. You will join an enthusiastic, agile, fast-paced and explorative global data engineering team.

    Responsibilities:
    - Design, implement, and manage ETL data pipelines that ingest vast amounts of scientific data from public, internal, and partner sources into various repositories on a cloud platform (AWS)
    - Lead and coordinate a team of data engineers to meet stakeholder expectations while maintaining conformity with enterprise data management practices
    - Enhance end-to-end workflows with automation that rapidly accelerates data flow, using pipeline management tools such as Step Functions, Airflow, or Databricks Workflows (a minimal illustration follows the Requirements list below)
    - Implement and maintain bespoke databases for scientific data (RWE, in-house labs, CRO data) and for consumption by analysis applications and AI products
    - Innovate and advise on the latest technologies and standard methodologies in Data Engineering and Data Management, including recent advancements in GenAI, and identify solutions that can address hurdles in data enablement, such as AI-powered data harmonization
    - Manage relationships and project coordination with external parties such as Contract Research Organizations (CROs) and vendor consultants/contractors
    - Define and contribute to data engineering practices for the group, establishing shareable templates and frameworks, determining the best usage of specific cloud services and tools, and working with vendors to provision cutting-edge tools and technologies
    - Collaborate with stakeholders to determine the best-suited data enablement methods to optimize interpretation of the data, including creating presentations and leading tutorials on data usage as appropriate
    - Apply value-balanced approaches to the development of the data ecosystem and pipeline initiatives
    - Proactively communicate data ecosystem and pipeline value propositions to partnering collaborators, specifically around data strategy and management practices

    Requirements:
    - BS/MS in Computer Science, Bioinformatics, or a related field with 12+ years of software engineering experience, or a PhD in Computer Science, Bioinformatics, or a related field with 8+ years of software engineering experience
    - Excellent skills and deep knowledge of ETL pipeline, automation, and workflow management tools such as Airflow, AWS Glue, Amazon Kinesis, AWS Step Functions, and CI/CD is a must
    - Excellent skills and deep knowledge in Python, Pythonic design, and object-oriented programming is a must, including common Python libraries such as pandas; experience with R is a plus
    - Solid understanding of modern data architectures and their implementation offerings, including Databricks Delta Tables, Athena, Glue, and Iceberg, and their applications to Lakehouse and medallion architectures
    - Solid understanding of data paradigms such as data fabrics and data meshes, including the role of domain-specific data product ownership
    - Experience working with clinical data and understanding of GxP compliance and validation processes
    - Understanding of data quality tools and data governance tools
    - Proficiency with modern software development methodologies such as Agile, source control, project management, and issue tracking with JIRA
    - Proficiency with container strategies using Docker, Fargate, and ECR
    - Proficiency with AWS cloud computing services such as Lambda functions, ECS, Batch, and Elastic Load Balancer, and other compute frameworks such as Spark, EMR, and Databricks
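    As a purely illustrative sketch of the workflow orchestration referenced in the Responsibilities above, and not part of the client's actual stack, the following minimal Airflow DAG (assuming Airflow 2.4+) shows an ingest -> standardize -> catalog flow; the DAG id, task names, and helper functions are hypothetical.

```python
# Illustrative sketch only: a minimal Airflow 2.x DAG for an
# ingest -> standardize -> catalog pipeline of the kind described above.
# The DAG id, task names, and helper functions are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_assay_data():
    # Placeholder: pull raw assay exports (e.g., flow cytometry or cytokine
    # files) from a partner/CRO source into cloud storage.
    print("ingesting raw assay files (hypothetical step)")


def standardize_assay_data():
    # Placeholder: harmonize column names, units, and terminology so downstream
    # analysis applications see a consistent schema.
    print("standardizing assay data (hypothetical step)")


def register_in_catalog():
    # Placeholder: register the curated dataset and its metadata in a data
    # catalog so stakeholders can discover and track it.
    print("cataloging curated dataset (hypothetical step)")


with DAG(
    dag_id="assay_ingest_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest", python_callable=ingest_assay_data)
    standardize = PythonOperator(task_id="standardize", python_callable=standardize_assay_data)
    catalog = PythonOperator(task_id="catalog", python_callable=register_in_catalog)

    ingest >> standardize >> catalog
```

    An equivalent flow could be expressed with AWS Step Functions or Databricks Workflows; Airflow is used here only because it is one of the tools named in the Requirements.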


    Benefits include but are not limited to:
    - Medical/Dental/Vision
    - 401K
    - PTO/Paid Holidays