Engineering Manager - Cambridge, United States - Broad Institute

    Job Description

    At the Broad Institute of MIT & Harvard broadly and within the Neale Lab specifically, we leverage statistical and software techniques to understand the mechanisms of disease from extremely large datasets generated by scalable sequencing technologies.

    The lab and Institute are entering an age of one million sequences, millions of transcriptomes, tens of thousands of medical images, and complete medical records.

    The development of scalable scientific assays has transformed biological engineering problems into software engineering ones. We seek a software engineering manager to lead the team in solving those problems.

    We are seeking an Engineering Manager to join a team that develops, maintains, and operates Hail, a suite of libraries, data systems, and services for analyzing the world's largest genome sequencing datasets.

    Hail supports scientists at every stage of analysis: starting from individual sequences, through the production of a sequencing matrix, the calculation of per-row and per-column statistics, distributed matrix multiplications to search for genetic relatedness, preparation of thousands of phenotypes per sequence, regression to search for genetic associations with phenotypes, and subsetting and export for distribution to collaborators. Hail also serves as a data store for web-based data browsers and rare disease diagnostic support systems.

    The team faces three major challenges in the coming years. First, the largest sequencing callset has doubled every year since 2003 and the next doubling is anticipated in 2025. Second, the phenotypes have grown from binary disease status tables to medical records, medical images, and cellular assays. Third, the project must adapt to the changing hardware landscape, new scientific-analytical techniques, and new analytical databases.

    Hail's two core products are Query and Batch, both of which are open-source and openly developed. Query is a partitioned, horizontally scalable, spot-tolerant data frame system exposing a Python API. Batch is a cost-metered, multi-tenant, spot-tolerant, elastic, horizontally scalable compute engine.

    The team operates an installation of Batch as a Software-as-a-Service for a community of hundreds of scientists within the Broad Institute.

    Query is largely implemented in Scala, but makes extensive use of native memory. Query relies on many technologies including OW2 ASM, Apache Spark, Google and Azure cloud storage, Zstandard, BLAS, and LAPACK.

    Query includes a SQL-like intermediate representation of data frame operations, a query planner, a Python-like intermediate representation of expressions, a compiler targeting JVM bytecode, custom native memory representations, custom partitioned binary file formats, a library of approximate, statistical, and linear algebraic methods, and a distributed sorting algorithm.

    Batch is implemented in Python and deployed on Kubernetes.

    Batch relies on many technologies including OCI container images, crun, Google and Azure cloud storage, Google and Azure VM APIs, Google and Azure container registry APIs, Grafana, Prometheus, OAuth2, MySQL, Envoy, and asyncio.

    About the Role

    You create an environment of empathy, mentorship, accountability, and excellence that empowers your engineers to thrive. You liaise with scientific, institutional, and external partners in order to understand and anticipate their analytical needs.

    Working with scientists and engineers, you co-design the product roadmap, creating the conditions for both quotidian and groundbreaking scientific projects.

    Working with engineers, you develop technical roadmaps and processes to manage code quality.


    Specifically, you will:
    Manage, mentor, coach, and lead a team of six engineers.
    Partner with scientists to ensure the ongoing success of scientific projects.
    Work across the institution to anticipate and address issues impeding science.
    Develop the technical expertise and career of your engineers.
    With scientists and engineers, plan a product roadmap.
    With engineers, develop a technical roadmap.
    Develop a hiring plan and mix of roles to meet the future needs of the team.

    Requirements

    2+ years of experience managing engineering teams that develop, maintain, and operate distributed systems, databases, or data systems.
    Substantial practical engineering experience with one of distributed systems, databases, or data infrastructure.
    Strong written and verbal communication skills.
    Ability to strike the right balance of processes to keep an engineering team effective.
    Experience working cross-functionally and collaboratively.
    B.S. or B.A. in Computer Science, or equivalent experience.
    The Broad Institute will not offer visa sponsorship for this opportunity.

    All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
