Lead Machine Learning Engineer - New York, United States - Acceler8 Talent

    Description
    Lead Machine Learning Engineer (Kubernetes, GPUs, LLMs)

    Dive into the future of artificial intelligence with a groundbreaking role that's redefining the interaction between humans and technology.

    As a Lead ML Engineer, you're not just taking a job; you're embarking on a quest to unlock new collaborative capabilities that elevate human potential to unprecedented heights.

    About Us


    We are a well-funded Google Brain spinout ($65M Series A) that is enhancing human efficiency through human feedback and LLMs, enabling users and organizations to increase their impact on society.

    The Role

    As a Lead Machine Learning Engineer, you'll be working at the intersection of research and development, turning the theoretical into the tangible.

    You'll spearhead projects that fuse the intricacies of large language models (LLMs) with the fluidity of human interaction.

    Imagine designing the frameworks that allow for the orchestration of computing power on a colossal scale, crafting the very foundation upon which the future of AI interactions is built.


    Key Responsibilities:
    Develop cutting-edge machine learning systems and infrastructure to power the next generation of AI applications.

    Innovate in the realm of distributed systems and HPC clusters, optimizing our large language models to be both mighty and efficient.

    Delve deep into the computational engine, enhancing our training and serving platforms through novel techniques and custom kernels.
    Champion new methods of parallelism to facilitate large-scale, rapid distributed training for our AI models.


    Key Qualifications:
    Proven expertise in nurturing large language models to maturity using state-of-the-art frameworks and deployment strategies.
    A knack for tuning performance to perfection, with experience in MLPerf or comparable benchmarks.
    A good understanding of GPUs and other hardware accelerators, and of how to get the best user value per FLOP from the hardware.
    Adept in Kubernetes and containerization, with the ability to craft cloud services that scale seamlessly.


    Keywords:
    Large Language Models, Distributed Systems, HPC Clusters, Machine Learning, Artificial Intelligence, Parallelism, Performance Tuning, Kernel Languages, AI Accelerators, Cloud Services, Kubernetes, Containerization