- Build, scale, and secure the HPC clusters within Meta research labs, a heterogeneous environment containing diverse operating systems and applications
- Provide on-call support and lead incident root cause analysis through multiple infrastructure layers (compute, storage, network) for HPC clusters and act as a final escalation point
- Collaborate in a diverse team environment across multiple scientific and engineering disciplines, making the architectural tradeoffs required to rapidly deliver software and infrastructure solutions
- Find ways to leverage the scale and complexity of the larger Meta production infrastructure to solve problems for Reality Labs researchers
- Provide guidance to other engineers on best practices to build mature services which are highly available, reliable, secure, and scalable
- Ability to work independently, handle large projects simultaneously, and prioritize team roadmap and deliverables by balancing required effort with resulting impact
- Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
- Experience in automating the management of infrastructure and services
- 3+ years of experience in distributed system performance measurement, logging, and optimization
- 3+ years of experience coding in at least one of the following languages: C++, Python, Rust, or Go
- Thorough understanding of Linux operating system internals, including the networking subsystem
- Experience with Python library management systems such as Conda or Python venv
- Experience in writing system level infrastructure, libraries, and applications
- Experience with software development practices such as source control, code reviews, unit testing, debugging and profiling
- Proven track record of shipping software
- Experience in developing performant software and systems
- Experience with managing HPC schedulers such as Slurm, Kubernetes, or LSF
- Prior experience in building out HPC clusters, handling compute, storage, network, operating systems, schedulers, and stakeholder discussions
- Prior experience in cluster on-call operations, including troubleshooting server/scheduler/storage errors, maintaining compute/storage environments/libraries/tools, helping onboard users to the cluster, and answering general questions from users
- Prior experience in cluster coordination and strategy planning, including collecting/understanding needs of users, developing tools to improve user experience, providing guidance on best practices, coordinating distribution of compute/storage resources, forecasting compute/storage needs, and developing long-term user experience/compute/storage strategies
- Prior experience building tooling for monitoring and telemetry
- Prior experience supporting configuration management in a multi-region environment
- Prior experience optimizing multi-tenant HPC clusters for performance and maintenance
- Prior experience with containerization or virtualization technologies such as Docker or virtual machines
- Prior experience building services
- Prior experience building PaaS or internal clouds
- Prior experience in developing/managing distributed network file systems
- Prior academic or development experience with machine learning and/or deep learning
- Prior experience in ML libraries such as PyTorch, TensorFlow or cuDNN
- Prior experience in GPGPU development with CUDA, OpenCL or DirectCompute
- Prior experience in network security
- Experience in database and data management systems at scale
- Familiarity with Linux observability tools, such as eBPF
Software Engineer - Pittsburgh, United States - META
Description
Reality Labs Research (RL-R) brings together a diverse and highly interdisciplinary team of researchers and engineers to create the future of augmented and virtual reality. On the Codec Avatar ML Compute team, you'll build tools, libraries, and frameworks that help researchers collaborate with each other and advance their research toward the generation of Codec Avatars. Our team cultivates an honest and considerate environment where self-motivated individuals thrive. We encourage a strong sense of ownership and embrace the ambiguity that comes with working on the frontiers of research. In this software engineer role on the Codec Avatar ML Compute team, you will serve as the point of contact for Meta's research GPU super clusters, managing and optimizing compute resources to enable groundbreaking research in relightable avatars, full-body avatars, and generative AI for Codec Avatars.
Software Engineer - Codec Avatar ML Compute Team Responsibilities