-
About Periodic Labs · We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve problems without boundaries or bureaucracy. We eagerly learn ne ...
United States · 1 day ago
-
About BentoML · BentoML is a leading inference platform provider that helps AI teams run large language models and other generative AI workloads at scale. With support from investors such as DCM, enterprises around the world rely on us for consistent scalability and performance i ...
North America · 1 week ago
-
· Generative AI Inference Engineer · About the role: · We are seeking passionate Machine Learning Engineers to join our Inference team, focusing on the creative applications of generative AI models. The ideal candidate will have substantial experience developing and running inf ...
United States · 1 week ago
-
What You'll Do · Build low-latency inference pipelines for on-device deployment, enabling real-time next-token and diffusion-based control loops in robotics · Design and optimize distributed inference systems on GPU clusters, pushing throughput with large-batch serving and efficient ...
United States · 1 day ago
-
We are looking for an experienced Field-Applications Engineer to help deploy a new generation of code translation tools enabled by AI and modern verification techniques. · Deploy and manage containerized services using Docker. · Deploy and run Python based GenAI pipelines interac ...
United States · Full time · 1 month ago
-
· Elevating the quality of human life through every conversation · ML Engineer - Inference · Location: United States · Experience: 5 years · About the Team: · At , our team is dedicated to revolutionizing the field of Conversation AI. We are a collaborative and innovative group ...
United States · $110,000 - $190,000 (USD) per year · 1 week ago
- Work in company Remote job
Member of Technical Staff, Exceptional Generalist
Inferact's mission is to grow vLLM as the world's AI inference engine and accelerate AI progress by making inference cheaper and faster. · This is a globally remote opportunity for exceptional generalist engineers who can work across the entire vLLM stack: from low-level GPU ke ...
United States · 1 month ago
-
About the Role · We are looking for a Member of Technical Staff (MTS) to play a key technical leadership role in designing and advancing Wind River's next‑generation intelligent systems platform. This position is ideal for an engineer who thrives at the intersection of cloud‑nati ...
United States · Full time · 1 day ago
-
Description · Do you thrive on technical leadership and building cutting-edge AI systems? · Are you ready to drive innovation at the intersection of AI and edge computing? · Join the Akamai Inference Cloud Team · The Akamai Inference Cloud team is part of Akamai's Cloud Technolog ...
United States · 1 week ago
-
Nebius is leading a new era in cloud computing to serve the global AI economy. · Deep Technical Sourcing (LLM, Inference, Systems, GPU): Proactively identify and engage senior-level engineers and researchers across a wide range of AI/ML and systems domains. Use advanced sourcing ...
United States · 3 weeks ago
-
Build a Safer World. · TRM Labs provides blockchain analytics and AI solutions to help law enforcement and national security agencies, financial institutions, and cryptocurrency businesses detect, investigate, and disrupt crypto-related fraud and financial crime. TRM's blockchai ...
United States · $115,000 - $195,000 (USD) per year · Full time · 1 week ago
-
· Why work at Nebius · Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in- ...
United States · 1 week ago
-
NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a Senior Software Engineer focused on container and cloud infrastructure. You will help design and implement our core container strategy for NVIDIA Inference Microservices (NIMs) and our h ...
United States · $120,000 - $190,000 (USD) per year · 1 day ago
-
Description · Do you thrive on building the future of AI infrastructure? · Are you ready to lead a world-class team at the intersection of AI and edge computing? · Join the Akamai Inference Cloud Team · The Akamai Inference Cloud team is part of Akamai's Cloud Technology Group. W ...
United States · $190,000 - $280,000 (USD) per year · 1 day ago
-
About BentoML · BentoML is a leading inference platform provider that helps AI teams run large language models and other generative AI workloads at scale. With support from investors such as DCM, enterprises around the world rely on us for consistent scalability and performance i ...
North America · $115,000 - $210,000 (USD) per year · 1 week ago
-
· Are you a statistics expert eager to shape the future of AI? Large‑scale language models are evolving from clever chatbots into powerful engines of scientific discovery. With high‑quality training data, tomorrow's AI can democratize world‑class education, keep pace with cuttin ...
United States of America · 5 days ago
-
Description · Do you thrive on solving complex technical challenges in AI infrastructure? · Are you ready to architect the future of AI at the edge? · Join the Akamai Inference Cloud Team · The Akamai Inference Cloud team is part of Akamai's Cloud Technology Group. We design, imp ...
United States · $190,000 - $320,000 (USD) per year · 1 week ago
-
We are seeking experienced AI Developers to help us shape the future of OCI Networking with AI. · This position offers an opportunity to work on cutting-edge AI applications and includes a collaborative work environment, competitive benefits, and the chance to contribute to tran ...
United States · 1 month ago
-
We're looking for a scrappy, resourceful Developer Advocate to help grow Token Factory, Nebius' high-performance inference platform built for teams running real production AI workloads at scale. · Help developers know, adopt and use Token Factory for inference use cases. · Build ...
United States · 3 weeks ago
-
· Why work at Nebius · Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in- ...
United States · $90,000 - $170,000 (USD) per year · 1 week ago
-
· We're looking for an AI Engineer to design, implement, and optimize advanced AI systems that balance quality, performance, and cost. You'll work on inference pipelines, retrieval-augmented generation (RAG), and multi-agent patterns while building evaluation harnesses and simul ...
United States · $90,000 - $170,000 (USD) per year · 1 week ago
Staff Software Engineer, Inference - United States - GenesisAI
Description
What You'll Do
Build low-latency inference pipelines for on-device deployment, enabling real-time next-token and diffusion-based control loops in robotics
Design and optimize distributed inference systems on GPU clusters, pushing throughput with large-batch serving and efficient resource utilization
Implement efficient low-level code (CUDA, Triton, custom kernels) and integrate it seamlessly into high-level frameworks
Optimize workloads for both throughput (batching, scheduling, quantization) and latency (caching, memory management, graph compilation)
Develop monitoring and debugging tools to guarantee reliability, determinism, and rapid diagnosis of regressions across both stacks
Deep experience in distributed systems, ML infrastructure, or high-performance serving (8+ years)
Production-grade expertise in Python, with strong background in systems languages (C++/Rust/Go)
Low-level performance mastery: CUDA, Triton, kernel optimization, quantization, memory and compute scheduling
Proven track record scaling inference workloads in both throughput-oriented cluster environments and latency-critical on-device deployments
System-level mindset with a history of tuning hardware-software interactions for maximum efficiency, throughput, and responsiveness
-
LLM Inference Engineer
United States
-
Inference Optimization Engineer
North America
-
Generative AI Inference Engineer
United States
-
Staff Software Engineer, Inference
United States
-
Senior Engineer
Full time · United States
-
ML Engineer
United States
-
Member of Technical Staff, Exceptional Generalist
United States
-
Member of Technical Staff
Full time · United States
-
Senior II Software Engineer Lead
United States
-
Lead Tech Recruiter
United States
-
Machine Learning Infrastructure Engineer
Full time · United States
-
Lead Tech Recruiter
United States
-
Senior Software Engineer
United States
-
Senior Engineering Manager
United States
-
Forward Deployed Engineer
North America
-
Statistics Specialist
United States of America
-
Principal Software Engineer
United States
-
AI Principal Software Developer
United States
-
Developer Advocate
United States
-
Developer Advocate
United States
-
AI Engineer
United States