- Partner with customers to architect and build production AI inference pipelines on the Gimlet platform, optimizing for latency, throughput, and cost.
- Implement and optimize model deployments including LLMs, diffusion models, and custom architectures using techniques like quantization, batching, and caching.
- Debug complex performance issues across the full stack, from model architecture to GPU kernels to networking.
- Build reference implementations, technical content, and tooling that showcase Gimlet's capabilities, and design and run sophisticated demos and POCs.
- Create evaluation and benchmark harnesses and regression checks that preserve model quality as performance improves.
- Deliver actionable, high-impact feedback to internal teams to drive platform improvements aligned with customer needs.
- Build and maintain trusted relationships with customer leaders and stakeholders to ensure successful deployment and scaling.
- Hands-on experience with production ML model deployment, inference optimization, or ML infrastructure.
- Familiarity with the AI/ML stack: PyTorch, transformers, LLM serving frameworks (vLLM, TensorRT-LLM, TGI), or similar.
- Experience with infrastructure services (e.g., Kubernetes, SLURM), infrastructure-as-code tools (e.g., Ansible), container platforms (e.g., Docker), scripting/programming languages (e.g., Python), and observability, tracing, and logging tools.
- Strong understanding of GPU computing, model optimization techniques (quantization, batching, KV caching), and distributed systems fundamentals.
- Ability to debug complex technical issues across hardware and software layers.
- Strong written and verbal communication skills; you can explain complex technical concepts clearly.
- Comfort with ambiguity and a bias toward action in a fast-paced startup environment.
- Contributions to open-source ML projects or frameworks.
- Experience with AI accelerators, custom hardware, or datacenter infrastructure.
- Background in performance engineering, profiling, or low-level optimization.
Founding Forward Deployed Engineer - San Francisco - Gimlet Labs
4 hours ago
Description
Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, efficiency, and model quality.
Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads from frameworks like PyTorch and LangChain at production scale in seconds. The founding team has deep experience across AI, distributed systems, and hardware with previous successful exits.
We are seeking our very first Forward Deployed Engineer to work hands-on with our customers, solving complex AI inference challenges and building production-grade ML solutions on the Gimlet platform. This is a deeply technical role where you'll partner directly with ML engineers at cutting-edge AI companies to optimize model performance, architect inference pipelines, and push the boundaries of what's possible with our technology.
This role is ideal for engineers who thrive at the intersection of systems engineering and applied ML, who want direct exposure to how companies are actually deploying AI in production, and who are energized by solving hard technical problems with immediate customer impact.
Responsibilities
Required