- Architect and build a large-scale distributed web crawler system.
- Design and implement web crawlers and scrapers to automatically extract data from websites, handling challenges like dynamic content and scaling to large data volumes.
- Develop data acquisition pipelines to ingest, transform, and store large volumes of data.
- Develop a highly scalable system and optimize crawler performance.
- Monitor and troubleshoot crawler activities to detect and resolve issues promptly.
- Work closely with the data infrastructure team and data researchers to improve the quality of the data.
- Previous large-scale web crawling experience is a must for this role.
- Minimum of 5 years of experience in data-intensive applications and distributed systems.
- Proficiency in high-performance programming languages such as Go, Rust, or C++.
- Strong understanding of containerization and orchestration frameworks such as Docker and Kubernetes.
- Experience building on GCP or AWS services.
- Bonus: You have deep expertise working with headless browsers and Chrome DevTools Protocol.
- Bonus: You are curious to learn and develop an understanding of how data sources and data quality affect LLM capabilities.
Member of Technical Staff: Data Acquisition - San Francisco, United States - Essential AI
Job Description
Essential AI's mission is to deepen the partnership between humans and computers, unlocking collaborative capabilities that far exceed what could be achieved today. We believe that building delightful end-user experiences requires innovating across the stack - from the UX all the way down to models that achieve the best user value per FLOP.
We believe that a small, focused team of motivated individuals can create outsized breakthroughs. We are building a world-class multi-disciplinary team of people who are excited to solve hard real-world AI problems. We are well-capitalized and supported by March Capital and Thrive Capital, with participation from AMD, Franklin Venture Partners, Google, KB Investment, and NVIDIA.
The Role
The Data Acquisition (Crawler) Engineer will be responsible for developing and maintaining the systems that allow for the smooth and efficient collection, storage, and processing of data from various sources. Your primary responsibility will be to design, develop, and maintain web crawlers and data acquisition systems in an efficient and reliable manner to support our model training.
What you'll be working on
We encourage you to apply for this position even if you don't check all of the above requirements but want to spend time pushing on these techniques.
We are based in-person in SF. We offer relocation assistance to new employees.