- Curate high-quality datasets and synthesize training data where needed to improve model capabilities.
- Champion modelling, EDA, transformation, modernization, and curation of high-quality training data for GPT-4 and GPT-4 Vision
- Provide data curation leadership on tabular and unstructured (images, video, log files, etc.) data
- Create data definitions and data lineage to support effective, high-accuracy LLM training
- Help build and test prompts that yield high-quality insights
- Train and fine-tune language models using frameworks like PyTorch and TensorFlow
- Rigorously test models to evaluate accuracy, bias, toxicity, and other attributes using statistical analysis
- Monitor metrics and logs from LLMs in deployment to proactively identify any degraded performance or anomalies.
- Diagnose root causes when models err or behave unexpectedly using techniques like saliency maps, heatmap visualizations and interactive debugging.
- Improve model robustness by analyzing model behavior and identifying failure modes. Recommend data augmentation, training modifications, etc.
- Perform model surgery by carefully editing model weights and architectures to fix incorrect or unsafe behavior while maintaining performance.
- Run A/B experiments to measure the impact of model tweaks and fixes on performance, accuracy, toxicity, bias, etc.
- Continuously inspect models for signs of concept drift or staleness and recommend retraining cadence.
- Document LLM version changes, experiments, and incident response postmortems.
- Stay updated on the latest techniques from research and industry conferences for responsible and reliable deployment of LLMs.
- 8+ years of experience training, deploying, and monitoring natural language models
- Strong statistics skills and large-scale data manipulation capabilities
- Proficiency in Azure Machine Learning Studio and in deploying models in Azure cloud environments
- Deep knowledge of Azure SQL and vector databases
- Proficiency in Python, PyTorch, TensorFlow, NLP libraries and other ML tools
- Knowledge of responsible AI principles around transparency, fairness and accountability
- Knowledge of the AutoGen, LangChain, and LlamaIndex frameworks
Data Scientist - Santa Clara, United States - Net2Source Inc.
Description
Data Scientist
Location - Santa Clara, CA (Hybrid)
Contract / Full Time
This is a data scientist role focused on transforming large (hundreds of terabytes), complex, multi-dimensional data, including tabular (relational) data and unstructured data such as images, videos, audio files, and various other file formats. The key responsibility is curating high-quality training data for large language model training.
Responsibilities:
Requirements: