
    A Day in the Life of a Data Scientist - California, United States - KDnuggets

    KDnuggets
    KDnuggets California, United States

    1 week ago



    Are you interested in what a data scientist does on a typical day of work? Each data science role may be different, but these five individuals provide insight to help those interested in figuring out what a day in the life of a data scientist actually looks like.

    A number of weeks ago I solicited feedback from my LinkedIn connections regarding what their typical day in the life of a data scientist consisted of.

    The response was genuinely overwhelming. Sure, no data scientist role is the same, and that's the reason for the inquiry.

    So many potential data scientists are interested in knowing what it is that those on the other side keep themselves busy with all day, and so I thought that having a few connections provide their insight might be a useful endeavor.

    What follows is some of the great feedback I received via email and LinkedIn messages from those who were interested in providing a few paragraphs on their daily professional tasks.

    The short daily summaries are presented in full and without edits, allowing the quotes to speak for themselves.

    Andriy Burkov is Global ML Team Leader at Gartner, located in Quebec City.

    My typical day starts at 9am with a 15-30 min long Webex meeting with my team: my team is distributed, half in India (Bangalore and Chennai) and half in Canada (Quebec City).

    We discuss the advancement of the projects and decide on how to overcome difficulties.
    Then I read my emails received during the night and react if necessary. After that I work on my current project, which currently is a salary extractor from job announcements.

    I need to create a separate pair of models for each country-language pair we support (around 30 country-language pairs).

    The process consists of dumping the job announcements for a certain country-language pair, clustering them, then getting the subset of training examples.
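
    The quote does not say how the training subset is drawn from the clusters. One plausible reading, sketched here purely as an assumption, is to take a few announcements from every cluster so that the examples sent for manual annotation cover the variety in the dump. The function name `sample_per_cluster` and its parameters are hypothetical and do not come from the author.

```python
import random
from collections import defaultdict


def sample_per_cluster(doc_ids, cluster_labels, per_cluster, seed=0):
    """Pick up to `per_cluster` documents from each cluster for annotation.

    Hypothetical helper: the cluster labels are assumed to come from any
    clustering step run beforehand on the dumped job announcements.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection repeatable

    # Group document ids by the cluster they were assigned to.
    by_cluster = defaultdict(list)
    for doc_id, label in zip(doc_ids, cluster_labels):
        by_cluster[label].append(doc_id)

    # Draw the same number of examples from every cluster (or all of them
    # when a cluster is smaller than the requested sample size).
    selected = []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        selected.extend(rng.sample(members, min(per_cluster, len(members))))
    return selected


if __name__ == "__main__":
    ids = list(range(10))
    labels = [0] * 6 + [1] * 3 + [2]
    print(sample_per_cluster(ids, labels, per_cluster=2))
```

    Sampling evenly across clusters keeps a rare announcement layout from being drowned out by the most common one, which is one reason to cluster before annotating.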

    Then I annotate these examples manually and build the model. I iterate build/test/add data/rebuild until the test accuracy is high enough (~98%).

    In the afternoon, I help my team members to improve their models by testing the current model on the real data, identifying the false positives/negatives and creating new training examples to fix the problem.

    The decision when to stop improving the model and deploy in production depends on the project.

    For some cases, especially user-facing ones, we want a very low level of false positives (less than 1%): users always see when the extraction of some element from their text is wrong, but they do not always notice a missing extraction.
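
    As a rough illustration of the error review described above, the sketch below lists which test examples are false positives and false negatives and checks the false-positive level against a threshold. It assumes that the "less than 1%" figure is the share of shown extractions that are wrong; the quote does not state the denominator. The names `find_errors`, `wrong_extraction_share` and `ready_to_deploy` are hypothetical.

```python
def find_errors(expected, predicted):
    """Return the indices of false positives and false negatives.

    `expected[i]` is True when example i really contains the element;
    `predicted[i]` is True when the model extracted something for it.
    """
    pairs = list(zip(expected, predicted))
    false_pos = [i for i, (e, p) in enumerate(pairs) if p and not e]
    false_neg = [i for i, (e, p) in enumerate(pairs) if e and not p]
    return false_pos, false_neg


def wrong_extraction_share(expected, predicted):
    """Share of shown extractions that are wrong (assumed definition)."""
    shown = sum(1 for p in predicted if p)
    false_pos, _ = find_errors(expected, predicted)
    return len(false_pos) / shown if shown else 0.0


def ready_to_deploy(expected, predicted, max_share=0.01):
    """True when wrong extractions stay below the chosen threshold."""
    return wrong_extraction_share(expected, predicted) < max_share


if __name__ == "__main__":
    expected = [True, True, False, False, True]
    predicted = [True, False, True, False, True]
    fp, fn = find_errors(expected, predicted)
    print("false positives:", fp)
    print("false negatives:", fn)
    print("deploy:", ready_to_deploy(expected, predicted))
```

    The indices returned by `find_errors` point at the examples that would be annotated and added to the training set before the next rebuild.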

    The day ends at around 17:30 with 30 minutes of catching up on tech news and blogging.

    Here's a little background on me and what a day in my life is like:

    I switched into data science and machine learning during an MD/PhD program after a joint humanities and sciences undergraduate degree, and my day-to-day projects are highly interdisciplinary as a rule.

    Some projects include simulating epidemic spread, leveraging industrial psychology to create better HR models, and dissecting data to obtain risk groups for low socioeconomic status students.

    The best part of my job is the variety of projects and a new challenge every day.

    A typical day for me starts around 8:00 am, when I catch up on my social media accounts related to machine learning and data science.

    I switch into work projects around 8:30 am and finish around 4:30 pm to 5:00 pm with a break for lunch.

    About 40% of my time is spent on research and development, with a strong focus on mathematics (topology, in particular), involving anything from developing and testing new algorithms to writing mathematical proofs to simplify data problems.

    Sometimes, the results are confidential and stay within the company (shared through monthly Lunch & Learn presentations within the company); other times, I'm allowed to publish or present at external conferences.

    Another 30% of my time is spent building relationships across departments at my company and seeking new projects, which often identify problems related to operating procedures, problems related to data capture, or connections between previous projects that provide a more comprehensive view of operations.

    This is probably one of the most crucial aspects of the job.

    People I meet often bring up problems they are seeing or mention how neat it would be to have a predictive model for sales/student outcomes/operations, and I've found it opens the door to conversations and best practice suggestions down the road.

    As a data scientist, it's important to communicate with a wide range of stakeholders, and it's helped me simplify my explanations of machine learning algorithms to a layman's level.

    The remaining 30% of my time is typically spent on data analysis and writing up results.

    This includes forecast models, predictive models of key metrics, and data mining for subgroups and trends within a given dataset.

    Each project is unique, and I try to let the project and its initial findings guide me to next steps.

    I mainly use R and Tableau for projects, though Python, Matlab, and SAS are occasionally helpful with specific packages or R&D requests.

    I can usually recycle the code, but each problem has its own assumptions and data limitations with respect to the mathematics.

    Projects can usually be simplified using tools from topology, real analysis, and graph theory, which speeds up the project and allows for the use of extant packages, rather than a need to code from scratch.

    As the only data scientist at a large company, this allows me to cover more projects and uncover more insight for our internal customers.

    When Matthew asked me to write a few paragraphs about my "typical" day as a data scientist, I started thinking about my routine and daily job, but then I stopped and realised: "I do not really have a routine", and this is the best thing about being a data scientist. Every day is different: a new challenge comes up and a new problem sits there waiting to be solved.

    I am not just talking about coding, math and statistics, but about the complexity of the business world:

    I often discuss with business people and clients to understand their real needs, I help the marketing with contents on our products, I participate in meetings about new ETL workflows and architecture design for a new product to be realised; I even found myself screening data scientist CVs.

    Being a data scientist means being flexible, open minded and ready to solve problems and embrace complexity, but do not get me wrong: I spend more than 80% of my time cleaning data. If you are just starting a career in Data Science, you have probably come across posts of the type "10 tips to master R and Python in Data Science" or "The best Deep learning library", so I won't give you any more technical suggestions. The only thing that I can say comes from the professional data science manifesto, and it is: "Data Science is about solving problems, not building models."

    This means that if you can solve a client need with just a SQL query, do it. Do not frustrate yourself over complex machine learning models: be simple, be helpful.

    Ajay Ohri is a Data Scientist at Kogentix Inc. in New Delhi. He has also written 2 books on R and one on Python.
    My typical day begins at 9 AM with a scrum call. Our methodology of project working is to divide tasks into two-week goals or sprints. This is basically the agile development method for software and it is different from CRISP-DM or KDD methodologies.
    A bit of context is necessary to explain why we do so. My current role is a data scientist in a team implementing Big Data Analytics in a southeast Asian Bank.

    We have data engineers, admin/infrastructure people, data scientists and of course customer engagement managers in the team catering to each specific need of the project.

    My current organization is an AI startup named Kogentix, not only having Big Data Services but also a Big Data Product named AMP, which acts like a GUI on PySpark and tries to automate Big Data. AMP is quite cool and I will come to it soon.

    This leads to the focus of my startup: to get as many clients as possible, as well as to test and implement our Big Data Product.

    This means demonstrating success in our client engagements: one of our clients was shortlisted for an award last month. Am I sounding too marketing oriented? You bet I am. The work a data scientist does is usually of strategic consequence to the client.

    What do I do on a daily basis? It could be many things - including not just emails and meetings.

    I could be using Hive to pull data, using it to merge data (or using Impala), I could be using PySpark (MLlib) to make churn models or do k-means clustering.

    I could be pulling data into an Excel file to make summaries and I could be making data visualizations. Some days I prototype in R using some machine learning packages.

    I also help with testing of AMP, our Big Data Analytics product and work with that team for feature enhancement of the product (if you forgive the pun- since the product is used for feature enhancement).

    When I code Big Data, I could be using the GUI for Hadoop HUE or I could be using command line programming including batch submitting of code.

    Prior to this, when I was working for India's 3rd biggest software company, Wipro, my role was quite the opposite. Our client was India's Ministry of Finance (the arm that deals with taxes). Junior data scientists pulled data using SQL from an RDBMS (due to legacy issues), and I validated the results.
    The reports were then sent to the various clients.

    On an ad-hoc basis we also used SAS Enterprise Miner as a concept test to show time series forecasts of imports and exports for India.

    Timelines are quite slow and bureaucratic when working for a federal government vis-a-vis working for the private sector.

    I remember one presentation when the bureaucrat in charge was astonished that we were executing machine learning and asked why the government had not used it earlier.

    But SAS/VA (for Dashboards), SAS Fraud Analytics (which I trained on and which was in the process of implementation) and Base SAS (the analytics workhorse) are amazing software, and I doubt that anything resembling SAS Domain Specific Bundles can be made soon.

    Prior to this for ten years I ran I blogged, sold ads (not very good), wrote 3 books in data science, scores of articles for Programmable Web, StatisticsViews and did some data consulting.

    I even wrote a few articles for KDnuggets. You can see my profile here

    Eric Weber is a Senior Data Scientist at LinkedIn, located in Sunnyvale, California.
    A day in the life at LinkedIn. Well, I think I can say there is no "typical" day. Keep that in mind as you read.
    First, a little bit about me and my major responsibilities. I'm fortunate to work on our LinkedIn Learning team, which is the newest data science group in the organization. Specifically, I support Enterprise level sales for LinkedIn Learning.

    What does that mean? Think about it like this: we use data, models and analytics to make decisions on how to sell effectively.

    Of course, the details on how we do that are internal but you can imagine that we want to answer questions like: which accounts do we try to sell into? We work to understand what makes certain accounts stand out from the rest.

    Second, a key aspect of every day is communication.

    I've written about this extensively on LinkedIn but I believe that effective communication with teammates and business partners is a defining characteristic of a great data scientist.

    On a typical day, this involves providing updates on key projects to immediate team members, managers and senior leaders, as appropriate.

    One thing I find fascinating about this aspect of the job is the need for brevity.

    A company like LinkedIn has tons of internal communication happening so everything that goes out must be distilled into clear and concise results/talking points.

    Finally, an important part of each day is failure. I'm a big believer that if you are not failing, you are not learning. This does not mean catastrophic failure of course.

    It means that each day I work on things that expand my understanding of analytics, data science and the organization itself.

    I learn from my mistakes and watch how others do things more efficiently or in different ways from me.

    When I wake up each day, I seek failure as part of the job because it makes me better the next day.

    Analytics and the rapid pace of expansion of data science sure provide plenty of these opportunities.

    Hopefully these accounts have provided you with some deeper insight into what data scientists do on a daily basis.

    I have received so much interest and so many responses from people that I will follow up this installment with others in the near future.

