L4 / IC3 · 3–5 years

Data Scientist interview prep — what to expect

5 rounds · 4–6 weeks · 9 sample questions · $150–180k base

Data Scientist interviews at most tech companies follow a fairly predictable arc: a recruiter screen, one or two technical coding rounds, an A/B test or product-sense case, a stats / modelling round, and a behavioural with the hiring manager.

What changes most between companies are the technical screens. Product-DS-heavy loops at Airbnb, Spotify, or Meta lean on SQL plus a bit of Python. Modelling-heavy or AI-company loops (Anthropic, Scale, applied teams at OpenAI) lean on Python and pandas, often with an ML implementation question. Google DS sometimes runs an algorithmic coding round at the SWE bar. Worth confirming the round breakdown with your recruiter before you assume it's all SQL.

The bar is technical fluency plus product judgment: can you turn an ambiguous business question into something measurable, and can you tell the story of a result to someone non-technical? L4 / IC3 candidates are graded on owning a single analysis end-to-end with one or two stakeholders.

Personalised version

This guide covers general expectations for Data Scientist interviews. For a free report tailored to your specific job description — with predicted questions, comp benchmark, and experience-gap analysis — paste the JD into the free scan.

Run a free scan on your JD →

What you'll be expected to do

Typical interview process

Most companies follow a similar shape for Data Scientist interviews. Total calendar time: 4–6 weeks from recruiter screen to offer.

01
Recruiter screen
30-min phone call
Background, role calibration, motivation, comp expectations
02
Technical coding screen
60–90 min
Usually combines SQL (joins, window functions, CTEs) with Python (pandas / numpy data manipulation). Some companies split into two rounds (Google, Meta DS); AI labs sometimes drop SQL entirely and add ML implementation. Confirm the split with your recruiter
03
A/B test / product-sense case
60 min
Design an experiment for a given product change, or diagnose a metric drop. Expect probing on metric choice, sample sizing, guardrails, and how you'd communicate the result
04
Stats / modelling case
45–60 min
Hypothesis testing, model selection reasoning, basic ML concepts (bias-variance, overfitting), how you'd model a specific business problem end-to-end
05
Behavioural / hiring manager
45 min
Past projects, cross-functional collaboration, handling ambiguity, communicating to non-technical stakeholders

Sample questions you should be ready for

Representative of what companies ask at this level — not a complete list. For predicted questions tied to a specific job posting, run the free scan above.

Technical / coding
  • Given a `sessions` table with `user_id` and `event_timestamp`, write a query that returns the 7-day retention curve from signup.
  • Walk me through how you'd detect outliers in a metric you're tracking. What rules of thumb do you use, and how do you decide whether to remove them?
  • You ran a t-test and got p = 0.04. What does that actually mean, and what would you check before declaring the test a win?
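The retention-curve question above can be sketched in pandas on a toy `sessions` frame. The real screen usually wants SQL (the same logic via a self-join or window functions), but the shape of the computation is identical. The schema follows the question; the signup date is assumed to be each user's first event.

```python
import pandas as pd

# Toy sessions data, schema assumed from the question: one row per event,
# with the user's signup taken as their first event_timestamp.
sessions = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 3],
    "event_timestamp": pd.to_datetime([
        "2024-01-01", "2024-01-02", "2024-01-05",
        "2024-01-01", "2024-01-03",
        "2024-01-02",
    ]),
})

# Days since each user's signup (their first event)
signup = sessions.groupby("user_id")["event_timestamp"].transform("min")
sessions["day"] = (sessions["event_timestamp"] - signup).dt.days

# Retention curve: share of all users active on each of days 0-7
n_users = sessions["user_id"].nunique()
curve = (
    sessions[sessions["day"] <= 7]
    .groupby("day")["user_id"].nunique()
    .reindex(range(8), fill_value=0)
    .div(n_users)
)
print(curve)
```

Day 0 is 100% by construction; the interviewer will often probe whether you mean unbounded retention (active on or after day N) versus the day-N snapshot shown here.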
Product sense
  • Our DAU dropped 8% week-over-week. Walk me through how you'd diagnose it.
  • Design an A/B test to evaluate a new onboarding flow. Pick the metric, name the unit of randomisation, and tell me how long you'd run it.
  • We're considering launching a referral programme. What metrics would you instrument before launch, and how would you know if it's working in week one?
Behavioural (STAR method)
  • Tell me about an analysis you ran where the result surprised you. What did you do with it?
  • Describe a time a stakeholder disagreed with your conclusion. How did you handle it?
  • Walk me through a project where the data was messy or incomplete. How did you decide what to trust?

Compensation benchmark

Median compensation for Data Scientist at major US tech companies, headline numbers in USD. London / Berlin / Singapore typically pay 30–50% less in base terms; equity ratios vary by company stage.

Base salary: $150–180k (SF/NYC)
Equity (annual vest): $60–120k/yr
Bonus: 10–15%

FAANG L4 Data Scientist total comp at 50th percentile is $240–310k. Meta E4 DS and Google L4 DS land at the top of this band; Stripe / Airbnb / Spotify a step below. London DS base ~£80–105k. AI-first companies (Anthropic, OpenAI, Scale) often pay 20–40% above this with heavier equity weighting.

How to prep — six tactical tips

Lead behavioural answers with the STAR method — Situation, Task, Action, Result. The tactical tips below build on that structure for this specific role.

  1. Drill SQL cold — joins, window functions, CTEs, date arithmetic. 30–50 problems is enough if you've used SQL recently; more if you haven't. SQL screens are the most common rejection point at this level
  2. Brush up on pandas / numpy data manipulation in Python. Many loops have a Python or combined SQL+Python round, and candidates who only prep SQL get caught out. Practise group-by, merge, pivot, basic plotting
  3. Practise the A/B test framework cold: metric → unit of randomisation → sample size → duration → guardrails → readout. Have a template you can apply to any prompt
  4. Have 5–6 STAR stories ready, each with quantified impact: lift in conversion, model AUC, dollar value of the decision driven
  5. Read 'Trustworthy Online Controlled Experiments' (Kohavi) — the canonical reference for the A/B test round
  6. Be ready for one open-ended product question per loop ("DAU is dropping, diagnose it") — practise the segment-first, hypothesis-second structure
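The segment-first structure in the last tip can be sketched in pandas: before hypothesising about a DAU drop, decompose the week-over-week change by segment and see where it concentrates. The `platform` breakdown and toy counts below are illustrative, not from any real dataset.

```python
import pandas as pd

# Hypothetical daily-active rows: one row per user-day, with a segment column.
dau = pd.DataFrame({
    "week": ["prev"] * 4 + ["curr"] * 3,
    "platform": ["ios", "ios", "android", "web", "ios", "ios", "web"],
})

# Segment-first: break the WoW change down by platform before theorising.
by_seg = dau.groupby(["week", "platform"]).size().unstack("week").fillna(0)
by_seg["delta"] = by_seg["curr"] - by_seg["prev"]
print(by_seg.sort_values("delta"))
```

In this toy data the entire drop sits in the android segment, which immediately narrows the hypothesis space (release bug, store ranking change) before you reach for broader theories.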

Where Data Scientist candidates fail

A few common mistakes that get Data Scientist candidates rejected even when they're otherwise strong. Worth spotting in a mock interview before they show up in a real one.

01

Hearing a question and immediately reaching for a model ("I'd train a random forest") without first looking at what the data actually looks like.

Why it fails

At L4 interviewers want to see you check the data before picking a method: distributions, missing values, join keys, time ranges. Jumping straight to a model signals "I know the tools but not the practice." The candidates who pass spend the first few minutes on data quality questions, not algorithm choice.

Fix

Open every modelling answer by naming three things you'd check in the data first. Something like "I'd start by checking the join logic between user and session tables, looking at the distribution of the target variable, and confirming there's no time-period bias." Then pick a method.
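A minimal pandas version of those three checks, on hypothetical `users` and `sessions` frames (the names, columns, and churn target are illustrative assumptions):

```python
import pandas as pd

# Hypothetical frames for the three checks named above:
# join coverage, target distribution, and time-period coverage.
users = pd.DataFrame({"user_id": [1, 2, 3, 4], "churned": [0, 0, 1, 0]})
sessions = pd.DataFrame({
    "user_id": [1, 1, 2, 9],  # user 9 has no users row: a join-key problem
    "ts": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-01-15", "2024-01-20"]),
})

# 1. Join coverage: what share of session rows fail to match a user?
merged = sessions.merge(users, on="user_id", how="left", indicator=True)
unmatched = (merged["_merge"] == "left_only").mean()

# 2. Target distribution: the base rate changes the modelling choice
churn_rate = users["churned"].mean()

# 3. Time range: does the data cover the period the question assumes?
span = sessions["ts"].min(), sessions["ts"].max()

print(f"unmatched sessions: {unmatched:.0%}, churn rate: {churn_rate:.0%}, span: {span}")
```

Three numbers, thirty seconds of talking, and the interviewer has already heard you do the practice before the tools.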

02

Designing an A/B test without naming the metric, sample size, or how you'd handle guardrails.

Why it fails

A/B test rounds at L4 grade on whether you've actually shipped experiments at scale. Vague answers ("we'd run an A/B test and check the results") miss the rigour that distinguishes a real DS from a product analyst. The interviewer is waiting for primary metric + 1–2 guardrails + sample-size estimate + duration.

Fix

Structure A/B test answers as a checklist: primary metric and why, unit of randomisation (user vs session vs request), 1–2 guardrails to prevent gaming, rough sample-size estimate for the MDE you care about, and how long you'd let it run. Even rough numbers ("~50k users per arm, two weeks") land much harder than no numbers.
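For the sample-size step, the standard normal-approximation formula for a two-proportion test is enough to produce the "rough numbers" the checklist asks for. A sketch, with z-values hardcoded for a two-sided 5% test at 80% power (the baseline and MDE are illustrative):

```python
import math

def n_per_arm(p, mde):
    """Rough per-arm sample size for a two-proportion z-test.

    Standard normal-approximation formula with z-values hardcoded
    for alpha = 0.05 (two-sided) and 80% power.
    p: baseline conversion rate; mde: absolute lift to detect.
    """
    z_alpha, z_beta = 1.96, 0.84
    return math.ceil((z_alpha + z_beta) ** 2 * 2 * p * (1 - p) / mde ** 2)

# Example: 5% baseline conversion, detect a 0.5pp absolute lift
print(n_per_arm(0.05, 0.005))  # roughly 30k users per arm
```

Memorising this one formula lets you quote a defensible per-arm number in seconds; halving the MDE quadruples the sample, which is usually the trade-off the interviewer wants you to name.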

03

Describing past projects in terms of effort and ambition, with no quantified impact attached.

Why it fails

DS interviews calibrate against IC3 scope, and they need numbers to do it. "I built a churn model that helped the retention team" tells the interviewer nothing. "I built a churn model with 0.78 AUC that the growth team used to re-segment a $20M ARR cohort and lift retention by 2.3 points" lets them peg you immediately.

Fix

For your top 4–5 stories, attach three numbers each: scale (rows / users / segments), model quality (AUC, precision, MAE — whatever's relevant), and business impact (revenue, retention, conversion delta). Rough numbers beat no numbers. Ask your eng or PM partner before the loop if you don't remember the exact figures.


Frequently asked questions

Is this guide useful if I'm transitioning from another field (analyst, engineer, PhD)?

Yes — the L4 / IC3 bar described here applies whether you came up through analytics, software engineering, or academic research. The interview tests SQL fluency, A/B test rigour, and product judgment — credentials don't substitute for any of those. The biggest delta for transition candidates is having 5+ STAR stories that map your past work to DS-shaped outcomes (analysis → recommendation → measured impact). A PhD without industry impact stories often calibrates lower than a strong analyst with shipped experiments.

How long should I prep before my Data Scientist onsite?

The process takes 4–6 weeks. Add 4–6 weeks of prep if you're rusty on SQL or Python; 2–3 weeks if you're using both daily. The A/B test framework, SQL drills, and pandas refresher are the highest-leverage prep. Don't over-invest in deep ML theory at L4 — basic concepts are enough.

What's the most common mistake candidates make at the Data Scientist bar?

Prepping only for SQL when the loop has a Python or combined round. Many candidates with strong analysis skills get caught off-guard when the technical screen asks for pandas data manipulation or a simple algorithm in Python. Confirm the round breakdown with your recruiter and prep both.

What if my interview process is different from what's listed?

Most variation is at the edges. Major tech companies (FAANG, scale-ups, mid-size SaaS) follow processes within 1–2 rounds of what's described. Smaller startups often run fewer rounds (3–4) but the bar at each round is similar; less-tech-mature companies sometimes skip system design or behavioural rounds entirely. Read the JD and ask the recruiter at the screen — they'll tell you what's coming.

How does this guide compare to running a free scan?

This guide covers the general bar at L4 / IC3. The free scan reads your specific job description and returns predicted questions for that exact role + company, a calibrated comp benchmark, and (with your CV) experience-gap analysis and an ATS resume check. PDF emailed.

Ready to prep for a real role?

Paste any Data Scientist JD or job URL, get a personalised report.

Drop a LinkedIn, Greenhouse, Lever, or Levels.fyi link — or paste the JD text directly. Predicted questions for that company, your specific experience gaps, and a compensation benchmark calibrated to the role and location. PDF emailed to you.

Run a free scan →