L4 / IC3 · 3–5 years
Data Scientist interview prep — what to expect
Data Scientist interviews at most tech companies follow a fairly predictable arc: a recruiter screen, one or two technical coding rounds, an A/B test or product-sense case, a stats / modelling round, and a behavioural with the hiring manager.
What changes most between companies is the technical screens. Product-DS-heavy loops at Airbnb, Spotify, or Meta product lean on SQL plus a bit of Python. Modelling-heavy or AI-company loops (Anthropic, Scale, OpenAI applied) lean on Python and pandas, often with an ML implementation question. Google DS sometimes runs an algorithmic coding round at the SWE bar. Worth confirming the round breakdown with your recruiter before you assume it's all SQL.
The bar is technical fluency plus product judgment: can you turn an ambiguous business question into something measurable, and can you tell the story of a result to someone non-technical. L4 / IC3 candidates are graded on owning a single analysis end-to-end with one or two stakeholders.
Personalised version
This guide covers general expectations for Data Scientist interviews. For a free report tailored to your specific job description — with predicted questions, comp benchmark, and experience-gap analysis — paste the JD into the free scan.
Run a free scan on your JD →
What you'll be expected to do
- Own an end-to-end analysis: define the question, pull the data, model it, communicate the result
- Design and ship A/B tests for product or growth teams; size them, monitor them, write up the read-out
- Build dashboards and recurring metrics in SQL + your team's BI tool
- Partner with PM and engineering on metric definitions and instrumentation
- Apply basic modelling (regression, classification, clustering) for prediction or segmentation
- Translate analyses into recommendations product or business stakeholders can act on
Typical interview process
Most companies follow a similar shape for Data Scientist interviews. Total calendar time: 4–6 weeks from recruiter screen to offer.
Sample questions you should be ready for
Representative of what companies ask at this level — not a complete list. For predicted questions tied to a specific job posting, run the free scan above.
- “Given a `sessions` table with user_id and event_timestamp, write a query that returns the 7-day retention curve from signup.”
- “Walk me through how you'd detect outliers in a metric you're tracking. What rules of thumb do you use, and how do you decide whether to remove them?”
- “You ran a t-test and got p = 0.04. What does that actually mean, and what would you check before declaring the test a win?”
- “Our DAU dropped 8% week-over-week. Walk me through how you'd diagnose it.”
- “Design an A/B test to evaluate a new onboarding flow. Pick the metric, name the unit of randomisation, and tell me how long you'd run it.”
- “We're considering launching a referral programme. What metrics would you instrument before launch, and how would you know if it's working in week one?”
- “Tell me about an analysis you ran where the result surprised you. What did you do with it?”
- “Describe a time a stakeholder disagreed with your conclusion. How did you handle it?”
- “Walk me through a project where the data was messy or incomplete. How did you decide what to trust?”
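For the retention question above, one possible shape of an answer is sketched below. It is only a sketch under stated assumptions: the prompt gives no signup table, so signup is approximated as each user's first event; the SQL uses SQLite date functions (run through Python's built-in `sqlite3`), and other warehouses spell date arithmetic differently; the sample rows and the helper names `retention_curve` and `build_demo_db` are made up for illustration.

```python
import sqlite3

# Assumption: "signup" = the date of a user's first row in `sessions`.
RETENTION_SQL = """
WITH firsts AS (
    SELECT user_id, MIN(DATE(event_timestamp)) AS signup_date
    FROM sessions
    GROUP BY user_id
),
activity AS (
    -- one row per (user, day offset), however many events happened that day
    SELECT DISTINCT
        s.user_id,
        CAST(JULIANDAY(DATE(s.event_timestamp)) - JULIANDAY(f.signup_date)
             AS INTEGER) AS day_n
    FROM sessions AS s
    JOIN firsts AS f ON f.user_id = s.user_id
)
SELECT day_n,
       COUNT(*) * 1.0 / (SELECT COUNT(*) FROM firsts) AS retention
FROM activity
WHERE day_n BETWEEN 0 AND 7
GROUP BY day_n
ORDER BY day_n;
"""


def retention_curve(conn):
    """Return [(day_n, share_of_users_active), ...] for days 0 to 7.

    Days on which nobody was active produce no row at all.
    """
    return conn.execute(RETENTION_SQL).fetchall()


def build_demo_db():
    """Hypothetical sample data, only here to make the sketch runnable."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sessions (user_id INTEGER, event_timestamp TEXT)")
    rows = [
        (1, "2024-01-01 09:00:00"), (1, "2024-01-02 10:00:00"), (1, "2024-01-08 08:30:00"),
        (2, "2024-01-01 12:00:00"), (2, "2024-01-02 07:00:00"), (2, "2024-01-02 19:00:00"),
        (3, "2024-01-03 15:00:00"),
        (4, "2024-01-05 11:00:00"), (4, "2024-01-12 11:00:00"), (4, "2024-01-14 11:00:00"),
    ]
    conn.executemany("INSERT INTO sessions VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    for day_n, retention in retention_curve(build_demo_db()):
        print(day_n, retention)
```

In an interview it is worth saying out loud which definition of retention you picked (active exactly on day N, as here, versus active on or after day N) and asking whether a dedicated signup table exists.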
Compensation benchmark
Median compensation for Data Scientist at major US tech companies, headline numbers in USD. London / Berlin / Singapore typically pay 30–50% less in base terms; equity ratios vary by company stage.
FAANG L4 Data Scientist total comp at 50th percentile is $240–310k. Meta E4 DS and Google L4 DS land at the top of this band; Stripe / Airbnb / Spotify a step below. London DS base ~£80–105k. AI-first companies (Anthropic, OpenAI, Scale) often pay 20–40% above this with heavier equity weighting.
How to prep: six tactical tips
Lead behavioural answers with the STAR method — Situation, Task, Action, Result. The tactical tips below build on that structure for this specific role.
- Drill SQL cold — joins, window functions, CTEs, date arithmetic. 30–50 problems is enough if you've used SQL recently; more if you haven't. SQL screens are the most common rejection point at this level
- Brush up on pandas / numpy data manipulation in Python. Many loops have a Python or combined SQL+Python round, and candidates who only prep SQL get caught out. Practise group-by, merge, pivot, basic plotting
- Practise the A/B test framework cold: metric → unit of randomisation → sample size → duration → guardrails → readout. Have a template you can apply to any prompt
- Have 5–6 STAR stories ready, each with quantified impact: lift in conversion, model AUC, dollar value of the decision driven
- Read 'Trustworthy Online Controlled Experiments' (Kohavi) — the canonical reference for the A/B test round
- Be ready for one open-ended product question per loop ("DAU is dropping, diagnose it") — practise the segment-first, hypothesis-second structure
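The group-by, merge, and pivot drills mentioned in the tips can be practised on something as small as the sketch below. The `users` and `orders` frames and their columns are invented for illustration; the pandas calls themselves (`groupby`/`agg`, `merge`, `pivot_table`) are standard.

```python
import pandas as pd

# Hypothetical toy data, not taken from any real interview question.
users = pd.DataFrame({
    "user_id": [1, 2, 3, 4],
    "country": ["US", "US", "DE", "DE"],
})
orders = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3, 3],
    "month": ["2024-01", "2024-02", "2024-01", "2024-01", "2024-02", "2024-02"],
    "amount": [10.0, 20.0, 5.0, 7.0, 8.0, 9.0],
})

# group-by: total spend and number of orders per user
per_user = orders.groupby("user_id", as_index=False).agg(
    total_spend=("amount", "sum"),
    n_orders=("amount", "count"),
)

# merge: a left join keeps users with no orders (user 4);
# validate= raises if the join keys are not unique as expected
user_summary = users.merge(per_user, on="user_id", how="left", validate="one_to_one")
user_summary[["total_spend", "n_orders"]] = (
    user_summary[["total_spend", "n_orders"]].fillna(0)
)

# pivot: revenue with countries as rows and months as columns
revenue = (
    orders.merge(users, on="user_id", how="left", validate="many_to_one")
    .pivot_table(index="country", columns="month", values="amount",
                 aggfunc="sum", fill_value=0)
)

if __name__ == "__main__":
    print(user_summary)
    print(revenue)
```

The left join is a deliberate choice here: an inner join would silently drop users with no orders, which is a common source of wrong denominators.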
Where Data Scientist candidates fail
A few common mistakes that get Data Scientist candidates rejected even when they're otherwise strong. Worth spotting in a mock interview before they show up in a real one.
Hearing a question and immediately reaching for a model ("I'd train a random forest") without first looking at what the data actually looks like.
Why it fails
At L4 interviewers want to see you check the data before picking a method: distributions, missing values, join keys, time ranges. Jumping straight to a model signals "I know the tools but not the practice." The candidates who pass spend the first few minutes on data quality questions, not algorithm choice.
Fix
Open every modelling answer by naming 3 things you'd check in the data first. Something like "I'd start by checking the join logic between user and session tables, looking at the distribution of the target variable, and confirming there's no time-period bias." Then pick a method.
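As a rough illustration of those three checks, the sketch below collects a few data-quality facts before any model is chosen. The function name `pre_model_checks`, the column names (`churned`, `event_timestamp`, `plan`), and the demo frames are assumptions made for this example, not part of any particular company's schema.

```python
import pandas as pd


def pre_model_checks(users, sessions, target="churned", ts_col="event_timestamp"):
    """Return quick facts about join keys, the target, missingness, and time range."""
    user_ids = set(users["user_id"])
    session_ids = set(sessions["user_id"])
    return {
        # join logic: duplicated keys fan out rows, unmatched keys drop them
        "duplicate_user_ids": int(users["user_id"].duplicated().sum()),
        "users_without_sessions": len(user_ids - session_ids),
        "orphan_session_rows": int((~sessions["user_id"].isin(user_ids)).sum()),
        # target distribution: heavy class imbalance changes metric and method
        "target_rate": float(users[target].mean()),
        # share of missing values per column
        "missing_share": users.isna().mean().to_dict(),
        # time coverage: a short or lopsided window hints at time-period bias
        "time_range": (sessions[ts_col].min(), sessions[ts_col].max()),
    }


# Hypothetical demo data with one duplicate user, one orphan session, one gap.
demo_users = pd.DataFrame({
    "user_id": [1, 2, 3, 3],
    "churned": [0, 1, 0, 0],
    "plan": ["free", None, "pro", "pro"],
})
demo_sessions = pd.DataFrame({
    "user_id": [1, 1, 2, 9],
    "event_timestamp": pd.to_datetime(
        ["2024-01-01", "2024-01-03", "2024-01-02", "2024-01-05"]
    ),
})
report = pre_model_checks(demo_users, demo_sessions)

if __name__ == "__main__":
    for name, value in report.items():
        print(name, value)
```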
Designing an A/B test without naming the metric, sample size, or how you'd handle guardrails.
Why it fails
A/B test rounds at L4 grade on whether you've actually shipped experiments at scale. Vague answers ("we'd run an A/B test and check the results") miss the rigour that distinguishes a real DS from a product analyst. The interviewer is waiting for primary metric + 1–2 guardrails + sample-size estimate + duration.
Fix
Structure A/B test answers as a checklist: primary metric and why, unit of randomisation (user vs session vs request), 1–2 guardrails to prevent gaming, rough sample-size estimate for the MDE you care about, and how long you'd let it run. Even rough numbers ("~50k users per arm, two weeks") land much harder than no numbers.
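To back a rough sample-size number with arithmetic, one common approximation for a conversion-rate metric is the two-sided two-proportion z-test formula sketched below. This is a back-of-envelope sketch, not a replacement for a proper power analysis: the defaults (5% significance, 80% power), the function names, and the traffic figure in the usage lines are assumptions chosen for illustration, and online calculators may use slightly different formulas and return somewhat different numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, mde_abs, alpha=0.05, power=0.8):
    """Approximate users per arm to detect an absolute lift of `mde_abs`.

    baseline: control conversion rate, e.g. 0.10
    mde_abs:  minimum detectable effect in absolute terms, e.g. 0.01
    """
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


def weeks_needed(n_per_arm, eligible_users_per_day, arms=2):
    """Duration rounded up to whole weeks, so every weekday is covered equally."""
    days = ceil(n_per_arm * arms / eligible_users_per_day)
    return ceil(days / 7)


if __name__ == "__main__":
    n = sample_size_per_arm(baseline=0.10, mde_abs=0.01)
    print("users per arm:", n)
    # Hypothetical traffic: 10,000 eligible users entering the test per day.
    print("weeks:", weeks_needed(50_000, eligible_users_per_day=10_000))
```

Halving the detectable effect roughly quadruples the required sample, which is usually the most useful thing to be able to say in the room.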
Describing past projects in terms of effort and ambition, with no quantified impact attached.
Why it fails
DS interviews calibrate against IC3 scope, and they need numbers to do it. "I built a churn model that helped the retention team" tells the interviewer nothing. "I built a churn model with 0.78 AUC that the growth team used to re-segment a $20M ARR cohort and lift retention by 2.3 points" lets them peg you immediately.
Fix
For your top 4–5 stories, attach three numbers each: scale (rows / users / segments), model quality (AUC, precision, MAE — whatever's relevant), and business impact (revenue, retention, conversion delta). Rough numbers beat no numbers. Ask your eng or PM partner before the loop if you don't remember the exact figures.
Recommended resources
Books, courses, and tools that come up most often in Data Scientist prep. No affiliate links.
- 01 Trustworthy Online Controlled Experiments → The canonical reference for A/B test design. Read chapters 1–7 before the A/B test round.
- 02 DataLemur → SQL practice problems pulled from real DS interview loops at FAANG and major tech.
- 03 StrataScratch → More SQL and Python practice with real interview questions categorised by company.
- 04 An Introduction to Statistical Learning → Free Stanford textbook. Chapters 3, 4, 6 cover the stats / modelling round at L4 depth.
- 05 Evan Miller's A/B test calculator → Sample-size estimation calculator. Worth knowing the math behind it before the experiment round.
Frequently asked questions
Is this guide useful if I'm transitioning from another field (analyst, engineer, PhD)?
Yes — the L4 / IC3 bar described here applies whether you came up through analytics, software engineering, or academic research. The interview tests SQL fluency, A/B test rigor, and product judgment — credentials don't substitute for any of those. The biggest delta for transition candidates is having 5+ STAR stories that map your past work to DS-shaped outcomes (analysis → recommendation → measured impact). A PhD without industry impact stories often calibrates lower than a strong analyst with shipped experiments.
How long should I prep before my Data Scientist onsite?
The interview process itself takes 4–6 weeks from recruiter screen to offer. Budget 4–6 weeks of prep before it starts if you're rusty on SQL or Python, or 2–3 weeks if you're using both daily. The A/B test framework, SQL drills, and pandas refresher are the highest-leverage prep. Don't over-invest in deep ML theory at L4; basic concepts are enough.
What's the most common mistake candidates make at the Data Scientist bar?
Prepping only for SQL when the loop has a Python or combined round. Many candidates with strong analysis skills get caught off-guard when the technical screen asks for pandas data manipulation or a simple algorithm in Python. Confirm the round breakdown with your recruiter and prep both.
What if my interview process is different from what's listed?
Most variation is at the edges. Major tech companies (FAANG, scale-ups, mid-size SaaS) follow processes within 1–2 rounds of what's described. Smaller startups often run fewer rounds (3–4) but the bar at each round is similar; less-tech-mature companies sometimes skip system design or behavioural rounds entirely. Read the JD and ask the recruiter at the screen — they'll tell you what's coming.
How does this guide compare to running a free scan?
This guide covers the general bar at L4 / IC3. The free scan reads your specific job description and returns predicted questions for that exact role + company, a calibrated comp benchmark, and (with your CV) experience-gap analysis and an ATS resume check. PDF emailed.
Ready to prep for a real role?
Paste any Data Scientist JD or job URL, get a personalised report.
Drop a LinkedIn, Greenhouse, Lever, or Levels.fyi link — or paste the JD text directly. Predicted questions for that company, your specific experience gaps, and a compensation benchmark calibrated to the role and location. PDF emailed to you.
Run a free scan →