L4 / IC3 · 3–5 years
Machine Learning Engineer interview prep — what to expect
Machine Learning Engineer interviews sit between Data Scientist and Software Engineer. The coding bar is closer to SWE — algorithmic questions on top of pandas / numpy data manipulation — and the system design round tests how you'd build the infrastructure around a model, not just the model itself.
At most major tech companies the loop is: recruiter screen, coding (often LeetCode-medium plus an ML implementation question), ML system design at moderate scale, an applied-ML depth round on classical algorithms or deep learning fundamentals, and behavioural. AI labs (Anthropic, OpenAI, Scale, Mistral) lean heavier on Python and from-scratch implementations; recommendation-heavy product companies (Pinterest, TikTok, Spotify) lean on system design for ranking / retrieval; FAANG keeps closer to the SWE coding bar.
The L4 bar is owning an end-to-end model project: framing, training, deploying, monitoring.
Personalised version
This guide covers general expectations for ML Engineer interviews. For a free report tailored to your specific job description — with predicted questions, comp benchmark, and experience-gap analysis — paste the JD into the free scan.
Run a free scan on your JD →
What you'll be expected to do
- Train and deploy ML models against a defined product or business problem
- Build training and evaluation pipelines, often in Python + a framework like PyTorch / TensorFlow / JAX
- Own the data, features, and labels for your model — partnering with DE or DS upstream
- Set up monitoring, retraining cadence, and shadow / canary deployments
- Write production-grade Python: tests, code review, CI/CD
- Partner with DS, DE, and product engineering on the cross-functional surface around your model
Typical interview process
Most companies follow a similar shape for ML Engineer interviews. Total calendar time: 4–6 weeks from recruiter screen to offer.
Sample questions you should be ready for
Representative of what companies ask at this level — not a complete list. For predicted questions tied to a specific job posting, run the free scan above.
- “Implement k-means from scratch in Python — no sklearn. Walk through how you'd handle empty clusters and initialisation.” (sketch after this list)
- “Given a 10GB CSV of training data that doesn't fit in memory, implement a PyTorch DataLoader that streams from disk efficiently. Walk through how you'd handle class imbalance during sampling.” (sketch after this list)
- “Walk me through how you'd debug a model whose offline AUC is 0.85 but online performance is closer to random.”
- “Design the recommendation system for our home feed. Cover training data, features, serving latency, and what you'd monitor in production.”
- “Design a fraud-detection system that scores transactions in under 50ms. Walk through model choice, feature pipeline, and retraining cadence.”
- “Design a feature store for a 50-engineer ML team. What's the read / write split, and how do you handle online / offline parity?”
- “Tell me about an ML project you shipped to production. What broke first?”
- “Describe a time your model performed well offline but worse online. How did you diagnose and fix it?”
- “Walk me through a disagreement with a data scientist or product partner on a model choice or feature.”
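For the k-means prompt above, the sketch below is one reasonable shape of an answer: plain numpy, a k-means++-style init, and reseeding of empty clusters from the farthest point. It's a minimal illustration, not a model solution; the reseeding policy is one defensible choice among several.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means: k-means++-style init, empty-cluster reseeding."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # k-means++-style init: sample each new centroid proportionally to
    # squared distance from the nearest centroid chosen so far.
    centroids = X[rng.integers(n)][None, :]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).min(1)
        centroids = np.vstack([centroids, X[rng.choice(n, p=d2 / d2.sum())]])

    for _ in range(n_iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)

        # Update step. Reseed any empty cluster from the point farthest
        # from its assigned centroid, so k is preserved.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                new_centroids[j] = X[d2.min(1).argmax()]
            else:
                new_centroids[j] = members.mean(0)

        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, labels
```

For the streaming DataLoader prompt, torch.utils.data.IterableDataset is the standard tool. The sketch below assumes a numeric CSV with a label column and rebalances classes by rejection sampling against per-class keep probabilities; both the column layout and the sampling strategy are illustrative assumptions, not the only answer.

```python
import pandas as pd
import torch
from torch.utils.data import DataLoader, IterableDataset

class CsvStream(IterableDataset):
    """Streams a large CSV in chunks so the file never sits fully in memory."""

    def __init__(self, path, keep_prob, chunksize=10_000):
        self.path = path
        self.keep_prob = keep_prob  # per-class keep probability, e.g. {0: 0.1, 1: 1.0}
        self.chunksize = chunksize

    def __iter__(self):
        for chunk in pd.read_csv(self.path, chunksize=self.chunksize):
            chunk = chunk.sample(frac=1.0)  # shuffle within the chunk
            for _, row in chunk.iterrows():
                label = int(row["label"])
                # Rejection sampling: keep minority-class rows more often
                # than majority-class rows to rebalance the stream.
                if torch.rand(1).item() < self.keep_prob[label]:
                    x = torch.tensor(row.drop("label").values, dtype=torch.float32)
                    yield x, label

loader = DataLoader(CsvStream("train.csv", {0: 0.1, 1: 1.0}),
                    batch_size=256, num_workers=0)
```

With num_workers > 0 you would also need to shard chunks across workers via torch.utils.data.get_worker_info(); saying that caveat out loud is part of the answer.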
Compensation benchmark
Median compensation for ML Engineer at major US tech companies, headline numbers in USD. London / Berlin / Singapore typically pay 30–50% less in base terms; equity ratios vary by company stage.
FAANG L4 ML Engineer total comp at the 50th percentile is $260–340k. Comp tracks L4 SWE closely, with an occasional equity premium at AI-first companies (Anthropic, OpenAI, Mistral, Scale) — often 30–60% above this band at the staff / principal end.
How to prep — five tactical tips
Lead behavioural answers with the STAR method — Situation, Task, Action, Result. The tactical tips below build on that structure for this specific role.
- Drill 60+ LeetCode mediums plus 20+ pandas / numpy practice problems. The coding round at the MLE bar is closer to SWE than to DS
- Practise 3–4 canonical ML system design problems cold: recommendation, fraud detection, ranking, search. Pattern-match the rest from there
- Be ready to implement at least one ML algorithm from scratch — k-means, gradient descent, simple neural network forward+backward pass. AI labs almost always ask this (a minimal sketch follows these tips)
- Read Chip Huyen's 'Designing Machine Learning Systems' — the canonical reference for the ML system design round
- Have 5–6 STAR stories with production-deployment specifics: model AUC, latency budget, retraining cadence, post-launch issues you debugged
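On the from-scratch tip above: a common target is a tiny two-layer network with a hand-written backward pass. The numpy sketch below is one minimal version, assuming binary labels and full-batch gradient descent; the architecture and hyperparameters are placeholders, not a prescription.

```python
import numpy as np

def train_two_layer(X, y, hidden=16, lr=0.1, epochs=200, seed=0):
    """Tiny MLP (one ReLU hidden layer, sigmoid output) trained with
    full-batch gradient descent on binary cross-entropy, y in {0, 1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, 1)); b2 = np.zeros(1)

    for _ in range(epochs):
        # Forward pass.
        z1 = X @ W1 + b1
        a1 = np.maximum(z1, 0.0)               # ReLU
        logits = a1 @ W2 + b2                  # shape (n, 1)
        p = 1.0 / (1.0 + np.exp(-logits))      # sigmoid

        # Backward pass. For sigmoid + BCE, dL/dlogits = p - y.
        dlogits = (p - y.reshape(-1, 1)) / n
        dW2 = a1.T @ dlogits
        db2 = dlogits.sum(0)
        da1 = dlogits @ W2.T
        dz1 = da1 * (z1 > 0)                   # ReLU gradient
        dW1 = X.T @ dz1
        db1 = dz1.sum(0)

        # Gradient descent update.
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    return W1, b1, W2, b2
```

Being able to explain why dL/dlogits collapses to p − y (the sigmoid and binary cross-entropy derivatives cancel) is exactly the narration these rounds reward.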
Where ML Engineer candidates fail
A few common mistakes that get ML Engineer candidates rejected even when they're otherwise strong. Worth spotting in a mock interview before they show up in a real one.
Designing an ML system around the model and never mentioning the training data pipeline or labels.
Why it fails
MLE system design rounds grade on whether you understand that the model is maybe 10% of the system. The rest is data ingestion, labelling, training pipelines, serving infrastructure, monitoring. Candidates who go straight to "I'd use a gradient boosted tree with these features" without saying where the data comes from or how labels are generated signal "researcher who hasn't shipped to prod."
Fix
Open every ML system design answer by walking through the data first: what's the source, how often does it refresh, how are labels generated (explicit feedback, implicit, human-labelled), what's the training pipeline cadence. Then move to model choice. Spend at least the first 10 minutes on the data and pipeline.
Solving the coding question correctly but not narrating the ML-specific reasoning around it.
Why it fails
Coding rounds for MLE grade on both algorithmic correctness and ML judgment. Implementing k-means correctly without mentioning empty-cluster handling, initialisation sensitivity, or how you'd choose k tells the interviewer you've memorised the algorithm but haven't run it on real data. The signal is the conversation around the code, not just the code.
Fix
When you implement an ML algorithm, narrate the gotchas as you go: initialisation matters because of local minima, here's how you'd handle empty clusters, here's how you'd pick k in practice. Treat the algorithm like something you'd actually deploy, not a textbook recipe.
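On "how you'd pick k in practice": even a minimal elbow check signals you've run the algorithm on real data. The sketch below assumes a kmeans_fn(X, k) shaped like the from-scratch implementation sketched earlier in this guide; silhouette scores are the more principled alternative worth naming.

```python
import numpy as np

def inertia(X, centroids, labels):
    """Total squared distance from each point to its assigned centroid."""
    return float(((X - centroids[labels]) ** 2).sum())

def elbow_curve(X, kmeans_fn, k_max=10):
    """Inertia for k = 1..k_max; plot (or eyeball) the curve for the bend.

    kmeans_fn(X, k) -> (centroids, labels)
    """
    return {k: inertia(X, *kmeans_fn(X, k)) for k in range(1, k_max + 1)}
```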
Discussing past projects without naming what got deployed, what the model's prod metric was, or what broke after launch.
Why it fails
MLE interviewers calibrate against IC3 production experience. Stories that stop at "the model got 0.85 AUC" miss the production reality where models drift, features go stale, training pipelines break. The pattern interviewers describe afterwards is usually "strong on the model itself, no idea if they've actually run one in prod."
Fix
For each ML project story, push it past offline metrics: what shipped, what was the online metric, what went wrong after launch (drift, data quality issues, latency spikes), what you changed because of it. Even one specific post-launch failure earns more credibility than three clean offline-AUC stories.
Recommended resources
Books, courses, and tools that come up most often in ML Engineer prep. No affiliate links.
- Designing Machine Learning Systems (Chip Huyen) → Canonical reference for ML system design. Read end-to-end before the system design round.
- Made With ML → Free practical course on MLOps and production ML. Useful for the deployment / monitoring sections of system design.
- Machine Learning Interviews Book (Chip Huyen) → Question bank and frameworks for ML interview prep. Free online.
- LeetCode (Top 150 Interview Questions) → For the algorithmic coding round. 50–80 mediums is usually enough for the MLE bar.
- Papers With Code — SOTA section → Skim the SOTA leaderboards for the domain you're interviewing in (CV, NLP, recsys). Helps in the depth / paper-discussion round.
Frequently asked questions
Is this guide useful if I'm a Data Scientist transitioning to MLE, or a SWE moving into ML?
Yes — the L4 / IC3 bar described here applies whether you came from DS, SWE, or research. The biggest delta for DS-to-MLE transitions is the coding bar (closer to SWE LeetCode than DS SQL). For SWE-to-MLE, the gap is usually the ML system design round — building intuition for training pipelines, feature stores, and online / offline parity. Prep for the gap that's actually your weak side; don't over-invest in what's already strong.
How long should I prep before my ML Engineer onsite?
The process itself runs 4–6 weeks from recruiter screen to offer. Plan on 6–8 weeks of prep before it starts: LeetCode plus the 3–4 canonical ML system design problems is the highest-leverage work. Don't skip the from-scratch ML implementation practice; AI labs almost always ask this.
What's the most common mistake candidates make at the ML Engineer bar?
Treating it like a DS interview. The MLE coding bar is closer to SWE than DS, and the system design round expects production thinking (latency, monitoring, retraining), not just modelling. DS-style answers focused on offline metrics get downleveled here.
What if my interview process is different from what's listed?
Most variation is at the edges. Major tech companies (FAANG, scale-ups, mid-size SaaS) follow processes within 1–2 rounds of what's described. Smaller startups often run fewer rounds (3–4) but the bar at each round is similar; less-tech-mature companies sometimes skip system design or behavioural rounds entirely. Read the JD and ask the recruiter at the screen — they'll tell you what's coming.
How does this guide compare to running a free scan?
This guide covers the general bar at L4 / IC3. The free scan reads your specific job description and returns predicted questions for that exact role + company, a calibrated comp benchmark, and (with your CV) experience-gap analysis and an ATS resume check. PDF emailed.
Ready to prep for a real role?
Paste any ML Engineer JD or job URL, get a personalised report.
Drop a LinkedIn, Greenhouse, Lever, or Levels.fyi link — or paste the JD text directly. Predicted questions for that company, your specific experience gaps, and a compensation benchmark calibrated to the role and location. PDF emailed to you.
Run a free scan →