
How to prepare for a data science interview: SQL, stats, and case questions

10 min read · Four-Leaf Team
interviews · data science · preparation · technical

You have a data science interview coming up and you're staring at a prep list that covers SQL, statistics, probability, machine learning, experimentation, product sense, and behavioral questions. All for one job. That range is what makes DS interviews uniquely brutal. A software engineer knows they'll get coding and system design. You might get any of eight different question types in a single day.

I've been on both sides of the DS interview table. Here's what I've learned about what actually matters, what's a waste of time, and how to spend your prep hours wisely.

What the interview loop looks like

Most DS loops follow this rough shape:

  1. Recruiter screen (30 min). Background, role fit, salary expectations.
  2. Technical screen (45-60 min). SQL, Python/R, basic statistics.
  3. Take-home assignment (2-8 hours). Data analysis, modeling, or experimentation design.
  4. Onsite or virtual onsite (3-5 hours). Multiple rounds hitting SQL, ML, stats, case studies, and behavioral.
  5. Hiring manager round (30-45 min). Team fit, communication, career trajectory.

Some companies swap the take-home for a live coding session. Startups often compress things into two longer rounds. FAANG leans harder on coding and ML system design. Analytics-focused roles at places like Airbnb and Spotify put more weight on SQL and experimentation.

The takeaway: you need to be solid across all areas. A candidate who crushes SQL but bombs the behavioral round will not get an offer. I've seen it happen many times.

SQL is non-negotiable

Every DS loop tests SQL. Every single one. Even if you spend your whole day in Python, interviewers want proof you can pull and transform data directly. The reasoning is simple. In most companies, data scientists write SQL every day.

The topics that actually come up

Window functions are the most important SQL topic for interviews, full stop. You need to be comfortable with:

  • ROW_NUMBER(), RANK(), and DENSE_RANK(). Know when each one is the right choice.
  • LAG() and LEAD() for comparing rows to adjacent rows.
  • Running totals and moving averages with SUM() OVER (ORDER BY ...).
  • PARTITION BY for grouped calculations.

If you can write a window function correctly on a whiteboard without hesitation, you're ahead of 70% of candidates.
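As a sketch of what those functions look like in practice, here's a minimal example run through Python's sqlite3 module (SQLite supports window functions as of version 3.25). The orders table and its rows are invented for illustration:

```python
import sqlite3

# Hypothetical "orders" table, in-memory for the demo.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE orders (user_id INT, order_date TEXT, amount REAL);
INSERT INTO orders VALUES
  (1, '2024-01-01', 50), (1, '2024-01-05', 30),
  (2, '2024-01-02', 20), (2, '2024-01-10', 80);
""")

# Number each user's orders, look back at the previous order's amount,
# and keep a per-user running total -- all in one pass.
rows = con.execute("""
SELECT user_id,
       order_date,
       ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY order_date) AS order_num,
       LAG(amount)  OVER (PARTITION BY user_id ORDER BY order_date) AS prev_amount,
       SUM(amount)  OVER (PARTITION BY user_id ORDER BY order_date) AS running_total
FROM orders
""").fetchall()

for r in rows:
    print(r)
```

Note how PARTITION BY resets the numbering and the running total for each user. That reset behavior is exactly what interviewers probe for.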

CTEs (Common Table Expressions) are the other big one. Interviewers love them because they reveal whether you can organize your thinking. Practice writing multi-step CTEs where each step builds on the last. If your query logic is clear enough that someone could read it top to bottom and understand what's happening, you're doing it right.
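Here's a sketch of that top-to-bottom style, again via sqlite3 with a made-up events table. Each CTE is one readable step, and the final SELECT just joins the pieces:

```python
import sqlite3

# Hypothetical "events" table of signups and purchases.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE events (user_id INT, event TEXT, ts TEXT);
INSERT INTO events VALUES
  (1, 'signup', '2024-01-01'), (1, 'purchase', '2024-01-03'),
  (2, 'signup', '2024-01-02'),
  (3, 'signup', '2024-01-02'), (3, 'purchase', '2024-01-09');
""")

# Step 1: each user's signup date.  Step 2: each user's first purchase.
# Step 3: join them to get days from signup to first purchase.
rows = con.execute("""
WITH signups AS (
    SELECT user_id, MIN(ts) AS signup_date
    FROM events WHERE event = 'signup'
    GROUP BY user_id
),
first_purchases AS (
    SELECT user_id, MIN(ts) AS first_purchase
    FROM events WHERE event = 'purchase'
    GROUP BY user_id
)
SELECT s.user_id,
       julianday(p.first_purchase) - julianday(s.signup_date) AS days_to_purchase
FROM signups s
LEFT JOIN first_purchases p USING (user_id)
ORDER BY s.user_id
""").fetchall()
print(rows)  # user 2 never purchased, so their gap is NULL
```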

Self-joins show up in questions like "find users who made a purchase within 7 days of their first purchase" or "find employees who earn more than their manager." These trip people up because the concept of joining a table to itself feels unintuitive until you've done it a few times.
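The employees-vs-manager question makes a good first rep. A minimal sketch with a hypothetical table, where manager_id points back into the same table:

```python
import sqlite3

# Hypothetical "employees" table; manager_id references employees.id.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employees (id INT, name TEXT, salary INT, manager_id INT);
INSERT INTO employees VALUES
  (1, 'Ava',  120000, NULL),
  (2, 'Ben',  130000, 1),
  (3, 'Cara',  90000, 1),
  (4, 'Dan',   95000, 2);
""")

# Same table, two aliases: e is the employee, m is their manager.
rows = con.execute("""
SELECT e.name
FROM employees e
JOIN employees m ON e.manager_id = m.id
WHERE e.salary > m.salary
""").fetchall()
print(rows)  # Ben out-earns his manager Ava
```

Once you see the two aliases as two separate copies of the table, the intuition problem disappears.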

Conditional aggregation means putting CASE WHEN inside aggregate functions like SUM() or AVG() to count or average only the rows that match a condition. This is bread-and-butter DS SQL and it comes up constantly.
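A quick sketch of the pattern, computing per-platform counts and a conversion rate in a single scan over a hypothetical sessions table:

```python
import sqlite3

# Hypothetical "sessions" table; converted is 0/1.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sessions (user_id INT, platform TEXT, converted INT);
INSERT INTO sessions VALUES
  (1, 'ios', 1), (2, 'ios', 0), (3, 'android', 1),
  (4, 'android', 0), (5, 'android', 0), (6, 'web', 1);
""")

# CASE WHEN inside SUM/AVG: count one slice and compute a rate in one pass.
row = con.execute("""
SELECT
  COUNT(*)                                            AS total_sessions,
  SUM(CASE WHEN platform = 'ios' THEN 1 ELSE 0 END)   AS ios_sessions,
  AVG(CASE WHEN converted = 1 THEN 1.0 ELSE 0.0 END)  AS conversion_rate
FROM sessions
""").fetchone()
print(row)  # (6, 2, 0.5)
```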

How to practice

Work through 50 to 100 problems on LeetCode, StrataScratch, or DataLemur. Focus on medium and hard. Time yourself. Most interview SQL questions should be solvable in 15 to 20 minutes. Write your solutions in a plain text editor without autocomplete. That's what the interview will feel like.

The thing most people skip: explaining your query logic out loud while you write it. Interviewers want to hear your thought process, not just see a correct result. Practice narrating your approach as you go.

Statistics: where strong candidates separate themselves

Statistics questions test whether you understand the methods you use or just know how to call sklearn functions. Interviewers are probing your intuition, not your ability to recite formulas.

A/B testing dominates this category

This is the single most common statistics topic. If you're only going to study one thing deeply, make it experimentation. You need to be able to:

  • Design an experiment from scratch. Define the hypothesis, pick the randomization unit, choose the metric, calculate sample size, and set test duration. Walk through each decision and explain why.
  • Calculate sample size. Know the relationship between effect size, alpha, power, and sample size. Most failed experiments at tech companies fail because someone underestimated sample size. That's the kind of practical insight interviewers want to hear.
  • Explain p-values correctly. A p-value is the probability of seeing data at least as extreme as yours, assuming the null is true. It is not the probability that the null is true. I've rejected candidates for getting this wrong. It's that important.
  • Handle the pitfalls. Multiple comparisons (Bonferroni), peeking (sequential testing), network effects violating independence, novelty effects. Know these cold.
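The sample-size relationship is worth being able to reproduce from scratch. Here's one common back-of-the-envelope formula for a two-proportion test, using only the standard library; it's an approximation, not a substitute for a proper power analysis, and the example numbers (35% baseline, 3-point lift) are illustrative:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(p_base, mde, alpha=0.05, power=0.80):
    """Approximate sample size per arm for detecting an absolute lift
    of `mde` over baseline rate `p_base` with a two-sided z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # ~0.84 for 80% power
    p_treat = p_base + mde
    variance = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)

# 35% baseline, 3 percentage point minimum detectable effect
print(sample_size_per_arm(0.35, 0.03))  # roughly 4,000 per arm
```

Notice the quadratic dependence on effect size: halving the MDE roughly quadruples the required sample. That's the intuition behind "someone underestimated sample size."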

Probability and distributions

Expect questions like:

  • "You roll two dice. What's the probability the sum is 7?"
  • "What distribution would you use to model customer service calls per hour?"
  • "Explain the Central Limit Theorem and why it matters."

Know the normal, binomial, Poisson, and exponential distributions. For each one, know when it applies, what its parameters mean, and what shape it takes. Bayes' theorem and conditional probability come up often enough that skipping them is a gamble.
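Brute-force enumeration is a reliable way to sanity-check small probability questions, and Bayes' theorem is easy to verify numerically. A quick sketch; the medical-test numbers in the Bayes part are invented for illustration:

```python
from fractions import Fraction
from itertools import product

# Two dice, sum of 7: enumerate all 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))
p_seven = Fraction(sum(1 for a, b in outcomes if a + b == 7), len(outcomes))
print(p_seven)  # 1/6

# Bayes' theorem: P(disease | positive test) with a 1% prior,
# 95% sensitivity, and a 5% false positive rate (hypothetical numbers).
prior = Fraction(1, 100)
sensitivity = Fraction(95, 100)
false_pos = Fraction(5, 100)
posterior = (sensitivity * prior) / (sensitivity * prior + false_pos * (1 - prior))
print(posterior)  # 19/118, about 16%
```

The Bayes result is the classic interview surprise: a "95% accurate" test plus a rare condition still leaves the posterior around 16%, because false positives swamp true positives.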

Hypothesis testing beyond A/B

Be ready to discuss one-sided vs. two-sided tests, Type I vs. Type II errors and their tradeoffs, when to use a t-test vs. chi-squared vs. Mann-Whitney, and what statistical significance actually means. That last one trips up more candidates than you'd expect.
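One way to build intuition for Type I errors is to simulate them: when the null hypothesis is true, a test at alpha = 0.05 should reject about 5% of the time, by construction. A small stdlib-only sketch using a two-sample z-test with known variance for simplicity:

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(42)
norm = NormalDist()
n, sims, alpha = 100, 2000, 0.05

false_positives = 0
for _ in range(sims):
    a = [random.gauss(0, 1) for _ in range(n)]   # both groups drawn from
    b = [random.gauss(0, 1) for _ in range(n)]   # the SAME distribution
    z = (sum(a) / n - sum(b) / n) / sqrt(2 / n)  # two-sample z, sigma = 1
    p = 2 * (1 - norm.cdf(abs(z)))               # two-sided p-value
    false_positives += p < alpha

print(false_positives / sims)  # hovers near 0.05
```

Rerunning this with a real difference between the groups turns it into a power simulation, and the miss rate becomes your Type II error.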

Machine learning: think decisions, not algorithms

ML questions in DS interviews almost never ask you to implement gradient descent. They test whether you understand the tools well enough to make good decisions in practice.

Bias-variance tradeoff. Explain it with a concrete example, not a textbook definition. "My model was memorizing the training data and failing on new customers" is more convincing than reciting the mathematical decomposition.

Algorithm selection. Given a problem, which model do you reach for first, and why? The reasoning matters more than the choice. "I'd start with logistic regression because the target is binary, we have 50K rows, and the stakeholders need interpretability. If performance isn't good enough, I'd try gradient boosted trees next." That's a strong answer because it shows you're thinking about the problem, not just your favorite algorithm.

Regularization. L1 drives coefficients to zero (feature selection). L2 shrinks them (stability). Elastic Net combines both. Know when you'd pick each. This comes up constantly.
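A toy numerical sketch of the difference: apply only the penalty's update to a small coefficient and watch what happens. The learning rate and penalty strength are arbitrary; the point is the qualitative behavior:

```python
def l2_step(w, lam, lr=0.1):
    # gradient step on lam * w^2: multiplicative shrinkage toward zero
    return w - lr * (2 * lam * w)

def l1_prox(w, lam, lr=0.1):
    # soft-thresholding, the proximal update for lam * |w|:
    # small coefficients snap to exactly zero
    t = lr * lam
    if abs(w) <= t:
        return 0.0
    return w - t if w > 0 else w + t

w_l2 = 0.05
for _ in range(10):
    w_l2 = l2_step(w_l2, lam=1.0)
print(round(w_l2, 4))  # ~0.0054: shrunk, but still nonzero

w_l1 = 0.05
for _ in range(10):
    w_l1 = l1_prox(w_l1, lam=1.0)
print(w_l1)  # exactly 0.0 after the first step
```

That snap-to-zero behavior is why L1 does feature selection and L2 doesn't.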

Model evaluation. Accuracy is almost never the right metric. Know precision, recall, F1, AUC-ROC, and log loss. For imbalanced classes, explain why accuracy is misleading. If a fraud detection model predicts "no fraud" 100% of the time and 99.5% of transactions are legitimate, it has 99.5% accuracy and is completely useless.
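The fraud example is easy to verify directly. A sketch with hypothetical counts (1,000 transactions, 5 fraudulent):

```python
# Model predicts "no fraud" for every transaction.
y_true = [1] * 5 + [0] * 995   # 1 = fraud
y_pred = [0] * 1000

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = tp / (tp + fn)

print(accuracy)  # 0.995
print(recall)    # 0.0 -- every fraud case missed
```

Being able to walk through this arithmetic live is exactly what the "why is accuracy misleading" question is testing.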

Feature engineering. Be ready to talk about handling missing values, encoding categoricals, creating interaction features, and time-based features like recency and frequency. This is where real-world DS experience shows.

Deep learning

If the role involves NLP, computer vision, or recommendations, expect deeper questions on architectures. For general DS roles, you just need to know when deep learning makes sense vs. traditional ML, and the basics of how neural networks learn.

Case questions: show your analytical instincts

Case questions in DS interviews aren't consulting cases. They test how you'd use data to solve a real business problem.

Metric investigation

"Our DAU dropped 10% last week. How would you investigate?"

This question has a right structure. Start by confirming the data is real and not a logging artifact. Segment the metric by platform, geography, user cohort, and acquisition channel. Check for external factors: holidays, competitor moves, app store changes. Form hypotheses that are falsifiable with accessible data. Quantify which segment explains the biggest chunk of the drop.
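The segmentation step can be as simple as a week-over-week comparison per segment, then attributing shares of the total drop. A sketch with invented DAU numbers:

```python
# Hypothetical DAU by platform, last week vs. this week.
last_week = {"ios": 500_000, "android": 420_000, "web": 80_000}
this_week = {"ios": 495_000, "android": 340_000, "web": 78_000}

total_drop = sum(last_week.values()) - sum(this_week.values())

for seg in last_week:
    change = (this_week[seg] - last_week[seg]) / last_week[seg]
    share = (last_week[seg] - this_week[seg]) / total_drop
    print(f"{seg}: {change:+.1%} WoW, {share:.0%} of the total drop")
```

Here Android's -19% week-over-week accounts for roughly 90% of the decline, which immediately narrows the hypothesis space to something Android-specific (a bad release, a Play Store change). That's the quantification interviewers want to hear.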

The mistake most candidates make is jumping straight to hypotheses without confirming the data or segmenting first. Interviewers notice.

Experiment design

"How would you test whether a new onboarding flow improves 30-day retention?"

Cover the randomization unit, metric definition, potential issues like selection bias or carryover effects, and how long you'd run the test. Be specific about numbers. "I'd run it for two weeks" is weak. "Based on our baseline retention of 35% and a minimum detectable effect of 3 percentage points, I'd need roughly 4,000 users per arm, which at our current signup rate means about 10 days" is strong.

Metric definition

"How would you measure the success of Spotify's Discover Weekly?"

These test product sense. Define primary and secondary metrics, explain tradeoffs between them, and flag potential gaming or unintended consequences. The best answers show you've thought about what could go wrong, not just what to measure.

Practice case questions out loud. The communication matters as much as the analysis. If your thinking is brilliant but you can't articulate it under pressure, the interviewer won't know.

The behavioral round is not a formality

Many data scientists blow off behavioral prep. That's a mistake. Hiring managers use this round to evaluate whether you can communicate findings to non-technical stakeholders, work through ambiguity, and navigate team dynamics.

Prepare specific stories for:

  • "Tell me about explaining a complex analysis to a non-technical audience."
  • "Describe a project where your initial approach didn't work."
  • "How do you prioritize when multiple stakeholders want different analyses?"
  • "Tell me about a time your analysis led to a decision you disagreed with."

For each, use the STAR method. DS behavioral answers are strongest when they include concrete details. Mention the metric. Name the model. Quantify the business impact. "I built a churn prediction model that identified at-risk accounts 30 days earlier, which let the retention team save $2M in annual revenue" beats "I built a machine learning model that helped the business."

The communication piece is the real differentiator between mid-level and senior DS candidates. If you catch yourself defaulting to jargon, rewrite the story in plain language. That's the version you should practice.

A four-week study plan

Week 1: SQL and statistics. 3-5 SQL problems daily. Review probability, distributions, and hypothesis testing. By the end of the week, you should be able to design an A/B test end to end on a whiteboard.

Week 2: ML and case questions. Algorithm selection, model evaluation, bias-variance. Practice 2-3 case questions. Work through a take-home dataset to sharpen your end-to-end workflow.

Week 3: Behavioral prep and mock interviews. Write STAR stories for 8-10 questions. Do at least two full mock interviews covering all question types. Four-Leaf's mock interview tool is good for building fluency with the verbal delivery piece, which is the part most people underestimate.

Week 4: Weak spots and simulations. Go back to whatever you struggled with in weeks 1-3. Run 2-3 full-length practice days to build stamina. A five-hour interview loop is physically draining, and you want that to be familiar, not a surprise.

The candidates who get offers are the ones who prepare across all dimensions, not just the ones they enjoy. Nobody loves grinding SQL problems. Do it anyway.

Ready to ace your next interview?

Practice with AI-powered mock interviews, tailor your resume, and negotiate your salary, all in one platform.

Start Your Free Trial