The Meta Data Analyst SQL Interview — Real Questions + Frameworks
Inside the Meta data interview loop: the four categories of SQL they actually ask, the frameworks senior analysts use to structure answers, and the patterns that get you to the next round.
Meta's data analyst loop has a reputation: the SQL is hard, the business framing is harder, and most candidates underestimate both. The good news is that it's also patterned. Once you see the four categories of question Meta actually asks, the loop stops feeling random and starts feeling like something you can prepare for.
This is the playbook a senior data scientist friend uses to coach candidates through the loop — the four buckets, the frameworks for each, and the patterns that get you to the next round.
The Meta Data Loop, In Plain Terms
Meta runs a multi-round process for data analysts and data scientists. The SQL screen and the on-site analytical interview share the same DNA. Both test three things in this order:
- Can you write correct SQL on a realistic schema?
- Can you turn an ambiguous business question into a query plan?
- Can you read the result and tell a product story?
Most candidates over-prepare for #1 and under-prepare for #2 and #3. That's the trap. Meta interviewers are looking for an analyst, not a SQL puzzle-solver, and the bar for "correct SQL" is honestly just competence. What separates offers from rejects is how you frame the problem and how you talk through your work.
The Four Question Categories
Every SQL question Meta asks falls into one of four buckets. Knowing the bucket up front lets you reach for the right pattern instead of starting from scratch.
1. Aggregation with grouping
The simplest bucket. Given a table, compute a metric grouped by some dimension. The trick is rarely the SQL — it's noticing the edge cases.
Example: "Given a posts table with user_id, created_at, and post_type, find the number of posts each user created in the last 30 days, broken down by post type."
The answer is one line of GROUP BY. The interesting part is what they ask next:
- What about users with zero posts? Do they appear in the result?
- What if
created_atis in UTC but the user is in a different timezone? - What if
post_typehas NULLs?
Always pause after writing the query and call out one edge case unprompted. Interviewers score that.
2. Window functions on user behavior
This is where the bar starts to separate juniors from mid-levels. Meta loves questions that ask "for each user, find the X" — and the answer is almost always a window function.
Example: "For each user, find the time gap between their first and second post."
WITH ranked AS (
SELECT
user_id,
created_at,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at) AS post_rank
FROM posts
)
SELECT
user_id,
MAX(CASE WHEN post_rank = 2 THEN created_at END)
- MAX(CASE WHEN post_rank = 1 THEN created_at END) AS gap
FROM ranked
WHERE post_rank <= 2
GROUP BY user_id
HAVING COUNT(*) = 2;
The patterns to memorize for this bucket: ROW_NUMBER() for "the Nth event per user", LAG()/LEAD() for "time since previous event", and SUM() OVER (PARTITION BY ... ORDER BY ...) for running totals. About 60% of Meta's intermediate SQL questions reduce to one of these three.
3. Funnel and conversion
This is Meta's signature bucket. They are a product-led growth company and their interview reflects it.
Example: "Given an events table tracking viewed_signup, started_signup, completed_signup, compute the conversion rate at each step."
The answer is usually a multi-step CTE: pivot the events per user, then count how many users hit each milestone.
WITH user_funnel AS (
SELECT
user_id,
MAX(CASE WHEN event_name = 'viewed_signup' THEN 1 ELSE 0 END) AS viewed,
MAX(CASE WHEN event_name = 'started_signup' THEN 1 ELSE 0 END) AS started,
MAX(CASE WHEN event_name = 'completed_signup' THEN 1 ELSE 0 END) AS completed
FROM events
GROUP BY user_id
)
SELECT
SUM(viewed) AS viewed_count,
SUM(started) AS started_count,
SUM(completed) AS completed_count,
ROUND(SUM(started)::numeric / NULLIF(SUM(viewed), 0) * 100, 1) AS view_to_start_rate,
ROUND(SUM(completed)::numeric / NULLIF(SUM(started), 0) * 100, 1) AS start_to_complete_rate
FROM user_funnel;
Two things to call out unprompted in this bucket: the NULLIF guard (division by zero), and whether the funnel should be sequential (users must hit viewed before started) or just presence (users hit the event at any point). Sequential funnels are slightly harder and usually involve MIN(CASE WHEN ... THEN created_at END) comparisons.
4. Cohort and retention
The fourth bucket is the one most candidates panic about. It's also the most predictable once you've seen the shape.
Example: "Compute month-3 retention for each monthly cohort over the past year."
The shape is always the same:
- Define the cohort (when did this user first show up?)
- Define the retention event (when did they show up again?)
- Bucket the gap (period N after cohort start)
- Aggregate
WITH cohorts AS (
SELECT
user_id,
DATE_TRUNC('month', MIN(created_at)) AS cohort_month
FROM users
GROUP BY user_id
),
activity AS (
SELECT
c.user_id,
c.cohort_month,
DATE_TRUNC('month', e.created_at) AS active_month
FROM cohorts c
JOIN events e USING (user_id)
)
SELECT
cohort_month,
COUNT(DISTINCT CASE
WHEN active_month = cohort_month + INTERVAL '3 months'
THEN user_id
END)::numeric / NULLIF(COUNT(DISTINCT user_id), 0) AS month_3_retention
FROM activity
GROUP BY cohort_month
ORDER BY cohort_month;
If you can write this cleanly without prompting, you've already cleared the 80th percentile. Most candidates either butcher the date math or forget the DISTINCT user_id.
The Framework: What To Say Out Loud
The query is half the score. The other half is how you narrate it. Meta interviewers explicitly grade on "communication" — what they mean is whether they had to drag your reasoning out of you.
The framework that works:
-
Restate the question in business terms. Don't dive into SQL. Say what you understand the question is asking. This is your chance to catch ambiguity ("Is this restricted to active users, or all users?") and signal that you think before you type.
-
Sketch the answer shape. Before writing any SQL, describe what the output should look like. "I expect to return one row per user with columns X, Y, Z, sorted by Z descending." This is a senior signal.
-
Write a clean draft. Don't over-engineer. Don't reach for a window function when a
GROUP BYdoes it. Simpler queries score higher because they're easier to verify. -
Verify with edge cases. Out loud. "If a user has no posts, they wouldn't appear here. Do we want them in the result with a zero?" This is the move that separates mid-level from senior.
-
Interpret the result. "If I saw 30% conversion at step 2 but 60% at step 3, I'd want to dig into segment differences — is this driven by mobile vs. desktop, or by acquisition channel?" Don't stop at the number.
The Three Mistakes That Cost Offers
After watching dozens of mock loops, three failure modes show up over and over.
Silent debugging. When a query doesn't run, candidates often go quiet and start staring at it. Interviewers read this as panic. Narrate instead: "The error says ambiguous column reference — I'm aliasing the second table." This turns a stumble into a signal of composure.
Skipping the framing step. Diving straight into SQL feels efficient but reads as a lack of senior judgment. The 60 seconds you spend restating the question is the most valuable 60 seconds in the loop.
Over-engineering. A WITH block doesn't make a query better — it makes it longer to verify. Use CTEs when they genuinely improve readability or when the same subquery shows up twice. Otherwise, stay inline.
How To Prepare
The honest answer: practice on schemas that look like Meta's, not on toy databases. Meta's data is messy, multi-table, and full of business context. LeetCode-style SQL practice doesn't simulate that, which is why people grind 300 LeetCode SQL problems and still walk out of the Meta loop confused.
The patterns above are the four buckets. Master one bucket per week, force yourself to narrate every query out loud, and time-box yourself to 25 minutes — the same window the interviewer will give you. That's the prep that translates.
Try today's daily SQL task
A fresh, AI-graded SQL challenge ships every morning. Five minutes a day, real momentum.
Try today's daily SQL task