Window Functions for Data Interviews — 8 Patterns to Know
ROW_NUMBER, RANK, LAG, LEAD, and the four less-obvious patterns that separate good candidates from great ones. Real examples included.
If there's one technique that separates candidates who pass senior data interviews from candidates who don't, it's fluency with window functions. Not because window functions are intellectually deep — they aren't. Because they show up in roughly 60% of intermediate-and-above SQL questions, and candidates who hesitate on them telegraph their seniority level in the first 30 seconds.
This is a tour of the 8 window function patterns that handle the vast majority of real interview questions. Each pattern has a one-line "when to reach for it" rule and a clean code example. Master these and you'll recognize the right tool the moment the question is asked.
The Mental Model First
Window functions compute a value for each row, using a "window" of related rows defined by PARTITION BY (which rows are in the window) and ORDER BY (in what order). Crucially, unlike GROUP BY, they don't collapse rows — every input row gets an output row.
The two questions to ask yourself before writing any window function:
- What's the partition? What set of rows is "related" for this calculation?
- What's the order? Within the partition, what defines "first" or "previous"?
Get those two right and the rest is syntax.
Pattern 1 — ROW_NUMBER: Top-N Per Group
When to reach for it: "For each group, find the top N records by some metric."
This is the most-asked window pattern in interviews. Period.
WITH ranked AS (
SELECT
customer_id,
order_id,
order_date,
order_total,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY order_date DESC
) AS rn
FROM orders
)
SELECT customer_id, order_id, order_date, order_total
FROM ranked
WHERE rn <= 3;
The senior signal: knowing to use ROW_NUMBER (unique ranks, ties broken by ORDER BY tiebreakers) vs. RANK (ties share rank, gaps after) vs. DENSE_RANK (ties share rank, no gaps). When ties are possible, always add an explicit tiebreaker — ORDER BY order_date DESC, order_id DESC — so the result is deterministic.
Pattern 2 — RANK / DENSE_RANK: Handling Ties Explicitly
When to reach for it: "Find the top 3 — but if there's a tie at #3, include all of them."
WITH ranked AS (
SELECT
product_id,
product_name,
units_sold,
RANK() OVER (ORDER BY units_sold DESC) AS rnk
FROM products
)
SELECT product_id, product_name, units_sold
FROM ranked
WHERE rnk <= 3;
With RANK, if three products tie for #1, all three get rank 1 and the next product gets rank 4 (gap). With DENSE_RANK, the next product gets rank 2 (no gap). Interviewers love asking you to articulate this distinction.
Pattern 3 — LAG / LEAD: Comparing to Previous or Next Row
When to reach for it: "For each row, compute the difference from the previous row."
SELECT
user_id,
login_date,
LAG(login_date) OVER (
PARTITION BY user_id
ORDER BY login_date
) AS prev_login,
login_date - LAG(login_date) OVER (
PARTITION BY user_id
ORDER BY login_date
) AS days_since_prev
FROM logins;
LAG(col, n) returns the value n rows back (default 1). LEAD is the same but forward. Used for: time-since-last-event, churn analysis ("user X went silent for 60 days"), and step-to-step funnel timing.
Two interview gotchas: the first row in each partition has NULL for LAG (no previous row exists). And date subtraction returns an INTERVAL in Postgres — cast or EXTRACT(day FROM ...) if you need a number.
Pattern 4 — SUM / AVG OVER: Running Totals
When to reach for it: "Compute a cumulative metric over time."
SELECT
order_date,
daily_revenue,
SUM(daily_revenue) OVER (
ORDER BY order_date
) AS cumulative_revenue
FROM daily_revenue_summary;
By default (no explicit frame), SUM() OVER (ORDER BY ...) accumulates from the start of the partition to the current row. That's a running total.
The interview move: knowing this is the default frame, and being able to articulate it as ROWS UNBOUNDED PRECEDING AND CURRENT ROW.
Pattern 5 — Frame Specifications: Rolling Windows
When to reach for it: "Compute a 7-day rolling average."
SELECT
date_day,
daily_users,
AVG(daily_users) OVER (
ORDER BY date_day
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_7d_avg
FROM daily_active_users;
The frame clause (ROWS BETWEEN N PRECEDING AND CURRENT ROW) is what turns a running total into a rolling window. This is one of the highest-bar window patterns — interviewers use it to separate intermediate from advanced candidates.
Two distinctions:
ROWS BETWEEN N PRECEDING AND CURRENT ROW= positional. Counts rows back, regardless of dates.RANGE BETWEEN INTERVAL '6 days' PRECEDING AND CURRENT ROW= value-based. Counts back by date value.
The positional version (ROWS) is usually what you want if your data has one row per day and no gaps. If days can be missing, RANGE is safer.
Pattern 6 — NTILE: Bucketing Into Quantiles
When to reach for it: "Divide users into deciles by lifetime value."
SELECT
user_id,
lifetime_value,
NTILE(10) OVER (ORDER BY lifetime_value) AS ltv_decile
FROM users;
NTILE(N) divides the ordered partition into N approximately-equal buckets and assigns each row a bucket number from 1 to N. Used for: quintile/decile analysis, segmentation, and cohort splits by behavior.
Senior detail: when row count isn't perfectly divisible by N, NTILE adds the extras to the lower-numbered buckets. So bucket 1 may have 11 rows while bucket 10 has 10. This is rarely a problem, but interviewers occasionally probe it.
Pattern 7 — FIRST_VALUE / LAST_VALUE: Anchor Analysis
When to reach for it: "For each user, what was their first purchase amount?"
SELECT
user_id,
order_id,
order_total,
FIRST_VALUE(order_total) OVER (
PARTITION BY user_id
ORDER BY order_date
) AS first_order_total
FROM orders;
Useful for: comparing every event to the first event in a group, normalizing by initial value, computing growth/decay relative to a starting point.
LAST_VALUE is the same but for the end. Gotcha: the default frame for LAST_VALUE is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which means the "last value" is actually the current row's value. To get the true last value, you need ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. This trips up most candidates. Senior signal: catch it unprompted.
Pattern 8 — Cumulative COUNT/DISTINCT: Growth Up To A Point
When to reach for it: "How many total signups had we seen by each date?"
SELECT
signup_date,
daily_signups,
SUM(daily_signups) OVER (
ORDER BY signup_date
ROWS UNBOUNDED PRECEDING
) AS total_signups_to_date
FROM daily_signups_summary;
Same shape as Pattern 4, just emphasizing the "everything before this point" semantic. Used heavily in growth analytics — total customers acquired by date, cumulative revenue, etc.
The senior nuance: this is a cheap way to render a cumulative chart from already-aggregated daily data. If you instead need to count distinct entities cumulatively (e.g., total distinct users to date), window functions can't do COUNT(DISTINCT ...) OVER (...). You need a different pattern — a self-join or a subquery.
The Mistakes Candidates Make
1. Forgetting the partition. Writing ROW_NUMBER() OVER (ORDER BY x DESC) when you meant per-group — the window function will rank globally, not per group. Always ask yourself: "is this per-something, or across everything?"
2. Wrong default frame on LAST_VALUE. Already mentioned. If you write LAST_VALUE(...) without a frame clause, you'll get the current row's value, not the actual last value. Always add ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
3. Treating window functions like aggregations. GROUP BY collapses rows. Window functions don't. If your output has too many rows, you might have used a window function where a GROUP BY was right — or vice versa.
4. Filtering on the window function in the same query. You can't reference ROW_NUMBER() OVER (...) in a WHERE clause in the same SELECT — windows are computed after WHERE. Always wrap the windowed query in a CTE or subquery, then filter on it.
How To Get Fast
The path to fluency is repetition on the patterns. Pick 5 problems for each of the 8 patterns above (40 total). Solve each one, then immediately solve a variation — change the partition, change the order, switch from ROW_NUMBER to RANK, add a frame clause.
By problem 20 you'll start to feel the patterns as muscle memory. By 40, you'll see a question and the right pattern will surface in the first 30 seconds. That's the bar for senior interviews — you don't think about which window function to use, you reach for the right one.
The candidates who land FAANG and AI lab offers in 2026 aren't necessarily smarter about windows. They've just done the reps.
Learn advanced SQL
Cohort analysis, LTV/CAC, retention curves, A/B testing — FAANG-level analytical SQL.
Learn advanced SQL