SQL Skills

Marketing Attribution & Campaign Queries — Real Interview Examples

Multi-touch attribution, channel ROI, email engagement, and campaign lift. The five marketing analytics patterns that show up in every senior data interview at a paid-acquisition company.

12 min read·SQL Skills

Marketing analytics SQL is the second most-asked domain in growth and product interviews — right behind cohort retention. And it's the domain where candidates lose the most points to subtle wrongness. Conversion rates that look fine but use the wrong denominator. Attribution queries that double-count. Channel ROI calculations that ignore organic baselines.

The good news: most marketing interview questions reduce to five core patterns. Master those and you'll handle anything from "what's our cost per lead" to "how should we re-allocate paid budget."

This guide walks through all five with real SQL, the senior moves, and the mistakes that cost offers.

The Mental Model

Every marketing analytics question is asking one of three things:

How much did this cost? (Spend, CAC, CPL)
What did it produce? (Leads, signups, revenue)
What's the ratio? (ROI, LTV/CAC, conversion rate)

The trick is that "what did it produce" almost always involves attribution — assigning credit to the marketing touchpoint that "caused" the conversion. Attribution is where the SQL gets interesting.

Pattern 1 — Channel ROI: Spend vs. Revenue

The question: "For each acquisition channel, compute cost per acquisition (CPA) and revenue per acquired user over the last 90 days."

WITH acquired AS (
  SELECT
    channel,
    user_id,
    created_at AS acquired_at
  FROM users
  WHERE created_at >= CURRENT_DATE - INTERVAL '90 days'
),
spend AS (
  SELECT
    channel,
    SUM(spend) AS total_spend
  FROM marketing_spend
  WHERE spend_date >= CURRENT_DATE - INTERVAL '90 days'
  GROUP BY channel
),
revenue AS (
  SELECT
    a.channel,
    SUM(o.amount) AS attributed_revenue
  FROM acquired a
  LEFT JOIN orders o
    ON o.user_id = a.user_id
   AND o.status = 'completed'
   AND o.order_date >= a.acquired_at
  GROUP BY a.channel
)
SELECT
  a.channel,
  COUNT(DISTINCT a.user_id) AS acquired_users,
  COALESCE(s.total_spend, 0) AS total_spend,
  COALESCE(r.attributed_revenue, 0) AS attributed_revenue,
  ROUND(COALESCE(s.total_spend, 0)::numeric / NULLIF(COUNT(DISTINCT a.user_id), 0), 2) AS cpa,
  ROUND(COALESCE(r.attributed_revenue, 0)::numeric / NULLIF(COUNT(DISTINCT a.user_id), 0), 2) AS revenue_per_user,
  ROUND(COALESCE(r.attributed_revenue, 0)::numeric / NULLIF(COALESCE(s.total_spend, 0), 0), 2) AS roas
FROM acquired a
LEFT JOIN spend s ON s.channel = a.channel
LEFT JOIN revenue r ON r.channel = a.channel
GROUP BY a.channel, s.total_spend, r.attributed_revenue
ORDER BY roas DESC NULLS LAST;

The senior signal: flagging that 90 days isn't long enough to see real LTV. "This is an early-indicator ROAS. A channel that looks bad here might still have great long-term LTV; a channel that looks good might churn fast. I'd recommend looking at 12-month LTV per acquired user before making budget reallocations."

Pattern 2 — Multi-Touch Attribution

The question: "A user saw a Google ad, then an email, then converted. Which channel gets credit?"

There's no single right answer. The interview is testing whether you know the attribution models exist and how each one assigns credit:

First-touch: All credit to first interaction
Last-touch: All credit to last interaction before conversion
Linear: Credit split equally across all touches
Time-decay: More credit to touches closer to conversion
U-shaped: 40% first, 40% last, 20% middle touches

The SQL for first-touch attribution:

WITH first_touch AS (
  SELECT
    user_id,
    channel,
    touch_time,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY touch_time) AS rn
  FROM marketing_touches
)
SELECT
  ft.channel AS first_touch_channel,
  COUNT(DISTINCT c.user_id) AS conversions,
  SUM(c.revenue) AS attributed_revenue
FROM first_touch ft
JOIN conversions c ON c.user_id = ft.user_id
WHERE ft.rn = 1
  AND ft.touch_time <= c.converted_at
GROUP BY ft.channel
ORDER BY attributed_revenue DESC;

For last-touch, swap ORDER BY touch_time to ORDER BY touch_time DESC.

The senior signal: acknowledging that no attribution model is "correct" — they're frames for thinking about influence. "First-touch over-credits awareness channels like display ads. Last-touch over-credits closer channels like branded search and retargeting. I'd report both and look at the delta — channels that diverge dramatically between the two are doing something interesting."

Pattern 3 — Email Campaign Engagement

The question: "For our last marketing email, compute open rate, click-through rate, and unsubscribe rate. Compare to the previous email."

WITH email_metrics AS (
  SELECT
    campaign_id,
    sent_at,
    recipients,
    opens,
    clicks,
    unsubscribes,
    bounces,
    ROUND(opens::numeric        / NULLIF(recipients - bounces, 0) * 100, 1) AS open_rate,
    ROUND(clicks::numeric       / NULLIF(opens, 0)                * 100, 1) AS ctr,
    ROUND(unsubscribes::numeric / NULLIF(recipients - bounces, 0) * 100, 2) AS unsubscribe_rate
  FROM email_sends
  ORDER BY sent_at DESC
  LIMIT 2
)
SELECT
  campaign_id,
  sent_at,
  recipients,
  open_rate,
  ctr,
  unsubscribe_rate
FROM email_metrics
ORDER BY sent_at DESC;

The senior signal:

Open rate denominator is recipients - bounces, not recipients. Counting bounced emails as "didn't open" inflates the denominator.
CTR has two definitions: clicks/opens (engaged users who clicked) vs. clicks/recipients (overall click rate). Be explicit about which one you're computing.
Unsubscribe rate has a hard floor — anything above 0.5% on a normal campaign is a real problem, not noise.

Pattern 4 — Campaign Lift (Incrementality)

The question: "Our paid social campaign drove 8,000 signups last month. Were they incremental, or would those users have signed up anyway?"

This is the question senior marketing analysts dread because the answer is hard. You can't easily measure counterfactual users. Two approaches:

Approach A — Holdout test (clean but slow):

-- Compare exposed vs. holdout group conversion rates
WITH exposed AS (
  SELECT user_id, COUNT(*) > 0 AS exposed
  FROM ad_exposures
  WHERE campaign_id = 'paid_social_q4'
  GROUP BY user_id
),
holdout AS (
  SELECT user_id
  FROM ad_holdout_assignments
  WHERE campaign_id = 'paid_social_q4'
),
combined AS (
  SELECT
    COALESCE(e.user_id, h.user_id) AS user_id,
    CASE WHEN e.exposed THEN 'exposed' ELSE 'holdout' END AS group
  FROM exposed e
  FULL OUTER JOIN holdout h ON h.user_id = e.user_id
)
SELECT
  c.group,
  COUNT(DISTINCT c.user_id) AS users,
  COUNT(DISTINCT CASE WHEN u.signed_up THEN c.user_id END) AS signups,
  ROUND(
    COUNT(DISTINCT CASE WHEN u.signed_up THEN c.user_id END)::numeric
    / NULLIF(COUNT(DISTINCT c.user_id), 0)
    * 100,
    2
  ) AS signup_rate
FROM combined c
JOIN user_outcomes u ON u.user_id = c.user_id
GROUP BY c.group;

The lift = exposed signup rate − holdout signup rate. That difference is the incremental impact.

Approach B — Pre/post comparison (cheap but contaminated):

Compare signup volume before and during the campaign. Doesn't isolate the campaign from other factors (seasonality, competitor moves, organic growth) — but it's what you have if no holdout exists.

The senior signal: acknowledging that pre/post is what marketers commonly do and what doesn't actually work. "A 30% lift in pre/post is consistent with anywhere from 0% to 100% real campaign impact. The only way to be confident is a holdout test, which you have to design before the campaign launches."

Pattern 5 — Cohort-Based Channel Quality

The question: "Which acquisition channel produces users with the highest 6-month retention?"

This is the channel quality question — not "which channel drives the most users" but "which channel drives the best users."

WITH cohorts AS (
  SELECT
    user_id,
    channel,
    DATE_TRUNC('month', created_at) AS signup_month
  FROM users
  WHERE created_at BETWEEN CURRENT_DATE - INTERVAL '12 months'
                       AND CURRENT_DATE - INTERVAL '6 months'
),
activity AS (
  SELECT
    c.user_id,
    c.channel,
    c.signup_month,
    BOOL_OR(e.event_time BETWEEN c.signup_month + INTERVAL '6 months'
                             AND c.signup_month + INTERVAL '7 months') AS active_m6
  FROM cohorts c
  LEFT JOIN events e ON e.user_id = c.user_id
  GROUP BY c.user_id, c.channel, c.signup_month
)
SELECT
  channel,
  COUNT(DISTINCT user_id) AS cohort_size,
  COUNT(DISTINCT CASE WHEN active_m6 THEN user_id END)::numeric
    / NULLIF(COUNT(DISTINCT user_id), 0) AS m6_retention
FROM activity
GROUP BY channel
ORDER BY m6_retention DESC;

The senior signal: combining this with Pattern 1 (CPA). A channel with high CPA but high retention may have better unit economics than a channel with low CPA but bad retention. "Paid social has $40 CPA and 18% M6 retention. Organic search has $5 CPA and 22% M6 retention. Even though paid social has lower CPA-per-acquisition, organic produces stickier users and lower CAC payback. I'd shift budget toward content marketing to grow organic."

That's the connect-the-dots move that earns senior credit.

The Five Mistakes

1. Mixing up open rate denominators. Recipients vs. recipients-minus-bounces. Always be explicit.

2. Computing CTR as clicks/recipients instead of clicks/opens. Two valid metrics, but they tell different stories. Specify which.

3. Ignoring organic baseline in campaign lift. A 30% volume increase isn't a 30% lift if organic was already up 20%.

4. First-touch vs. last-touch picked arbitrarily. Always state which attribution model and acknowledge the bias.

5. Treating channel volume as channel quality. Higher signups isn't better channel — it's just more spend. Always pair with quality metrics (retention, LTV).

What To Practice

Marketing analytics builds on cohort retention plus three new shapes (attribution, lift, channel ROI). The fastest path: take a real-feeling growth schema with users, marketing_spend, marketing_touches, email_sends, and orders. Write each of the 5 patterns above three times — once with first-touch attribution, once with last-touch, once with multi-segment.

Fifteen queries on a realistic schema builds more marketing analytics interview-readiness than 50 generic SQL drills. The schema fluency is the differentiator.

Practice next

Explore SQL challenges

100+ challenges across Growth, SaaS, Marketing, Product, and Finance — graded by AI, ranked by difficulty.

Explore SQL challenges

Back to all resources