If you are targeting the Google Senior Data Analyst role, you need to master one of the most rigorous analytical interview pipelines in tech. The Google Senior Data Analyst Interview Questions test your ability to write complex BigQuery SQL for product usage analysis, design statistically sound A/B experiments for features used by billions of users, build Python pipelines for behavioral data, and define north-star metrics that withstand scrutiny from Google’s senior leadership. This guide covers all four rounds with exact questions, complete SQL and Python answers, and the frameworks Googlers score on.
Python
A/B Testing
Experimentation
Product Metrics
Round 1 — Google Senior Data Analyst Interview Questions: BigQuery SQL (60 mins)
| Questions | 4–5 |
| Duration | 60 minutes |
| Difficulty | Medium–Hard |
| Key Topics | Session AnalysisFunnel QueriesRetentionWindow FunctionsBigQuery Syntax |
The Search team wants to know which query categories (e.g., sports, health, local) have the highest zero-result search rate for the last 7 days. A search session is “zero-result” if no organic results were returned. Also compute the click-through rate (CTR) for sessions that did return results, by category. Return categories where zero-result rate is above the overall average.
Tables:
- search_sessions(session_id, user_id, query_text, query_category, results_count, session_ts)
- search_clicks(click_id, session_id, result_position, clicked_ts)
AVG(...) OVER ().WITH session_stats AS (
SELECT
s.query_category,
COUNT(*) AS total_sessions,
COUNTIF(s.results_count = 0) AS zero_result_sessions,
COUNTIF(s.results_count = 0) / COUNT(*) * 100 AS zero_result_rate,
COUNTIF(s.results_count > 0) AS sessions_with_results,
COUNT(DISTINCT c.session_id) AS sessions_with_clicks,
SAFE_DIVIDE(
COUNT(DISTINCT c.session_id),
COUNTIF(s.results_count > 0)
) * 100 AS ctr_pct
FROM search_sessions s
LEFT JOIN search_clicks c USING (session_id)
WHERE s.session_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY s.query_category
),
with_avg AS (
SELECT *,
AVG(zero_result_rate) OVER () AS overall_avg_zero_rate
FROM session_stats
)
SELECT
query_category,
total_sessions,
ROUND(zero_result_rate, 2) AS zero_result_rate_pct,
ROUND(ctr_pct, 2) AS ctr_pct,
ROUND(overall_avg_zero_rate, 2) AS overall_avg_pct
FROM with_avg
WHERE zero_result_rate > overall_avg_zero_rate
ORDER BY zero_result_rate DESC;
SAFE_DIVIDE() is the BigQuery-preferred way to handle division-by-zero — it returns NULL instead of an error, which propagates cleanly through aggregations. COUNTIF() is BigQuery’s cleaner alternative to SUM(CASE WHEN ... THEN 1 ELSE 0 END). The AVG() OVER () window function computes the overall average across all categories without a subquery — this is more readable and avoids a self-join.- The zero-result rate for “health” queries tripled this week. How do you diagnose — is it a data pipeline issue, a ranking model regression, or a genuine content gap?
- You want to track this metric weekly. How would you partition the BigQuery table to avoid full scans when filtering by date?
The YouTube team defines creator “retention” as: a creator who uploaded at least one video in their first month (cohort month) is “retained” in month N if they uploaded at least one video in month N. For all creator cohorts from the last 12 months, compute the month-1, month-2, and month-3 retention rates.
Table:
- video_uploads(video_id, creator_id, upload_ts, view_count, duration_seconds)
WITH creator_cohort AS (
SELECT
creator_id,
DATE_TRUNC(MIN(upload_ts), MONTH) AS cohort_month
FROM video_uploads
GROUP BY creator_id
HAVING cohort_month >= DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 12 MONTH), MONTH)
),
monthly_uploads AS (
SELECT
v.creator_id,
DATE_TRUNC(v.upload_ts, MONTH) AS upload_month
FROM video_uploads v
GROUP BY v.creator_id, DATE_TRUNC(v.upload_ts, MONTH)
),
cohort_activity AS (
SELECT
c.cohort_month,
DATE_DIFF(u.upload_month, c.cohort_month, MONTH) AS months_since_first,
COUNT(DISTINCT u.creator_id) AS retained_creators
FROM creator_cohort c
JOIN monthly_uploads u USING (creator_id)
WHERE DATE_DIFF(u.upload_month, c.cohort_month, MONTH) BETWEEN 1 AND 3
GROUP BY c.cohort_month, DATE_DIFF(u.upload_month, c.cohort_month, MONTH)
),
cohort_size AS (
SELECT cohort_month, COUNT(*) AS cohort_total
FROM creator_cohort
GROUP BY cohort_month
)
SELECT
cs.cohort_month,
cs.cohort_total,
MAX(IF(ca.months_since_first=1, ca.retained_creators, NULL)) AS retained_m1,
MAX(IF(ca.months_since_first=2, ca.retained_creators, NULL)) AS retained_m2,
MAX(IF(ca.months_since_first=3, ca.retained_creators, NULL)) AS retained_m3,
ROUND(MAX(IF(ca.months_since_first=1, ca.retained_creators, NULL))
/ cs.cohort_total * 100, 1) AS retention_m1_pct,
ROUND(MAX(IF(ca.months_since_first=2, ca.retained_creators, NULL))
/ cs.cohort_total * 100, 1) AS retention_m2_pct,
ROUND(MAX(IF(ca.months_since_first=3, ca.retained_creators, NULL))
/ cs.cohort_total * 100, 1) AS retention_m3_pct
FROM cohort_size cs
LEFT JOIN cohort_activity ca USING (cohort_month)
GROUP BY cs.cohort_month, cs.cohort_total
ORDER BY cs.cohort_month;
IF(months_since_first=1, retained, NULL) inside MAX() is BigQuery’s pivot pattern — cleaner than multiple subqueries. DATE_DIFF(..., MONTH) handles month boundaries correctly regardless of days per month. The LEFT JOIN from cohort_size to cohort_activity preserves cohort months where nobody was retained (recent cohorts without 3 months of follow-up data) — essential for correctly showing NULL instead of implying 0% for incomplete cohorts.- Month-3 retention for the March 2026 cohort shows NULL. How do you communicate this to the PM who built the creator retention dashboard?
- How would you extend this to measure “video quality” retention — creators who uploaded AND got at least 1000 views in month N?
The Maps team tracks a 3-step funnel: Search → Place View → Navigation Start. For each city and day this week, compute: step conversion rates, the step with the largest absolute drop in conversions, and whether the overall funnel rate improved or declined vs last week.
Table:
- maps_events(event_id, user_id, city, event_type, event_ts)
event_type:'search','place_view','nav_start'
WITH daily_funnel AS (
SELECT
city,
DATE(event_ts) AS event_date,
COUNTIF(event_type = 'search') AS searches,
COUNTIF(event_type = 'place_view') AS place_views,
COUNTIF(event_type = 'nav_start') AS nav_starts,
SAFE_DIVIDE(COUNTIF(event_type='place_view'),
COUNTIF(event_type='search')) * 100 AS s_to_pv_rate,
SAFE_DIVIDE(COUNTIF(event_type='nav_start'),
COUNTIF(event_type='place_view')) * 100 AS pv_to_nav_rate,
SAFE_DIVIDE(COUNTIF(event_type='nav_start'),
COUNTIF(event_type='search')) * 100 AS overall_rate
FROM maps_events
WHERE event_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 14 DAY)
GROUP BY city, DATE(event_ts)
),
with_wow AS (
SELECT
*,
LAG(overall_rate, 7) OVER (PARTITION BY city ORDER BY event_date) AS prev_week_rate,
CASE
WHEN (searches - place_views) >= (place_views - nav_starts) THEN 'Search→PlaceView'
ELSE 'PlaceView→NavStart'
END AS biggest_drop_step
FROM daily_funnel
)
SELECT
city, event_date,
searches, place_views, nav_starts,
ROUND(s_to_pv_rate, 1) AS search_to_placeview_pct,
ROUND(pv_to_nav_rate, 1) AS placeview_to_nav_pct,
ROUND(overall_rate, 1) AS overall_funnel_rate_pct,
ROUND(prev_week_rate, 1) AS prev_week_overall_pct,
CASE WHEN overall_rate > prev_week_rate THEN '▲ IMPROVED'
WHEN overall_rate < prev_week_rate THEN '▼ DECLINED'
ELSE '— STABLE' END AS wow_trend,
biggest_drop_step
FROM with_wow
WHERE event_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
ORDER BY city, event_date;
LAG(overall_rate, 7) with PARTITION BY city ORDER BY event_date gives you the same city’s same weekday from one week ago — more correct than averaging last week because daily traffic has strong day-of-week patterns (Fridays have more nav_starts). The biggest_drop_step logic compares absolute drop in user counts (not rates) because absolute volume matters more for capacity planning.- The Place View → Nav Start rate is 40% for Bangalore but 65% for Mumbai. What hypotheses would you test to explain the gap?
- A new UX change was shipped on Wednesday. How do you control for this when comparing this week to last week?
Round 2 — Google Senior Data Analyst Interview Questions: Experimentation & Statistics (60 mins)
| Questions | 3–4 |
| Duration | 60 minutes |
| Difficulty | Hard |
| Key Topics | A/B TestingNovelty EffectSample Ratio MismatchSequential TestingHoldback Tests |
You ran a 2-week A/B test on a new Search result ranking algorithm. The treatment group shows a statistically significant 2% CTR lift (p=0.01). Your PM is excited and wants to ship immediately. Walk through your validation checklist before declaring the experiment a success.
Pre-ship validation checklist:
1. Sample Ratio Mismatch (SRM) check
Verify actual traffic split matches planned split (50/50). Use a chi-squared test on session counts per group. A mismatch suggests assignment bias — if treatment got 52% of traffic, the experiment is compromised regardless of results.
2. Novelty effect analysis
Plot CTR by day within the experiment. If CTR starts at +5% on Day 1 and trends down to +1% by Day 14, the lift is a novelty effect — users are clicking on new result layouts out of curiosity, not because they find them more relevant. Google requires at least 2 weeks AND a stable trend line before declaring a winner.
3. Guardrail metrics — check ALL of these:
- Long-click rate (clicks followed by ≥30 sec on the result page — proxy for result quality)
- Pogo-stick rate (clicking a result then immediately returning to Search — signals poor result quality)
- Ads CTR — did organic CTR increase at the expense of ads CTR? (zero-sum shift, not real user value)
- Next search rate — did users search again immediately after clicking? (high rate = result didn’t answer their query)
4. Segment analysis
Break the 2% lift by: (a) device type — mobile vs desktop; (b) query category; (c) user locale. If the lift is entirely driven by one segment (e.g., mobile Hindi queries) and is negative for others, a global ship is premature.
5. Duration check
2 weeks may be too short for weekly seasonality to average out. Check if the treatment and control groups had similar baseline metrics in the pre-experiment period (AA test validation).
- The PM pushes back and says “we have 10,000 experiments running — we can’t be this rigorous for all of them.” How do you tier experiments by risk level?
- SRM check fails — treatment got 48% of traffic instead of 50%. Can you recover the experiment? What statistical adjustment would you apply?
The Google Maps PM asks you to define a north-star metric for Maps that captures the full value Maps delivers to users. Walk through your framework and propose one primary metric, two supporting metrics, and two guardrail metrics.
- How do you measure “destination reached” if Maps doesn’t always know when a user arrived? What signal approximates this from the event log?
- The Maps team in India has a much lower search-to-navigation rate than the US team. Does this mean Indian Maps is worse? What else could explain it?
📚 Recommended Resources to Crack Google SDA Interview
Used by 40,000+ analysts to crack FAANG and product company interviews
- 2,200+ questions · 23 topics
- SQL, Python, A/B Testing, Product Metrics
- FAANG-level scenario-based questions
- 220+ questions · SQL advanced, data models, DBMS
- BigQuery-compatible window functions and CTEs
- Covers Google-style product analytics query patterns
4-Week Prep Plan: Crack Google Senior Data Analyst Interview
BigQuery SQL + Google Product Context
Master BigQuery-specific syntax: COUNTIF, SAFE_DIVIDE, DATE_TRUNC, DATE_DIFF, IF() pivot patterns. Study Google’s product suite deeply — Search, Maps, YouTube, Ads — and their core metrics. Read TDM posts on experimentation frameworks for FAANG.
2200 Most Asked Analytics Interview Questions — FAANG Sections
Cover SQL, A/B testing, product metrics, and case study chapters. Focus on experimentation design, north-star metric definition, and guardrail metrics questions. 100+ questions/day. End of Week 2: First mock interview →
Ace Any SQL Interview + Experimentation Deep Dive
Complete all 220+ questions in Ace Any SQL Interview. Practice 2 experiment design problems per day — include SRM check, novelty effect, guardrail metric selection. End of Week 3: Second mock interview →
Timed Practice + Final Revision
Revise first 5 chapters of 2200 ebook. Timed SQL: 15 min per BigQuery query. Timed experiment design: 25 min per case. End of Week 4: Final mock interview →