Home β€Ί Case Study β€Ί Uber Data Scientist Interview Questions and Answers 2026
Case Study Company Interview Questions Data Analyst Interview Questions

Uber Data Scientist Interview Questions and Answers 2026

If you are preparing for an Uber Data Scientist interview, you are entering one of the most rigorous DS pipelines in the tech industry. The Uber Data Scientist Interview Questions test your ability to design surge pricing experiments, detect demand-supply imbalances using statistics, build ML models for ETA prediction, and run A/B tests on marketplace features β€” all in the context of real ride-sharing operations at scale. This guide covers all four rounds with exact questions, complete SQL and Python answers, and the frameworks that Uber interviewers score you on.

Uber
Data Scientist
Marketplace Analytics / Rider & Driver ML
SQL
Python
Machine Learning
Statistics
A/B Testing
Salary Range
β‚Ή22–38 LPA
2–5 Years Exp

Round 1 β€” Uber Data Scientist Interview Questions: SQL & Data Analysis (45 mins)

SQL Analytical Round (Marketplace Focus)
Uber’s SQL round tests your ability to extract insights from ride-sharing data β€” trips, drivers, riders, surge events. You are expected to write window functions, detect supply-demand gaps, and compute marketplace metrics like fulfillment rate and driver utilization. The interviewer looks for reasoning about scale β€” how would your query behave on 50 million daily trips?
Questions 4
Duration 45 minutes
Difficulty Medium–Hard
Key Topics Fulfillment RateDriver UtilizationSurge AnalysisWindow FunctionsCohort Retention
Q1 β€” Rider Cohort Retention: Week-Over-Week Drop-off
Hard⏱ 20 min

The growth team wants to understand how well Uber retains first-time riders. For riders who completed their first ride in Week 1 of a given month, compute their retention rate in each of the next 4 weeks. A rider is retained in a week if they complete at least one ride that week.

Table:

  • trips(trip_id, rider_id, driver_id, city, request_ts, end_ts, status, fare_usd)
🎯 What the interviewer tests
Can you build a cohort table in SQL? The key is identifying the “Week 1” cohort using MIN(request_ts) per rider, then LEFT JOINing subsequent week activity. Many candidates forget to LEFT JOIN (using INNER JOIN instead), which drops riders who churned and makes retention look artificially high. The interviewer will explicitly ask you to show weeks where riders are absent too.

WITH cohort AS (
    SELECT
        rider_id,
        DATE_TRUNC('week', MIN(request_ts))  AS cohort_week
    FROM trips
    WHERE status = 'completed'
    GROUP BY rider_id
    HAVING DATE_TRUNC('week', MIN(request_ts))
           = DATE_TRUNC('week', '2026-01-01'::date)   -- Week 1 of target month
),
weeks AS (
    SELECT generate_series(1, 4) AS week_num
),
activity AS (
    SELECT
        t.rider_id,
        (DATE_TRUNC('week', t.request_ts) - c.cohort_week) / 7  AS weeks_since_first
    FROM trips t
    JOIN cohort c USING (rider_id)
    WHERE t.status = 'completed'
      AND t.request_ts >= c.cohort_week + INTERVAL '7 days'
    GROUP BY t.rider_id, c.cohort_week, DATE_TRUNC('week', t.request_ts)
)
SELECT
    w.week_num,
    COUNT(DISTINCT a.rider_id)                                AS retained_riders,
    COUNT(DISTINCT c.rider_id)                                AS cohort_size,
    ROUND(COUNT(DISTINCT a.rider_id)::numeric
          / COUNT(DISTINCT c.rider_id) * 100, 1)             AS retention_pct
FROM weeks w
CROSS JOIN cohort c
LEFT JOIN activity a
    ON a.rider_id = c.rider_id
   AND a.weeks_since_first = w.week_num
GROUP BY w.week_num
ORDER BY w.week_num;
πŸ’‘ Key insight
The CROSS JOIN cohort before LEFT JOIN activity is what ensures weeks with zero retained riders still appear in the output. generate_series(1,4) creates synthetic week rows so the query handles “no riders in Week 3” correctly. If you used an INNER JOIN, any week with 0% retention would vanish from results β€” that is a critical Uber interviewer flag.
πŸ” Common Follow-Up Questions
  • How would you break this down by city to identify where Uber has the biggest Week 2 drop-off?
  • A rider requested a ride but no driver accepted. Does that count as “active” for retention purposes? What data would you check?
Q2 β€” Driver Utilization: Earnings Per Online Hour by City
Medium⏱ 15 min

Compute driver earnings per online hour for each city for last week. A driver is “online” when their app status is available or on_trip. Only count completed trips for earnings. Flag cities where earnings/hour dropped more than 10% week-over-week.

Tables:

  • driver_status_log(driver_id, city, status, event_ts)
  • trips(trip_id, driver_id, city, end_ts, status, driver_earnings_usd)
🎯 What the interviewer tests
Can you compute session duration from event logs? Most candidates struggle with LEAD(event_ts) to calculate time spent in each status. The interviewer specifically checks whether you filter only available + on_trip statuses (not offline) for online time. The WoW comparison using CTEs is a bonus that shows analytical maturity.

WITH status_sessions AS (
    SELECT
        driver_id,
        city,
        status,
        event_ts                                                      AS start_ts,
        LEAD(event_ts) OVER (PARTITION BY driver_id ORDER BY event_ts) AS end_ts
    FROM driver_status_log
),
online_hours AS (
    SELECT
        city,
        DATE_TRUNC('week', start_ts)                                   AS week_start,
        SUM(EXTRACT(EPOCH FROM (end_ts - start_ts)) / 3600)            AS total_online_hrs
    FROM status_sessions
    WHERE status IN ('available', 'on_trip')
      AND end_ts IS NOT NULL
    GROUP BY city, DATE_TRUNC('week', start_ts)
),
earnings AS (
    SELECT
        city,
        DATE_TRUNC('week', end_ts)  AS week_start,
        SUM(driver_earnings_usd)    AS total_earnings
    FROM trips
    WHERE status = 'completed'
    GROUP BY city, DATE_TRUNC('week', end_ts)
),
weekly AS (
    SELECT
        e.city,
        e.week_start,
        e.total_earnings / NULLIF(o.total_online_hrs, 0) AS earnings_per_hr
    FROM earnings e
    JOIN online_hours o USING (city, week_start)
)
SELECT
    curr.city,
    curr.earnings_per_hr                                        AS curr_week_eph,
    prev.earnings_per_hr                                        AS prev_week_eph,
    ROUND((curr.earnings_per_hr - prev.earnings_per_hr)
          / NULLIF(prev.earnings_per_hr, 0) * 100, 1)          AS pct_change,
    CASE WHEN curr.earnings_per_hr < prev.earnings_per_hr * 0.9
         THEN 'FLAG' ELSE 'OK' END                             AS status
FROM weekly curr
JOIN weekly prev
    ON curr.city = prev.city
   AND curr.week_start = prev.week_start + INTERVAL '7 days'
WHERE curr.week_start = DATE_TRUNC('week', CURRENT_DATE - INTERVAL '7 days')
ORDER BY pct_change;
πŸ’‘ Key insight
LEAD(event_ts) converts the event log into sessions. NULLIF(total_online_hrs, 0) prevents division-by-zero if a driver had no online time recorded. The 10% threshold flag catches cities where the utilization model is degrading β€” at Uber this often signals a supply shortage or price elasticity shift that the PM team needs to investigate.
πŸ” Common Follow-Up Questions
  • A driver went online at 11:58 PM and offline at 12:05 AM β€” how do you handle sessions that cross midnight in your week boundary logic?
  • How would you identify drivers in the bottom 10th percentile of earnings/hour who might churn? What table would you add?
Q3 β€” Surge Pricing: Identify Under-Served Zones
Hard⏱ 20 min

The pricing team wants to identify zones where demand consistently outstrips supply during peak hours (7–9 AM, 5–8 PM weekdays). For each zone and peak-hour slot, compute the unfulfilled request rate = requests with no driver acceptance within 5 minutes / total requests. Return zones where this rate exceeded 15% on more than 3 days in the last 30 days.

Tables:

  • ride_requests(request_id, rider_id, zone_id, request_ts, status)
  • trip_matches(request_id, driver_id, match_ts)
🎯 What the interviewer tests
Can you join request and match tables with a time constraint? The 5-minute acceptance window is the trap β€” you need match_ts - request_ts <= INTERVAL '5 minutes' inside the JOIN condition, not a WHERE clause (otherwise LEFT JOINs lose their unmatched rows). The interviewer also checks whether you correctly filter peak hours using EXTRACT(HOUR FROM request_ts) and exclude weekends.

WITH peak_requests AS (
    SELECT
        r.request_id,
        r.zone_id,
        r.request_ts::date                  AS req_date,
        CASE
            WHEN EXTRACT(HOUR FROM r.request_ts) BETWEEN 7 AND 8  THEN 'morning'
            WHEN EXTRACT(HOUR FROM r.request_ts) BETWEEN 17 AND 19 THEN 'evening'
        END                                 AS peak_slot,
        m.match_ts
    FROM ride_requests r
    LEFT JOIN trip_matches m
        ON m.request_id = r.request_id
       AND m.match_ts - r.request_ts <= INTERVAL '5 minutes'
    WHERE r.request_ts >= CURRENT_DATE - INTERVAL '30 days'
      AND EXTRACT(DOW FROM r.request_ts) BETWEEN 1 AND 5   -- Mon–Fri
      AND (
          EXTRACT(HOUR FROM r.request_ts) BETWEEN 7 AND 8
       OR EXTRACT(HOUR FROM r.request_ts) BETWEEN 17 AND 19
      )
),
daily_zone AS (
    SELECT
        zone_id,
        req_date,
        peak_slot,
        COUNT(*)                                                  AS total_requests,
        SUM(CASE WHEN match_ts IS NULL THEN 1 ELSE 0 END)        AS unfulfilled,
        ROUND(SUM(CASE WHEN match_ts IS NULL THEN 1.0 ELSE 0 END)
              / COUNT(*) * 100, 1)                               AS unfulfilled_rate
    FROM peak_requests
    GROUP BY zone_id, req_date, peak_slot
)
SELECT
    zone_id,
    peak_slot,
    COUNT(*) FILTER (WHERE unfulfilled_rate > 15)    AS high_miss_days,
    ROUND(AVG(unfulfilled_rate), 1)                  AS avg_miss_rate
FROM daily_zone
GROUP BY zone_id, peak_slot
HAVING COUNT(*) FILTER (WHERE unfulfilled_rate > 15) > 3
ORDER BY high_miss_days DESC, avg_miss_rate DESC;
πŸ’‘ Key insight
Moving the 5-minute time filter into the JOIN condition (not WHERE) preserves all requests in the LEFT JOIN β€” unmatched requests get match_ts = NULL, which the SUM(CASE WHEN match_ts IS NULL…) counts as unfulfilled. COUNT(*) FILTER (WHERE …) is a cleaner PostgreSQL idiom than a nested CASE. The HAVING clause is the final precision filter for chronic under-served zones.
πŸ” Common Follow-Up Questions
  • How would you determine if adding more incentive bonuses for drivers in those zones during peak hours actually improved the fulfillment rate? Design the experiment.
  • A zone has 95% fulfillment but riders are cancelling within 2 minutes of match β€” is that a supply problem? How do you diagnose it?


Round 2 β€” Uber Data Scientist Interview Questions: Statistics & A/B Testing (50 mins)

Statistics & Experimentation Round
This round goes deep on your statistical foundations β€” the kind of thinking that prevents Uber from shipping harmful product changes to millions of riders and drivers. You will face questions on experiment design, variance reduction (CUPED), network effects in two-sided marketplaces, and detecting outliers in trip fare distributions. No formulas are given β€” you must explain from first principles.
Questions 4
Duration 50 minutes
Difficulty Hard
Key Topics A/B TestingCUPEDNetwork EffectsHypothesis TestingVariance Reduction
Q4 β€” Surge Pricing A/B Test: Why Random Assignment Fails
Hard⏱ 20 min

Uber wants to test a new surge pricing algorithm. You propose a standard rider-level A/B test: half the riders see the new surge, half see the old one. Your manager says this will produce biased results. Explain why, and propose a valid experiment design.

🎯 What the interviewer tests
This is a classic marketplace experiment trap. The interviewer wants to see if you understand SUTVA violation (Stable Unit Treatment Value Assumption). In a two-sided marketplace, assigning riders randomly means a control rider and a treatment rider compete for the same driver pool. If the treatment raises surge and fewer riders in treatment request, control riders now have MORE drivers available β€” so control performance improves artificially, making the new algorithm look worse than it is.

The core problem is interference between units β€” rides are not independent. When you run a rider-level A/B test for surge pricing:

  • Treatment riders face higher surge β†’ some cancel β†’ more drivers become available
  • Control riders (same city, same time) now get faster pickups and lower fares due to the excess supply they did not “earn”
  • The control group’s metrics are inflated by the treatment, making the new algorithm look worse

Valid design options:

Experiment Design Options for Marketplace Tests
Geographic Holdout
Assign entire cities or hexagons to treatment/control. No cross-contamination. Works best when cities are geographically isolated. Trade-off: city-level confounders (local events, weather) can bias results.
Time-Based Switchback
Alternate entire city between treatment and control every 30–60 minutes. Reduces supply-demand imbalance. Trade-off: carry-over effects if driver/rider behavior adapts within the window.
Driver-Side Randomization
If the change only affects how drivers are matched/priced, randomize at driver level. This avoids contaminating the rider-facing supply pool. Only valid if rider experience is truly independent.
Synthetic Control
Run the new algorithm in one city and model a synthetic control using pre-period data from similar cities. Good for long-horizon tests but requires strong assumptions about market similarity.

Uber typically uses geographic holdout with hexagonal cells (H3 grid) or interleaved switchback tests for pricing experiments because they minimize spillover while preserving statistical power.

πŸ” Common Follow-Up Questions
  • If you use a geographic holdout and the treatment city has a higher baseline cancellation rate, how do you control for that pre-existing difference?
  • How do you compute the minimum detectable effect for a switchback test where the variance per time slot is 3x higher than a standard A/B test?
Q5 β€” CUPED: Reduce Variance in Uber’s ETA Experiment
Hard⏱ 18 min

You are running an A/B test on a new ETA prediction model. Pre-experiment data shows high rider-level variance in trip frequency. Explain how CUPED (Controlled-experiment Using Pre-Experiment Data) works and implement it in Python to reduce variance in the metric trips_per_day.

🎯 What the interviewer tests
CUPED is Uber’s go-to variance reduction technique. The interviewer wants to see if you know that CUPED works by subtracting the pre-experiment covariate after regressing it out, not by simply stratifying. The key formula is Y_cuped = Y - theta * (X - E[X]) where theta = Cov(Y, X) / Var(X). You should be able to explain why this reduces variance: it removes the part of Y explained by the stable pre-period behavior X.

import pandas as pd
import numpy as np
from scipy import stats

# df columns: rider_id, group ('control'/'treatment'),
#             trips_pre (pre-experiment avg trips/day),
#             trips_post (experiment avg trips/day)

def apply_cuped(df, metric_col='trips_post', covariate_col='trips_pre'):
    """Apply CUPED variance reduction to an experiment metric."""
    X = df[covariate_col].values
    Y = df[metric_col].values

    # Estimate theta via OLS: Cov(Y, X) / Var(X)
    theta = np.cov(Y, X, ddof=1)[0, 1] / np.var(X, ddof=1)

    # Compute CUPED-adjusted metric
    df = df.copy()
    df['Y_cuped'] = Y - theta * (X - X.mean())
    return df, theta

df, theta = apply_cuped(df)
print(f"Theta: {theta:.4f}")

# Variance reduction
var_original = df['trips_post'].var()
var_cuped    = df['Y_cuped'].var()
print(f"Variance reduction: {(1 - var_cuped/var_original)*100:.1f}%")

# t-test on CUPED metric
control   = df[df['group'] == 'control']['Y_cuped']
treatment = df[df['group'] == 'treatment']['Y_cuped']
t_stat, p_val = stats.ttest_ind(treatment, control)
print(f"t-stat: {t_stat:.4f}, p-value: {p_val:.4f}")

# Effect size
effect = treatment.mean() - control.mean()
print(f"ATE (CUPED): {effect:.4f} trips/day")
πŸ’‘ Key insight
CUPED reduces variance without introducing bias because E[X - E[X]] = 0 β€” subtracting the covariate adjustment has zero expected effect on the true treatment effect. The variance reduction factor equals the squared correlation between Y and X: if r = 0.7, you reduce variance by 49%, which halves required sample size. At Uber, CUPED on pre-period trips/day typically achieves 30–50% variance reduction, cutting experiment runtime from 3 weeks to under 2.
πŸ” Common Follow-Up Questions
  • What if a rider had no trips in the pre-experiment period? How do you handle the missing covariate β€” impute with 0, drop the row, or use a different baseline?
  • Can you use multiple pre-period covariates (trips, cancellations, app opens) in CUPED? What method handles this?


Round 3 β€” Uber Data Scientist Interview Questions: ML System Design (60 mins)

Machine Learning Design Round
Uber’s ML design round asks you to build an end-to-end ML system β€” from data sourcing to feature engineering to deployment. You are judged on whether you understand the business goal, choose the right model for the constraints, think about real-time vs batch, and define how you will monitor the system post-launch. No coding is expected β€” whiteboard-level reasoning is sufficient.
Questions 2
Duration 60 minutes
Difficulty Hard
Key Topics ETA PredictionChurn PredictionFeature EngineeringModel MonitoringOnline vs Batch
Q6 β€” Design a Driver Churn Prediction System
Hard⏱ 30 min

Uber is losing 18% of drivers annually. Design a driver churn prediction model that identifies drivers likely to stop driving in the next 30 days, so the retention team can intervene with targeted incentives before churn happens.

🎯 What the interviewer tests
This is a classic churn ML design question. The interviewer evaluates: (1) Can you define churn precisely β€” “no trip in 30 days” is ambiguous if the driver is seasonally inactive. (2) Do you think about the right objective β€” optimizing for precision or recall depends on cost of false positives (wasted incentives) vs false negatives (lost drivers). (3) Do you know about data leakage β€” features computed after the target period are not valid.

Step 1 β€” Define churn precisely: A driver is churned if they complete zero trips in the next 30 days AND did not self-deactivate (voluntary exit) AND their account is not suspended. We exclude seasonal drivers (those with known seasonal patterns in the historical data) from the model.

Step 2 β€” Feature engineering (as of prediction date T):

  • Recency features: Days since last trip, days since last login, days since last surge earnings
  • Frequency features: Trips per week (last 4 weeks), weekly active hours, weekend vs weekday ratio
  • Earnings features: 30-day earnings trend (slope), earnings vs city median ratio, tip rate
  • Experience features: Days since onboarding, total lifetime trips, rating trend (last 20 trips)
  • Behavioral signals: App opens without trips (frustration signal), cancellation rate last 14 days, support tickets last 30 days

Step 3 β€” Model choice: XGBoost or LightGBM (handles mixed feature types, missing values naturally, fast training). Binary classification with probability output so the retention team can tier drivers by risk score (high/medium/low risk buckets).

Step 4 β€” Training setup:

  • Training window: 12 months of historical data, sliding windows to avoid temporal leakage
  • Label: 1 if driver churned 30 days after feature snapshot, 0 otherwise
  • Class imbalance: ~82% non-churners β†’ use class weights or SMOTE, optimize for F1 at business-relevant threshold

Step 5 β€” Deployment & monitoring: Batch scoring daily (not real-time β€” churn is a slow process). Monitor: feature drift via PSI (Population Stability Index), label drift via weekly churn rate, model calibration via expected vs actual churn rate per score decile.

πŸ’‘ Key insight
The business impact framing matters: if the incentive budget is β‚Ή500/driver and lifetime value of a retained driver is β‚Ή20,000, you can afford to intervene if model precision β‰₯ 2.5%. At Uber, precision@30% recall is typically the operating point β€” they prefer not to miss high-risk drivers even at the cost of some wasted incentives. Make sure to mention this trade-off explicitly.
πŸ” Common Follow-Up Questions
  • How do you measure whether the intervention (incentive sent to predicted churners) actually caused drivers to stay, vs they would have stayed anyway?
  • A driver completed 1 trip last month vs 50 trips 6 months ago β€” which feature captures this better: recency or a time-weighted frequency score?


πŸ“š Recommended Resources to Crack Uber DS Interview

Used by 40,000+ analysts & data scientists to crack product company interviews

⭐ Bestseller

2200 Most Asked Analytics Interview Questions
  • 2,200+ questions Β· 23 topics
  • ML, Statistics, SQL, Python, A/B Testing
  • Scenario-based questions with detailed answers
For: anyone targeting DS roles at product companies
β‚Ή1,999
β‚Ή7,999

Buy Now

πŸ”₯ SQL Focused

Ace Any SQL Interview
  • 220+ questions Β· SQL advanced, data models, DBMS
  • Window functions, CTEs, query optimization
  • Covers Uber-style marketplace query patterns
For: anyone who wants to crack any SQL round
β‚Ή799

Buy Now

4-Week Prep Plan: Crack Uber Data Scientist Interview

WEEK 1

Build Foundation β€” SQL, Python, Statistics Basics

Master window functions, CTEs, and marketplace query patterns (cohort retention, funnel analysis, utilization metrics). Refresh statistics: hypothesis testing, p-values, confidence intervals, CLT. Study Uber’s marketplace model β€” demand/supply balance, surge pricing mechanics, driver incentive structure. Read TDM’s guide on A/B testing for two-sided marketplaces.

WEEK 2

2200 Most Asked Analytics Interview Questions β€” DS & ML Sections

Cover SQL, Python, case study, statistics, and ML sections of the 2200 questions ebook. Focus on the experiment design and A/B testing chapters β€” these are Uber’s differentiation from standard DA roles. Aim for 100+ questions per day. End of Week 2: First mock interview β†’

WEEK 3

Ace Any SQL Interview β€” Full Coverage + ML Design Practice

Complete the Ace Any SQL Interview ebook cover to cover β€” 220+ questions. Practice 2 ML system design questions per day (churn, ETA, pricing models). For each, walk through: problem definition β†’ data β†’ features β†’ model β†’ evaluation β†’ deployment β†’ monitoring. End of Week 3: Second mock interview β†’

WEEK 4

Timed Practice + Revision

Revise first 5 chapters of 2200 ebook. Timed practice: 15 min per SQL, 20 min per case study, 25 min per ML design. Simulate real Uber interview pacing β€” they move fast. End of Week 4: Final mock interview β†’

πŸ“… Book 1:1 mock interview β€” get feedback from a practitioner who’s interviewed at product companies: Book Mock Interview
~950 questions + 3 mock interviews = enough to crack any DS role at a product company. Or take the mentorship program (packages below).

Scroll to Top