If you are preparing for a Flipkart Business Analyst role, you need to be ready for one of the most rigorous e-commerce analytics interviews in India. The Flipkart Business Analyst Interview Questions you will face span GMV decomposition in SQL, supply chain SLA analysis, loyalty program ROI, and product north-star metric design โ all grounded in Flipkart’s real marketplace context. This guide covers all three rounds with exact questions, complete SQL answers, and the frameworks interviewers score you on.
Case Study
Python
Tableau
Business Strategy
Round 1 โ Flipkart Business Analyst Interview Questions: SQL & Data Analysis (45 mins)
| Questions | 5 |
| Duration | 45 minutes |
| Difficulty | MediumโHard |
| Key Topics | GMV DecompositionWindow FunctionsCohort RetentionFunnel AnalysisSLA Tracking |
The growth team wants to understand what drove the change in GMV between last month and the month before. Decompose the MoM GMV change into: new sellers (joined this month), lost sellers (active last month but zero GMV this month), and existing seller growth (active both months).
Table:
- orders(order_id, seller_id, category, gmv, order_date, status)
status = 'delivered' โ GMV should only count fulfilled orders at Flipkart.WITH curr_month AS (
SELECT
seller_id,
SUM(gmv) AS gmv_curr
FROM orders
WHERE DATE_TRUNC('month', order_date) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
AND status = 'delivered'
GROUP BY seller_id
),
prev_month AS (
SELECT
seller_id,
SUM(gmv) AS gmv_prev
FROM orders
WHERE DATE_TRUNC('month', order_date) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '2 months')
AND status = 'delivered'
GROUP BY seller_id
),
bridge AS (
SELECT
COALESCE(c.seller_id, p.seller_id) AS seller_id,
COALESCE(c.gmv_curr, 0) AS gmv_curr,
COALESCE(p.gmv_prev, 0) AS gmv_prev,
CASE
WHEN p.seller_id IS NULL THEN 'new_seller'
WHEN c.seller_id IS NULL THEN 'lost_seller'
ELSE 'existing_seller'
END AS seller_type
FROM curr_month c
FULL OUTER JOIN prev_month p USING (seller_id)
)
SELECT
seller_type,
COUNT(DISTINCT seller_id) AS seller_count,
SUM(gmv_curr - gmv_prev) AS gmv_contribution
FROM bridge
GROUP BY seller_type
ORDER BY seller_type;
FULL OUTER JOIN is essential โ a regular JOIN would miss lost sellers (present in prev but not curr). COALESCE(..., 0) converts NULLs into zeros so arithmetic works. New seller contribution = gmv_curr - 0 = gmv_curr; lost seller contribution = 0 - gmv_prev (negative, which is correct โ they dragged GMV down). This bridge pattern is the standard way Flipkart’s analytics team reports MoM GMV shifts.- How would you extend this to decompose by category as well as seller type?
- A seller had โน0 GMV in the current month but placed orders โ should they be “lost”? How do you handle returns?
Find sellers who were in the top 10% by GMV in their category for the last quarter, but whose return rate increased by more than 15 percentage points compared to the previous quarter. Flag them as high-risk sellers.
Tables:
- sellers(seller_id, category, onboarded_date)
- orders(order_id, seller_id, gmv, returns, date)
PERCENT_RANK) with conditional aggregation for return rate. Many candidates struggle with the quarter-over-quarter comparison โ they either miss the previous quarter or compute return rate incorrectly. Return rate = SUM(returns) / SUM(gmv) โ using order count would be wrong since GMV varies per order.WITH quarterly_stats AS (
SELECT
o.seller_id,
s.category,
DATE_TRUNC('quarter', o.date) AS quarter_start,
SUM(o.gmv) AS total_gmv,
ROUND(100.0 * SUM(o.returns) / NULLIF(SUM(o.gmv), 0), 2) AS return_rate_pct
FROM orders o
JOIN sellers s ON o.seller_id = s.seller_id
WHERE o.date >= DATE_TRUNC('quarter', CURRENT_DATE - INTERVAL '6 months')
GROUP BY o.seller_id, s.category, DATE_TRUNC('quarter', o.date)
),
ranked AS (
SELECT
*,
PERCENT_RANK() OVER (
PARTITION BY category, quarter_start
ORDER BY total_gmv DESC
) AS gmv_pct_rank
FROM quarterly_stats
),
curr_q AS (
SELECT * FROM ranked
WHERE quarter_start = DATE_TRUNC('quarter', CURRENT_DATE - INTERVAL '3 months')
),
prev_q AS (
SELECT seller_id, return_rate_pct AS prev_return_rate FROM ranked
WHERE quarter_start = DATE_TRUNC('quarter', CURRENT_DATE - INTERVAL '6 months')
)
SELECT
c.seller_id,
c.category,
c.total_gmv,
c.return_rate_pct AS curr_return_rate,
p.prev_return_rate,
c.return_rate_pct - p.prev_return_rate AS return_rate_delta,
'HIGH RISK' AS flag
FROM curr_q c
JOIN prev_q p ON c.seller_id = p.seller_id
WHERE c.gmv_pct_rank <= 0.10
AND (c.return_rate_pct - p.prev_return_rate) > 15
ORDER BY return_rate_delta DESC;
PERCENT_RANK() = 0 is the top seller. So <= 0.10 gives the top 10%. Use NULLIF(SUM(gmv), 0) to prevent division-by-zero for sellers with zero GMV (e.g., full returns quarter). The 15-point threshold is in percentage-point terms โ not relative change โ so you compare curr_rate - prev_rate > 15, not (curr - prev) / prev > 0.15.- How would you adjust this if a seller onboarded mid-quarter โ should they be in the comparison at all?
- What action would you recommend for flagged sellers: suppress listings, notify the account manager, or reduce search rank?
Recommended Resources to Crack Flipkart BA Interview
Two books, ~950 questions โ enough to cover every round
- 2,200+ questions ยท 23 topics
- SQL, Python, case study, statistics
- Covers all top product & e-commerce companies
- 220+ questions ยท SQL advanced, data models, DBMS
- Window functions, CTEs, complex joins
- E-commerce SQL patterns from Flipkart, Amazon, Meesho
Calculate step-by-step funnel conversion rates โ product_view โ add_to_cart โ checkout_start โ payment_done โ and identify the biggest drop-off step per category.
Table:
- events(session_id, user_id, event_type, category, ts) โ event_type โ {product_view, add_to_cart, checkout_start, payment_done}
COUNT(DISTINCT session_id) โ not COUNT(*) โ since users can trigger the same event multiple times in a session. They also test if you can find the biggest drop-off step programmatically using RANK() rather than eyeballing the output.WITH funnel AS (
SELECT
category,
COUNT(DISTINCT CASE WHEN event_type = 'product_view' THEN session_id END) AS step1_view,
COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN session_id END) AS step2_cart,
COUNT(DISTINCT CASE WHEN event_type = 'checkout_start' THEN session_id END) AS step3_checkout,
COUNT(DISTINCT CASE WHEN event_type = 'payment_done' THEN session_id END) AS step4_payment
FROM events
WHERE ts >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY category
),
conversion_rates AS (
SELECT
category,
step1_view,
step2_cart,
step3_checkout,
step4_payment,
ROUND(100.0 * step2_cart / NULLIF(step1_view, 0), 1) AS view_to_cart_pct,
ROUND(100.0 * step3_checkout / NULLIF(step2_cart, 0), 1) AS cart_to_checkout_pct,
ROUND(100.0 * step4_payment / NULLIF(step3_checkout,0), 1) AS checkout_to_payment_pct
FROM funnel
),
drop_offs AS (
SELECT
category,
'view_to_cart' AS step,
(100 - view_to_cart_pct) AS drop_off_pct
FROM conversion_rates
UNION ALL
SELECT category, 'cart_to_checkout',
(100 - cart_to_checkout_pct) AS drop_off_pct
FROM conversion_rates
UNION ALL
SELECT category, 'checkout_to_payment',
(100 - checkout_to_payment_pct) AS drop_off_pct
FROM conversion_rates
),
ranked_drops AS (
SELECT
*,
RANK() OVER (PARTITION BY category ORDER BY drop_off_pct DESC) AS drop_rank
FROM drop_offs
)
SELECT category, step AS biggest_dropoff_step, drop_off_pct
FROM ranked_drops
WHERE drop_rank = 1
ORDER BY drop_off_pct DESC;
COUNT(DISTINCT session_id) within CASE WHEN to count unique sessions at each step โ not raw event counts. The UNION ALL + RANK() pattern lets you find the worst step per category dynamically rather than comparing columns manually. At Flipkart, the cartโcheckout drop-off is typically the biggest โ often due to COD availability or coupon confusion.- How would you isolate whether the drop-off is higher on mobile vs desktop?
- If payment_done events are missing for 5% of sessions due to a tracking bug, how does that affect your funnel?
Build a 6-month cohort retention table showing what percentage of users from each acquisition month made a purchase in months 1, 2, 3, 4, 5, and 6 after their first purchase.
Table:
- orders(order_id, user_id, order_date, gmv)
WITH first_purchase AS (
SELECT
user_id,
DATE_TRUNC('month', MIN(order_date)) AS cohort_month
FROM orders
GROUP BY user_id
),
user_activity AS (
SELECT
fp.user_id,
fp.cohort_month,
DATE_TRUNC('month', o.order_date) AS activity_month,
DATEDIFF('month', fp.cohort_month,
DATE_TRUNC('month', o.order_date)) AS months_since_first
FROM first_purchase fp
JOIN orders o ON fp.user_id = o.user_id
WHERE DATE_TRUNC('month', o.order_date) >= fp.cohort_month
),
cohort_size AS (
SELECT cohort_month, COUNT(DISTINCT user_id) AS total_users
FROM first_purchase
GROUP BY cohort_month
)
SELECT
ua.cohort_month,
cs.total_users,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 0 THEN ua.user_id END) / cs.total_users, 1) AS month_0,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 1 THEN ua.user_id END) / cs.total_users, 1) AS month_1,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 2 THEN ua.user_id END) / cs.total_users, 1) AS month_2,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 3 THEN ua.user_id END) / cs.total_users, 1) AS month_3,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 4 THEN ua.user_id END) / cs.total_users, 1) AS month_4,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 5 THEN ua.user_id END) / cs.total_users, 1) AS month_5,
ROUND(100.0 * COUNT(DISTINCT CASE WHEN months_since_first = 6 THEN ua.user_id END) / cs.total_users, 1) AS month_6
FROM user_activity ua
JOIN cohort_size cs ON ua.cohort_month = cs.cohort_month
GROUP BY ua.cohort_month, cs.total_users
ORDER BY ua.cohort_month;
DATEDIFF('month', cohort_month, activity_month) โ not subtraction of dates โ to correctly count calendar months regardless of day-of-month variation. Flipkart typically expects to see cohort retention drop from ~100% โ 30โ40% at month 1 โ stabilising around 15โ20% by month 3.- How would you modify this to show GMV retention (% of cohort GMV retained) rather than user retention?
- What does a flat retention curve after month 2 indicate about Flipkart’s user base?
Calculate on-time delivery rate by warehouse-destination city pair. Identify pairs where SLA breach rate > 20% AND average delay > 2 days. Rank the top 10 offending pairs.
Table:
- shipments(shipment_id, order_id, warehouse_city, dest_city, promised_delivery_date, actual_delivery_date)
DATEDIFF correctly (positive delay = late), how you define “breach” (actual > promised), and whether you use NULLIF for division safety. The window function ranking at the end shows senior-level thinking โ many candidates just use WHERE and sort without producing a clean rank.WITH sla_calc AS (
SELECT
warehouse_city,
dest_city,
COUNT(*) AS total_shipments,
COUNT(CASE WHEN actual_delivery_date > promised_delivery_date THEN 1 END)
AS breached_count,
ROUND(
100.0 * COUNT(CASE WHEN actual_delivery_date > promised_delivery_date THEN 1 END)
/ NULLIF(COUNT(*), 0), 1
) AS breach_rate_pct,
ROUND(
AVG(CASE
WHEN actual_delivery_date > promised_delivery_date
THEN DATEDIFF('day', promised_delivery_date, actual_delivery_date)
END), 1
) AS avg_delay_days
FROM shipments
WHERE actual_delivery_date IS NOT NULL
GROUP BY warehouse_city, dest_city
),
filtered AS (
SELECT
*,
RANK() OVER (ORDER BY breach_rate_pct DESC, avg_delay_days DESC) AS breach_rank
FROM sla_calc
WHERE breach_rate_pct > 20
AND avg_delay_days > 2
)
SELECT
breach_rank,
warehouse_city,
dest_city,
total_shipments,
breached_count,
breach_rate_pct,
avg_delay_days
FROM filtered
WHERE breach_rank <= 10
ORDER BY breach_rank;
- How would you split the delay analysis by carrier/logistics partner to identify which vendor is responsible?
- Promised delivery dates vary by SKU category โ how would you normalise breach rates across categories?
Round 2 โ Flipkart Business Analyst Interview Questions: Business Case Study (30 mins)
| Questions | 3 |
| Duration | 30 minutes |
| Difficulty | Hard |
| Key Topics | Loyalty Program ROISupply Chain CostCannibalizationHypothesis Testing |
Flipkart Plus is a loyalty program. Plus members have 2.8ร higher GMV but the program’s ROI is declining. Design a full analysis: what metrics to look at, what hypotheses you’d test, and what recommendation you’d make.
- How would you design an A/B test to measure the true causal effect of Plus membership?
- If you cannot run an experiment, what observational method would you use?
- What is the right time window to measure LTV for a Plus member โ 3 months or 12 months?
Last-mile delivery cost has increased 18% YoY while GMV grew only 8%. Walk through your analysis framework: identify root causes using volume/mix/rate decomposition, and propose 3 actionable interventions.
Step 1: Decompose the cost increase
| Effect | Definition | How to Measure |
|---|---|---|
| Volume Effect | More shipments than last year | Orders YoY ร same avg cost per order |
| Mix Effect | Shift to more expensive routes (Tier-2, heavy SKUs) | Reweight last year’s costs at this year’s mix |
| Rate Effect | Unit cost per shipment has gone up | Residual after removing volume + mix effects |
Step 2: Identify hypotheses
- Mix shift to Tier-2/3 cities where per-km cost is higher due to sparse coverage
- Increase in large/heavy category orders (furniture, appliances) with higher weight-based shipping rates
- Vendor rate renegotiation โ logistics partner increased rates post-contract renewal
- Failed delivery attempts rising โ re-delivery costs being double-counted
Step 3: 3 actionable interventions
- Cluster delivery zones: batch deliveries in the same pin code on the same day to reduce per-order last-mile cost by 15โ20%
- Minimum order GMV for same-day delivery: require โน499 minimum for express delivery to shift cost per delivery up-front
- Negotiate multi-vendor logistics contracts: introduce a second logistics partner in high-cost pin codes to create pricing competition
- How would you measure whether the batch delivery intervention actually works โ what is your success metric?
- If you can only implement one intervention, how do you decide which has the highest ROI?
Flipkart launches a grocery vertical (Flipkart Grocery). Define how you’d measure if it’s cannibalizing electronics and fashion categories. What metrics, time windows, and statistical test would you use?
Metrics to track:
- Session share of electronics + fashion (% of total sessions) โ if this drops post-grocery launch, cannibalization may be happening
- GMV per user per category โ are existing users buying less in electronics/fashion while buying grocery?
- Category switch rate: % of users who purchased electronics in the 30 days before grocery launch but shifted spend to grocery in the 30 days after
- Wallet share per user โ if total spend is flat but category mix changed, it is cannibalization; if total spend grew, it is expansion
Time window:
- Pre-period: 60โ90 days before grocery launch
- Post-period: 60โ90 days after (avoid the first 2 weeks โ novelty effect inflates grocery numbers)
Statistical test:
- Difference-in-Differences (DiD): compare GMV change in electronics/fashion for users who adopted grocery vs users who did not. A negative coefficient on the adoption ร post interaction term = cannibalization evidence.
- Segment by user tenure โ heavy buyers are more likely to cannibalise (they have a fixed discretionary budget); new users acquired via grocery are additive.
- How would you design a control group if you cannot run a formal experiment?
- Grocery has much higher order frequency than electronics โ how does that affect your session share metric?
Round 3 โ Product Metrics & Strategy (20 mins)
| Questions | 2 |
| Duration | 20 minutes |
| Difficulty | Medium |
| Key Topics | North Star MetricKPI DesignReal-Time DashboardsAlerting |
What is the single most important metric for Flipkart’s marketplace business? Defend your choice with a framework. What are the leading indicators vs lagging indicators?
North Star Metric recommendation: Monthly Transacting Users ร Average Order Value โ essentially Effective GMV โ but segmented to filter out returns and cancellations.
More precisely: Net GMV per Monthly Active Buyer (NMAB)
- Net GMV = Gross GMV โ returns โ cancellations (measures real economic value)
- Per Monthly Active Buyer (normalises for user base size โ doesn’t inflate as you acquire low-quality users)
| Indicator Type | Metric | Why |
|---|---|---|
| North Star | Net GMV per Monthly Active Buyer | Combines marketplace health with user value |
| Leading โ Demand | Search-to-cart conversion rate | Predicts GMV 7โ14 days out |
| Leading โ Supply | Active seller count (listed in last 30 days) | Supply breadth drives conversion |
| Leading โ Retention | Day-7 repeat purchase rate | Predicts 30-day cohort retention |
| Lagging | Quarterly GMV ยท Annual customer LTV | Outcome metrics โ reflect past decisions |
| Guardrail | Return rate ยท NPS ยท Seller satisfaction score | Ensure growth doesn’t degrade quality |
- How would you set a target for this metric? What growth rate would you aim for YoY?
- Amazon uses “customer obsession” metrics โ how does that differ from GMV-based north stars?
Define 5 KPIs you’d track during Big Billion Days. How would you build a real-time dashboard and what thresholds would trigger an incident alert?
| KPI | Target | Alert Threshold | Why It Matters |
|---|---|---|---|
| GMV per Hour | โฅ โนX crore (based on prior year same hour) | < 80% of prior year same hour | Revenue health check โ catches demand or tech failures |
| Payment Success Rate | โฅ 97% | < 93% for 3+ consecutive minutes | Failed payments are direct lost revenue โ often first signal of gateway issues |
| Out-of-Stock (OOS) Rate | < 5% of top-100 SKUs | > 15% OOS on any top-20 SKU | High OOS destroys trust and deflects to competitors |
| App/Site Error Rate | < 0.5% requests returning 5xx errors | > 1% for 2+ minutes | Infrastructure failure indicator โ escalate to SRE immediately |
| Cart Abandonment Rate | < 65% (typical e-commerce baseline) | > 75% for 15+ consecutive minutes | High abandonment = pricing issue, checkout UX bug, or payment gateway slowdown |
Real-Time Dashboard design:
- Data pipeline: Kafka event streaming โ Apache Flink or Spark Streaming for 30-second aggregations โ write to a fast OLAP store (Druid or ClickHouse)
- Visualization: Grafana or internal BI tool with auto-refresh every 60 seconds; split by hour, category, platform (app vs web vs m-site)
- Alerting: PagerDuty integration โ P0 alerts (payment failure, 5xx spike) wake on-call engineers instantly; P1 alerts (OOS, cart abandonment) notify channel Slack
- Comparison panel: Always show current hour vs same hour last year and vs internal forecast โ anomaly is only meaningful relative to expectation
- If payment success rate drops to 91% at 12 PM, what is your immediate response playbook?
- How do you prevent alert fatigue when 15+ alerts fire simultaneously during a spike event?
4-Week Study Plan for Flipkart Business Analyst Interview
Flipkart tests both SQL depth and business case study breadth. The winning combination is strong e-commerce domain knowledge (GMV, CAC, LTV, supply chain SLA) paired with clean, well-commented SQL using CTEs and window functions.
Build SQL and case study foundation
Master SQL window functions, CTEs, and conditional aggregation. Read The Data Monk’s e-commerce analytics posts โ Flipkart, Meesho, Amazon. Practise the GMV decomposition and funnel analysis patterns from this guide.
2200 Most Asked Analytics Interview Questions ebook
Cover the SQL section, Python section, and all case study chapters. Focus on e-commerce metrics (GMV, cohort, SLA). End of Week 2:
Ace Any SQL Interview ebook
Work through all questions from Ace Any SQL Interview, no shortcuts. Focus on self-joins, CTEs, cohort analysis, and window function chains. End of Week 3:
Revise + timed practice
Revise first 5 chapters of the 2200 ebook. Practice top 5 case study frameworks (Volume/Mix/Rate, DiD, Funnel, Cohort, North Star) timed at 20 minutes per case. End of Week 4: