OYO Rooms Data Analyst INTERVIEW Questions

Company – OYO
Designation – Data Analyst

Year of Experience required – 0 to 4 years
Technical expertise – SQL, Python, Case Study, and Statistics
Salary offered – 10 to 18 LPA (no Stocks, 10% variable) – 80% hike

Number of Rounds – 4

There were 4 to 5 rounds in the recruitment process; the number of rounds depends on the candidate’s performance in the technical rounds.

Round 1 – Written SQL round
Round 2 – SQL (based on the questions asked in the first round)
Round 3 – Project-based questions and statistics (basic)
Round 4 – Case Study
Round 5 – Hiring Manager as well as case study

Round 1 – Written SQL round

There were 5 SQL questions (mostly from Hacker Rank) that needed to be solved in 1 hour.
Question split:-
– 2 easy
– 1 medium
– 2 hard

List of Hacker Rank questions to practice before the interview
– Interview Questions
– 15 Days of Learning SQL

Medium-Level Questions:-
– New companies
– Occupations

For easy questions, concentrate on the basics of rank, lead, lag, and aggregate functions.

Round 2 – SQL Interview

This round was mostly around the written questions asked in the previous round and the approach behind your solutions. You need at least 3 correct answers in the written round to get to this round.

Tips – Concentrate on communicating the approach. The questions in this round are based entirely on the 5 written questions, so you can revise the approach or concepts behind those questions before the second round.

Round 3 – Project Based Questions and Statistics

I had a project in Machine Learning (ARIMA forecasting) so questions were mostly around the problem that we were trying to solve and some statistics concepts:-
– What is the p-value?
– What is correlation? Give an example of a negative correlation
– Complete walk-through of the ARIMA model
– What is multicollinearity?
– Difference between regression and classification model
– What is the degree of freedom?

Questions were mostly based on the type of project that you had written in your resume and the statistics or concepts associated with it.

So, for this round, prepare your project in as much detail as possible.

Round 4 – Case Study

The technical rounds were the most important rounds. If you have performed decently in the first 3 rounds then there is a high chance of converting the role.

Case study I was asked – How can Netflix increase its revenue by 50% in the next couple of years?
It was a mix of guesstimates and business case studies.

So, I started with some approx numbers and their current split.
For Example – Netflix has a total revenue of $100 Million and they are currently in 4 verticals and 10 countries. The current verticals are Hollywood, Bollywood, TV Series, and Other Country shows. The 10 countries are India, the USA, the UK, and 7 more small population countries.
Assumption – India has 60% of the total revenue and 100% of the revenue is coming from Bollywood movies.

After a set of assumptions, we had to discuss the approach. The important points that we discussed were:-
– Moving to or acquiring already-performing OTT platforms or their most-watched series
– Advertisement-to-screen-time ratio: increase either the length of advertisements or their frequency in a show or movie
– Reducing the number of users that can use one subscription in parallel
– Collecting the phone numbers that will be associated with an account at the time a user buys the subscription. This will reduce how often a subscription is shared

There were discussions on each of these points. You just need to bring as many diverse points into the discussion as possible. Do comment your approach in the comment box below.

Round 5 – Hiring Manager Round

This round was mostly around cultural fit wherein the candidate’s previous experience was checked along with the work culture he/she was working in.
But I was asked one more question, i.e. to decide the price of a micro stay in OYO rooms. OYO rooms were moving to a micro stay model where you can book a room for 6-12 hours, so the question was to build a dynamic rate charter for the booking of the room.

My approach was to have a Linear Regression model to get the rate of the room. And the independent variables that I suggested were:-
– Daily price of the room
– Day of booking
– Price of the adjacent rooms
– Time of booking
– Customer Lifetime Value of the customer who is booking the room
– Number of rooms and number of booked rooms for that day
– Holiday season impact

OYO SQL Interview Questions

There were 10+ SQL questions: 6-7 easy/theoretical, a couple of medium problems and 1 hard problem.
The hard problem was picked directly from Hacker Rank, so practice all the problems.
The medium-difficulty problems were like the ones given below:

Question 1: You have data on people who have applied for a lottery ticket. The data consists of their name and ticket number. You have to choose winners by selecting the people present in the alternate rows (the first winner starting from row number 3). Write a query to make it easy to select the winners.

Answer:

select *
from (select name, ROW_NUMBER() over (order by ticket_no) as srNo
from db) t
where (t.srNo % 2) = 1
and t.srNo >= 3
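
The row-numbering idea can be tried out locally. Below is a minimal sketch using Python's built-in sqlite3 module; the table contents are made up for illustration, and it assumes a SQLite build with window function support (3.25 or newer). It keeps the odd row numbers but starts from row 3, as the question asks.

```python
import sqlite3

# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (name TEXT, ticket_no INTEGER)")
conn.executemany(
    "INSERT INTO db VALUES (?, ?)",
    [("Asha", 101), ("Bala", 102), ("Chitra", 103), ("Dev", 104),
     ("Esha", 105), ("Farid", 106), ("Gita", 107)],
)

winners = conn.execute(
    """
    SELECT name, srNo
    FROM (SELECT name,
                 ROW_NUMBER() OVER (ORDER BY ticket_no) AS srNo
          FROM db) AS t
    WHERE t.srNo % 2 = 1   -- alternate rows
      AND t.srNo >= 3      -- first winner is row number 3
    ORDER BY t.srNo
    """
).fetchall()

print(winners)  # [('Chitra', 3), ('Esha', 5), ('Gita', 7)]
```

Without the `srNo >= 3` condition, row 1 would also be returned as a winner.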

Question 2: Find all the students who either are male or live in Mumbai (have Mumbai as a part of their address).
Answer:
Select name
From students
Where lower(gender) in ('male','m')
Or lower(address) like '%mumbai%'
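
Pattern matching on the address needs LIKE; with = the '%' characters are treated as literal text. A small sketch with Python's sqlite3 module shows the difference. The rows are hypothetical and only the students table and its three columns come from the question.

```python
import sqlite3

# Hypothetical rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, gender TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ravi", "Male", "12 MG Road, Pune"),
     ("Neha", "F", "Andheri West, Mumbai"),
     ("Sara", "Female", "Salt Lake, Kolkata")],
)

query_template = """
    SELECT name FROM students
    WHERE lower(gender) IN ('male', 'm')
       OR lower(address) {op} '%mumbai%'
    ORDER BY name
"""

# LIKE treats % as a wildcard; = compares against the literal string.
with_like = [row[0] for row in conn.execute(query_template.format(op="LIKE"))]
with_equals = [row[0] for row in conn.execute(query_template.format(op="="))]

print(with_like)    # ['Neha', 'Ravi']
print(with_equals)  # ['Ravi']
```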

Question 3: Can you join two tables without any common column?
Answer: Yes, we can do a cross join without any common column.
Eg: We have Roll Number, Name of Students in Table A and their Class (let’s say 5th) in Table B.
We will use cross join to append class against each student.

SELECT B.CLASS,A.ID,A.NAME
FROM A, B
WHERE 1=1

Question 4:

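Select case when null=null then 'Amit' else 'Rahul' end from dual. What will be the output of the above query?
Answer: A comparison with NULL using = never evaluates to true. null=null evaluates to unknown (NULL), so the WHEN condition is not satisfied and the ELSE branch is returned. So the output will be 'Rahul'. To test for NULL, use IS NULL instead.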
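
This behaviour is easy to check. The sketch below runs the same CASE expression through Python's sqlite3 module; SQLite has no dual table, so the FROM clause is dropped (an adaptation, not the original Oracle query). In SQL, NULL = NULL evaluates to unknown rather than true, which is why the ELSE branch is taken.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    """
    SELECT CASE WHEN NULL = NULL THEN 'Amit' ELSE 'Rahul' END,
           NULL = NULL,    -- unknown, returned to Python as None
           NULL IS NULL    -- the correct way to test for NULL
    """
).fetchone()

print(row)  # ('Rahul', None, 1)
```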

Question 5: List the different types of relationships in SQL.

There are different types of relations in the database:

  • One-to-One – This is a connection between two tables in which each record in one table corresponds to the maximum of one record in the other.
  • One-to-Many and Many-to-One – This is the most frequent connection, in which a record in one table is linked to several records in another.
  • Many-to-Many – This is used when defining a relationship that requires several instances on each side.
  • Self-Referencing Relationships – When a table has to declare a connection with itself, this is the method to employ.

Question 6: What are the differences between OLTP and OLAP?

Answer: OLTP stands for online transaction processing, whereas OLAP stands for online analytical processing. OLTP is an online database modification system, whereas OLAP is an online database query response system.

Question 7: What is the usage of the NVL() function?

Answer: You may use the NVL function to replace null values with a default value. The function returns the value of the second parameter if the first parameter is null. If the first parameter is anything other than null, it is left alone.
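
NVL() is Oracle-specific, so it cannot be run in SQLite directly. As a rough sketch of the same behaviour, the example below uses SQLite's IFNULL(), which also returns its second argument when the first is NULL. The bookings table and its values are hypothetical.

```python
import sqlite3

# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (guest TEXT, coupon TEXT)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?)",
    [("Amit", "OYO50"), ("Rahul", None)],
)

# IFNULL(x, default) plays the role of Oracle's NVL(x, default) here.
rows = conn.execute(
    "SELECT guest, IFNULL(coupon, 'NONE') FROM bookings ORDER BY guest"
).fetchall()

print(rows)  # [('Amit', 'OYO50'), ('Rahul', 'NONE')]
```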

OYO Case Study Questions

Case Study – Suggest as many important KPIs as possible that you would want to put on the CXO’s dashboard

Following were the suggested KPIs

  • Average Daily Rate (ADR)
  • Occupancy rate
  • Revenue per Available Room (RevPAR)
  • Gross Operating Profit per Available Room (GOPPAR)
  • Average Length of Stay (ALOS)
  • Customer Acquisition Cost (CAC)
  • Customer Lifetime Value (CLV)
  • Net Promoter Score (NPS)
  • Online Reputation Score (ORS)
  • Room Revenue Contribution by Channel
  • Website Conversion Rate
  • Direct Booking Ratio
  • Repeat Guest Ratio
  • Housekeeping Productivity Ratio
  • Employee Turnover Rate
  • Revenue per Employee (RPE)
  • Cost per Occupied Room (CPOR)
  • Cost per Available Room (CPAR)
  • Total Revenue by Property
  • Total Expenses by Property
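
To make the first three KPIs concrete, here is a minimal sketch using their commonly quoted hotel-industry definitions (ADR = room revenue / rooms sold, occupancy = rooms sold / rooms available, RevPAR = room revenue / rooms available). The function name and the input figures are hypothetical.

```python
def hotel_kpis(room_revenue, rooms_sold, rooms_available):
    """Return ADR, occupancy rate and RevPAR for one period."""
    adr = room_revenue / rooms_sold            # Average Daily Rate
    occupancy = rooms_sold / rooms_available   # Occupancy rate
    revpar = room_revenue / rooms_available    # Revenue per Available Room
    return {"ADR": adr, "occupancy": occupancy, "RevPAR": revpar}


# Hypothetical numbers: 100 rooms, 80 sold, 2,40,000 in room revenue.
kpis = hotel_kpis(room_revenue=240000.0, rooms_sold=80, rooms_available=100)
print(kpis)  # {'ADR': 3000.0, 'occupancy': 0.8, 'RevPAR': 2400.0}
```

With these definitions RevPAR is also equal to ADR multiplied by the occupancy rate, which is a handy cross-check in an interview.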

OYO Statistics and Python Interview Questions

I had a couple of projects on Machine Learning, so a few questions were asked on statistics

1. What is Skewness?

  • Skewness is a measure of the asymmetry of a distribution. This value can be positive or negative.
  • A negative skew indicates that the tail is on the left side of the distribution, which extends towards more negative values.
  • A positive skew indicates that the tail is on the right side of the distribution, which extends towards more positive values.
  • A value of zero indicates that there is no skewness in the distribution at all, meaning the distribution is perfectly symmetrical.

2. What is Kurtosis? 

Kurtosis is a measure of whether a distribution is heavy-tailed or light-tailed relative to a normal distribution.

• The kurtosis of a normal distribution is 3.

• If a given distribution has a kurtosis less than 3, it is said to be platykurtic, which means it tends to produce fewer and less extreme outliers than the normal distribution.

• If a given distribution has a kurtosis greater than 3, it is said to be leptokurtic, which means it tends to produce more outliers than the normal distribution.
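
Both measures can be computed from central moments. The sketch below uses the simple population (moment-based) formulas, skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2; statistical packages often apply sample-size corrections, so their numbers can differ slightly. The helper names and data are illustrative.

```python
from statistics import fmean


def central_moment(values, k):
    """k-th central moment: mean of (x - mean)^k."""
    mean = fmean(values)
    return fmean([(x - mean) ** k for x in values])


def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    # Non-excess kurtosis, so a normal distribution would score 3.
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [1, 2, 3, 4, 5]        # no tail on either side
right_tailed = [1, 1, 1, 2, 10]    # one large value pulls the tail right

print(skewness(symmetric))         # 0.0
print(skewness(right_tailed) > 0)  # True (positive skew)
print(kurtosis(symmetric))         # about 1.7, i.e. platykurtic (< 3)
```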

3. How are covariance and correlation different from one another?

Covariance measures how two variables vary together. If the value is positive, the variables tend to move in the same direction: one tends to increase when the other increases and decrease when the other decreases. If it is negative, they tend to move in opposite directions. Its magnitude depends on the units of the variables, so it is hard to compare across datasets.

Correlation is the covariance scaled by the standard deviations of the two variables, so it is unit-free and can take any value between -1 and 1.

1 denotes a perfect positive linear relationship, -1 denotes a perfect negative linear relationship, and 0 denotes no linear relationship (which does not by itself mean that the two variables are independent).
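
A short sketch makes the difference visible: rescaling one variable changes the covariance but not the correlation. The price and demand figures below are made up, and they also serve as an example of a negative correlation (higher price, fewer units sold). Population formulas are used.

```python
from math import sqrt
from statistics import fmean


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


# Hypothetical data: room price vs. units sold.
price = [100, 120, 140, 160, 180]
units_sold = [50, 45, 35, 30, 20]
price_in_paise = [p * 100 for p in price]  # same prices, different unit

print(covariance(price, units_sold))            # -300.0
print(covariance(price_in_paise, units_sold))   # -30000.0 (unit-dependent)
print(round(correlation(price, units_sold), 3))           # -0.993
print(round(correlation(price_in_paise, units_sold), 3))  # -0.993 (unchanged)
```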

4. What is Multicollinearity?

Multicollinearity occurs when two or more independent variables are highly correlated with one another in a regression model. This means that an independent variable can be predicted from another independent variable in a regression model.

5. What is VIF?

Variance inflation factor (VIF) is a measure of the amount of multicollinearity in a set of multiple regression variables. In general, a VIF above 5 indicates high correlation and is cause for concern. Some authors suggest a more conservative level of 2.5 or above; it depends on the situation.

6. What is a confusion matrix and why do you need it?

A confusion matrix is a table that is frequently used to illustrate the performance of a classification model (classifier) on a set of test data for which the true values are known. It summarizes the predictions against the actual classes, so we can easily see which classes the model confuses with one another, and it is the basis for performance measures such as accuracy, precision and recall.
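
As a minimal sketch, the four cells of a binary confusion matrix can be counted with the standard library alone. The labels and predictions below are invented for illustration, with "spam" treated as the positive class.

```python
from collections import Counter

# Hypothetical true labels and model predictions.
actual    = ["spam", "spam", "ham", "ham",  "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham",  "ham", "spam", "spam", "ham", "ham", "spam"]

# Count every (actual, predicted) pair.
counts = Counter(zip(actual, predicted))
tp = counts[("spam", "spam")]  # true positives
fn = counts[("spam", "ham")]   # false negatives
fp = counts[("ham", "spam")]   # false positives
tn = counts[("ham", "ham")]    # true negatives

accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, fn, fp, tn)               # 3 1 1 3
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```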

7. What do you mean when you say “Strings are immutable”?

Strings in Python are immutable, i.e. you cannot change a string once it has been defined.

You cannot change a part of the string in place; any "modification" creates a new string object.

8. Are lists mutable?
Lists are mutable, i.e. you can change the values already present in the list.
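
The two answers above can be checked in a few lines. This is only an illustrative sketch with made-up values: item assignment on a string raises TypeError, while the same operation on a list succeeds.

```python
city = "Mumbai"
try:
    city[0] = "D"          # strings do not support item assignment
    string_changed = True
except TypeError:
    string_changed = False

# Building a new string is fine; the original is left untouched.
new_city = "D" + city[1:]

rooms = [101, 102, 103]
rooms[0] = 201             # lists can be changed in place
rooms.append(104)

print(string_changed, city, new_city)  # False Mumbai Dumbai
print(rooms)                           # [201, 102, 103, 104]
```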

9. Is a dictionary zero-indexed? Can we pull something like Team[0] from a dictionary named Team?

The whole purpose of having a dictionary is that you can have your own index, i.e. the key. So, to answer the question, a dictionary is not zero-indexed.

You cannot use positional indexing. For example, you cannot use Team[0] to pull the first value, because you have already specified a key for every value (Team[0] works only if 0 itself is one of the keys).
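
A quick sketch of this behaviour, using a hypothetical team dictionary (the keys and names are invented): looking up 0 raises KeyError because 0 is not a key, while values are reached through their keys.

```python
# Hypothetical dictionary, for illustration only.
team = {"captain": "Rohit", "keeper": "Rishabh"}

try:
    team[0]                # 0 is not a key, so this is not "the first item"
    lookup_failed = False
except KeyError:
    lookup_failed = True

captain = team["captain"]  # values are pulled by key

# If the first inserted value is really needed, iterate over the values.
first_value = next(iter(team.values()))

print(lookup_failed, captain, first_value)  # True Rohit Rohit
```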

10. What is the function range()?

range(10) will get you the numbers from 0 to 9. But range() returns a lazy range object, so to see the numbers you need to put them in some data type. Suppose you want to put them in a list; then use list(range(10)).
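
For example, a range can be materialised into a list, or used with explicit start, stop and step arguments:

```python
numbers = range(10)            # lazy range object, no list built yet
as_list = list(numbers)        # materialise it as a list
evens = list(range(0, 10, 2))  # start, stop (exclusive), step

print(as_list)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(evens)    # [0, 2, 4, 6, 8]
```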

There were 2-3 more questions on Python, mostly around for loops and pattern printing.

The Data Monk services

We are well known for our interview books and have 70+ e-books across Amazon and The Data Monk e-shop page. The following are the best-seller combo packs and services that we are providing as of now

  1. YouTube channel covering all the interview-related important topics in SQL, Python, MS Excel, Machine Learning Algorithm, Statistics, and Direct Interview Questions
    Link – The Data Monk Youtube Channel
  2. Website – ~2000 completed solved Interview questions in SQL, Python, ML, and Case Study
    Link – The Data Monk website
  3. E-book shop – We have 70+ e-books available on our website and 3 bundles covering 2000+ solved interview questions. Do check it out
    Link – The Data E-shop Page
  4. Instagram Page – It covers only Most asked Questions and concepts (100+ posts). We have 100+ most asked interview topics explained in simple terms
    Link – The Data Monk Instagram page
  5. Mock Interviews/Career Guidance/Mentorship/Resume Making
    Book a slot on Top Mate

The Data Monk e-books

We know that each domain requires a different type of preparation, so we have divided our books in the same way:

1. 2200 Interview Questions to become Full Stack Analytics Professional – 2200 Most Asked Interview Questions
2.Data Scientist and Machine Learning Engineer -> 23 e-books covering all the ML Algorithms Interview Questions
3. 30 Days Analytics Course – Most Asked Interview Questions from 30 crucial topics

You can check out all the other e-books on our e-shop page – Do not miss it


For any information related to courses or e-books, please send an email to nitinkamal132@gmail.com

MYNTRA Data Analyst INTERVIEW Questions


Myntra, India’s leading e-commerce company, is dedicated to making fashion and lifestyle products accessible to everyone. It creates solutions that disrupt the ordinary and help make the world a happier and more fashionable place.

A company that is constantly evolving into newer and better forms, Myntra looks for individuals who are prepared to evolve with it. From its origins as a customization company in 2007 to being at the forefront of technology and fashion today, Myntra is going places, and individuals are encouraged to join this journey.

Skills Required

Apart from qualifications, the essential skills required for a Data Analyst position at Myntra include:

  1. Mastery of database fundamentals with a demonstrated ability to translate diverse business requirements into effective SQL queries.
  2. Proficient skills in Excel and PowerBI at an advanced level.
  3. Valued hands-on experience in R, Python, Tableau, Qlikview, and Data Studio, especially in roles related to customer growth or customer analytics.
  4. Demonstrated adaptability and the ability to excel in a dynamic and fast-paced work environment.
  5. A collaborative team player who is comfortable engaging with individuals from various professional backgrounds.

Interview Process

The Myntra interview process comprises the following stages:

  1. Application and Resume Screening: Applicants submit their online applications, and HR or recruiters review them to confirm qualifications and experience.
  2. Technical Assessment: Candidates undergo a technical assessment, which includes exercises in data analysis and SQL to assess their technical proficiency.
  3. Technical Interviews: Shortlisted candidates participate in technical interviews, where their experience, problem-solving skills, and proficiency in tools such as Excel, PowerBI, R, and Python are evaluated by experienced data professionals.
  4. Case Study/Scenario-Based Interviews: Some candidates are presented with a real-world data analysis problem or scenario. They are then asked to articulate their approach and methodology for solving it during the interview.
  5. Final Round Interviews: In certain instances, there may be a final round of interviews with senior team members or management to assess a candidate’s strategic thinking and alignment with the company’s goals.

Questions Asked

  1. Create a pivot table and sort the data in ascending order.
  2. Use a lookup on the given product data to find the required values.
  3. Write SQL queries to perform operations such as joining, filtering, and aggregating data from multiple tables.
  4. Describe your approach to utilizing data analysis for resolving business challenges and how you communicate your findings through the use of data visualization tools.
  5. Which statistical methods and tools do you use in your data analysis practices?
  6. Case Study Question: How many cars are sold in your city in a month?

WALMART Data Analyst INTERVIEW Questions

Walmart stands as one of the world’s leading discount department store chains, boasting a global presence with thousands of stores that provide a diverse array of products at budget-friendly prices. The company offers competitive salaries, attractive incentives like stock options and 401(k) matching, and the opportunity to tackle intriguing business challenges. With Walmart’s strategic emphasis on boosting online sales while maintaining its commitment to affordable pricing, the demand for data analysts has surged. These professionals play a crucial role in optimizing pricing strategies, enhancing operations and supply chain efficiency, establishing robust data architecture, and monitoring key success metrics. In this comprehensive interview guide, we will navigate you through the Walmart data analyst interview process, explore important questions, and provide valuable tips to help you secure your ideal position with the retail giant.

Nature of Questions Asked in Walmart Data Analyst Interviews

Walmart Data Analyst interviews are tailored to assess a combination of problem-solving abilities, critical thinking skills, and proficiency in essential technologies such as SQL and reporting tools. Familiarity with machine learning, statistics, and coding in languages like Python or R is essential, and experience with big data technologies is considered advantageous.

It’s crucial to align your preparation with the specific role you’re applying for, whether it’s related to product analysis, risk assessment, or staff analytics. The advertised position may require expertise in building data architecture, analyzing user behavior, or managing information security. For example, if the role is within the transportation analytics team, understanding business operations and solving supply chain case study problems should be part of your interview preparation.

A valuable tip is to thoroughly read the job description, gaining insights into your daily responsibilities, the tools you’ll be using, and the specific business challenges the team aims to address. This understanding will guide your interview strategy effectively. Additionally, Walmart provides a helpful guide on their careers page to assist candidates in excelling during the interview process.

Data Analyst Interview Process

The Walmart Data Analyst interview process is structured to assess candidates’ technical proficiency, critical thinking skills, and alignment with the company culture. The key stages include:

  1. Preliminary Screening: Initiated by a recruiter, this step aims to understand the candidate’s background and potential fit for the role. It’s an opportunity for candidates to inquire about the position and strategically highlight their skills.
  2. Technical Interviews: Following the screening, candidates undergo technical rounds via phone or video calls. Questions may cover SQL, Excel, Tableau, and include behavioral and case study inquiries. The focus is on evaluating both technical competence and problem-solving abilities.
  3. Onsite Interview: Successful candidates from the technical interviews proceed to onsite interviews, typically with a panel from the intended team. This stage combines technical and behavioral questions, allowing the team to assess the candidate’s suitability for the specific role.

It’s important to note that while the overall interview process follows this general format, the questions asked are tailored to the specific role and team. The list of popular analyst questions provided below is derived from actual Walmart interviews and similar roles and companies. For additional preparation, candidates can explore a comprehensive collection of interview questions.

Behavioral Questions

During Walmart interviews, expect to encounter several behavioral questions designed to evaluate your soft skills, gauge your future performance, and assess your ability to collaborate and adapt to dynamic situations.

  1. What draws you to our organization and why do you want to work with us?
  2. Share an instance where you went above and beyond expectations in a project.
  3. Describe your approach to resolving conflicts within a team.
  4. How do you manage and prioritize multiple deadlines effectively?

SQL Interview Questions

SQL proficiency is a crucial requirement for the Walmart data analyst role, so thorough preparation for these questions is essential.

  • Create a SQL query to fetch the latest transaction for each day from a bank transactions table, which includes columns such as id, transaction_value, and created_at representing the date and time for each transaction. Ensure the output contains the ID of the transaction, the transaction datetime, and the transaction amount, with transactions ordered by datetime.
  • Develop a SQL query to assess user ordering patterns between their primary address and other addresses. Provide a solution based on tables containing transaction and user data.
  • As the accountant for a local grocery store, you’re assigned the responsibility of determining the cumulative sales amount for each product since its last restocking. Utilizing three tables – products, sales, and restocking – where products provide information about each item, sales document sales transactions, and restocking tracks restocking events, compose a SQL query to fetch the running total of sales for each product since its most recent restocking event.
  • Formulate a SQL query to pinpoint customers who conducted more than three transactions in both the years 2019 and 2020. Emphasize the logical condition: Customer transactions > 3 in 2019 AND Customer transactions > 3 in 2020.
  • Write a SQL query to retrieve neighborhoods with zero users based on two provided tables: one containing user demographic information, including the neighborhood they reside in, and another dedicated to neighborhoods. The goal is to identify and return all neighborhoods that currently have no users.
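
As one possible approach to the first question in this list (latest transaction for each day), the sketch below finds the maximum created_at per calendar day and joins back to the table. The table name bank_transactions and the sample rows are assumptions; the three column names come from the question. It runs on Python's built-in sqlite3 module and assumes timestamps are unique.

```python
import sqlite3

# Table name and rows are hypothetical; column names are from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bank_transactions "
    "(id INTEGER, transaction_value REAL, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO bank_transactions VALUES (?, ?, ?)",
    [(1, 50.0, "2024-01-01 09:15:00"),
     (2, 75.5, "2024-01-01 18:40:00"),
     (3, 20.0, "2024-01-02 11:05:00"),
     (4, 99.9, "2024-01-03 08:00:00"),
     (5, 10.0, "2024-01-03 21:30:00")],
)

latest_per_day = conn.execute(
    """
    SELECT t.id, t.created_at, t.transaction_value
    FROM bank_transactions AS t
    JOIN (SELECT MAX(created_at) AS last_ts   -- last timestamp of each day
          FROM bank_transactions
          GROUP BY date(created_at)) AS d
      ON t.created_at = d.last_ts
    ORDER BY t.created_at
    """
).fetchall()

for row in latest_per_day:
    print(row)
# (2, '2024-01-01 18:40:00', 75.5)
# (3, '2024-01-02 11:05:00', 20.0)
# (5, '2024-01-03 21:30:00', 10.0)
```

A window-function version (ROW_NUMBER() partitioned by day, ordered by created_at descending) is an equally common answer.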

Coding Questions

  1. Explain the implementation of k-Means clustering using Python
  2. Provide a comprehensive guide on constructing a logistic regression model in Python.
  3. Describe the process of reconstructing a user’s flight journey.
  4. Create a function to extract high-value transactions from two provided dataframes: transactions and products. The transactions dataframe includes transaction IDs, product IDs, and the total amount of each product sold, while the product dataframe contains product IDs and corresponding prices. The objective is to generate a new dataframe containing transactions with a total value surpassing $100, and to include the calculated total value as a new column in the resulting dataframe.
  5. Describe the approach to identify the longest substring within a given string that exhibits maximal length.

Case Study Interview Questions

  1. Outline the process for forecasting revenue for the upcoming year.
  2. Describe the steps you would take to address the issue of underpricing for a product on an e-commerce site.
  3. Which key performance indicators (KPIs) would you monitor in a direct-to-consumer (D2C) e-commerce company?
  4. Outline the process of architecting end-to-end infrastructure for an e-commerce company.
  5. What approach would you take to identify the most profitable products for a Black Friday sale, optimizing for maximum profit?

Statistics and Probability Interview Questions

Walmart data analysts frequently engage in quantitative tasks such as statistical modeling, sampling, and extensive analysis of datasets, charts, and model metrics. Possessing robust quantitative skills, especially in statistics and probability, is crucial for excelling in these responsibilities.

  1. Walmart aims to assess customer satisfaction with a recently introduced in-store service. Outline your approach to crafting a survey that ensures a representative sample of customers. Additionally, explain the choice of sampling techniques and their rationale.
  2. What is the drawback of the R-squared (R^2) method when analyzing the fit of a model that aims to establish a relationship between two variables? Discuss the limitations of the R-squared metric, situations in which it is appropriate, and propose alternative strategies. Support your response with examples.
  3. Walmart is interested in examining whether there is a substantial disparity in customer spending between weekdays and weekends. Describe the statistical test you would employ for this analysis and elucidate your approach to interpreting the outcomes.
  4. Outline strategies to minimize the margin of error in a study with an initial sample size of n, where the current margin of error is 3. If the goal is to reduce the margin of error to 0.3, discuss the additional samples required for this reduction. Emphasize the importance of seeking clarifications about the business context and explicitly state any assumptions made, as deviations can impact the margin of error.
  5. Elaborate on the distinctions between a normal distribution and a binomial distribution. Offer instances where each distribution is relevant within a retail context, illustrating their applicability.
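
For the margin-of-error question (item 4), a rough sketch of the arithmetic is shown below. It assumes simple random sampling with the same confidence level and variability, so that the margin of error is proportional to 1/sqrt(n); the function name and the starting sample size of 400 are hypothetical.

```python
import math


def required_sample_size(n_current, moe_current, moe_target):
    """Sample size needed if margin of error scales as 1 / sqrt(n)."""
    factor = (moe_current / moe_target) ** 2
    # round() guards against tiny floating-point noise before ceil().
    return math.ceil(round(n_current * factor, 6))


n = 400  # hypothetical initial sample size
total_needed = required_sample_size(n, moe_current=3.0, moe_target=0.3)
additional = total_needed - n

print(total_needed)  # 40000, i.e. 100 * n
print(additional)    # 39600, i.e. 99 * n
```

Cutting the margin of error by a factor of 10 therefore needs 100 times the original sample, which is about 99n additional observations under these assumptions.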

Guesstimate 9 : The number of people wearing watches in Bangalore

Ans: Let’s assume the population of Bangalore is 10 million and that today is a working day for every age group

And age group wise population assumption would be

0–15 yrs: 30%; 15–25 yrs: 20%; 25–50 yrs: 30%; >50 yrs: 20%

Income wise population would be

Above poverty line: 75%

People below the poverty line have a negligible chance of using a watch, so I am eliminating this 25% of the population from every age group.

Now we have 75% of the aforesaid percentage in every age group. The 0–15 group is mostly school children and infants, so I assume 5% of them are wearing watches today. So the count would be 2.25 million*5%=112500

The 15–25 age group is mostly higher-education students who spend most of their time using mobile phones, so people using watches may be nearly 25%, which gives 1.5 million*25%= 375000

The 25–50 group has working professionals whose generation only recently entered the smartphone culture, so most of them use watches. If 90% of this group are couples, we will have approximately 20% housewives in it. As there is less chance of housewives wearing watches, we can neglect them. Of the remaining 80% of the population, say 90% wear watches; then the count would be (80%*2.25 million)*90%=1620000

Coming to the >50 group, say 25% are working somewhere, with 90% of them wearing watches again, and of the remaining 75% let 20% wear watches, as we see many of our grandfathers wearing watches even at home. So the count would be (25%*1.5 million)*90%+(75%*1.5 million)*20%= 562500

Total = 112500 + 375000 + 1620000 + 562500 = 2670000
= 2.67 million people wear watches
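
The arithmetic of this guesstimate can be re-run in a few lines. The sketch below only encodes the assumptions stated in the answer (population, age split, poverty line, and watch-wearing rates); the variable names are my own.

```python
population = 10_000_000
above_poverty = 0.75
age_share = {"0-15": 0.30, "15-25": 0.20, "25-50": 0.30, "50+": 0.20}

# People above the poverty line in each age group.
base = {group: population * share * above_poverty
        for group, share in age_share.items()}

wearers = {
    "0-15": base["0-15"] * 0.05,
    "15-25": base["15-25"] * 0.25,
    "25-50": base["25-50"] * 0.80 * 0.90,        # 20% housewives excluded
    "50+": base["50+"] * (0.25 * 0.90 + 0.75 * 0.20),
}

total = round(sum(wearers.values()))
print({group: round(count) for group, count in wearers.items()})
# {'0-15': 112500, '15-25': 375000, '25-50': 1620000, '50+': 562500}
print(total)  # 2670000
```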

Guesstimate 8 : How many red colour Swift cars are there in Delhi ?

Let’s start with the population of Delhi which is 2 Crores.

We will divide this population into two groups-

1. Family(80%) = 0.8*20000000 = 16000000 family members

2. Bachelors(Individuals) (20%) = 0.2*20000000 = 4000000

Number of families(assuming 4 members in each) = 16000000/4 = 4000000.

Guessing that 50% of families have cars, so the number of families with cars = 2000000.

In Delhi, we can assume that 25% of the families belong to high class society so they can afford 2 cars on an average and the rest can afford only one car.

Therefore number of cars with families 
= 0.25*2000000*2 + 0.75*2000000*1
= 2500000.

Now let’s say only 10% of the individual population can afford a single car.

Therefore, number of cars with individuals 
= 0.10*4000000
= 400000.
So the total number of cars in Delhi can be estimated as 
2500000+400000 = 2900000
which can be rounded of to 3000000 for simple calculations.

Since Maruti being the Indian market leaders in car sector so we can safely assume 50% of cars on the roads of Delhi are of Maruti, i.e., 1500000 cars.

Swift is one of the most common and affordable models along with Alto, WagonR, Omni and 800. So let’s assume there are 200000 Swifts.

White, silver, grey and black are the most common colors, so approximately 75% of Swifts will be of these colors.

Now we are left with 50000 Swifts of different colors. Considering red, yellow, blue, maroon and orange as the possible other colors, we can guess that there are nearly 10000 red Swifts in Delhi.
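
The last few steps, from total cars down to red Swifts, can be sketched the same way. The names below are hypothetical. The 200000 Swifts figure is taken directly from the text rather than derived from a percentage, and the five "other" colors are assumed to be equally common.

```python
def estimate_red_swifts(total_cars: int = 3_000_000) -> dict:
    """Funnel from total cars to red Swifts, using the text's assumptions."""
    maruti_cars = total_cars * 50 // 100        # Maruti has 50% of cars
    swifts = 200_000                            # assumed directly in the text
    other_color_swifts = swifts * 25 // 100     # 75% are white/silver/grey/black
    other_colors = ["red", "yellow", "blue", "maroon", "orange"]
    red_swifts = other_color_swifts // len(other_colors)  # equal split (assumption)
    return {
        "maruti_cars": maruti_cars,
        "swifts": swifts,
        "other_color_swifts": other_color_swifts,
        "red_swifts": red_swifts,
    }


if __name__ == "__main__":
    print(estimate_red_swifts())
```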

Keep Learning 🙂

The Data Monk

Guesstimate 7 : How to estimate the number of ambulances on the road?

Ans: Let’s start with the population of the country ~ 1.3 Billion (1300 million)

Rural (70%) = ~900 million, Urban (30%) = ~400 million

Let’s divide the Urban population into three groups based on income level – Low |High |Upper High

I would divide it as: Low - 30%, High - 50%, Upper High - 20%
So Urban Low: 120 million, High: 200 million, Upper High: 80 million
Now, out of Urban Low, I would assume 10% have driving as their occupation (given the rise of taxi services etc.; considering only 4-wheeler drivers)
= 12 million (1,20,00,000)

Out of these I would assume 1–1.5% are ambulance drivers; taking 1% gives 120000

So we can say almost 1,20,000 ambulances are available across urban India at any time.

Now, ambulances are generally available on call, which means at least half of them are always on standby. Let’s assume that 70% of them are idle at any given point of time.

Also, since their average journey time is only 20–25 minutes, we can say that not more than 10% of ambulances would be on the road at any time.

So 10% are on the road at any time = 12000 ambulances on the road across urban India.
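
The chain of percentages above can be written as a small Python sketch. The function name is illustrative, the urban population is the rounded 400 million from the text, and the ambulance-driver share uses the lower end (1%) of the 1–1.5% range.

```python
def ambulances_on_road(urban_population: int = 400_000_000) -> tuple:
    """Return (total ambulances, ambulances on the road) for urban India."""
    urban_low = urban_population * 30 // 100   # low-income group: 30%
    drivers = urban_low * 10 // 100            # 10% drive for a living
    ambulances = drivers * 1 // 100            # 1% of drivers drive ambulances
    on_road = ambulances * 10 // 100           # 10% are on the road at any time
    return ambulances, on_road


if __name__ == "__main__":
    total, moving = ambulances_on_road()
    print(total, moving)
```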

Guesstimate 6 : How do we estimate the area of an airport?

Ans: Let’s assume the airport can accommodate 10 planes at once and has 5 runways.

Average length of runway = 2000m

Width of runway = 50 m

The total area of runway is 2000*50*5 = 500000 sq.m

Length of average plane is assumed as 50m and its wingspan is 40 m.

Total area of 10 planes = 10*50*40 = 20000 sq.m, so runways and planes together take 500000+20000 = 520000 sq.m

Now an assumption: the total area occupied by the buildings is equal to the area of the runways and planes, i.e. 520000 sq.m

Total built-up area of the airport = 2*520000 = 1040000 sq.m.

There is a lot of empty space at an airport, so we assume it is 40% of the total. Hence the final area is 1040000/0.6 = approximately 1700000 sq.m.
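
A minimal Python sketch of this estimate follows. The parameter names are made up for illustration, and the defaults are the figures assumed in the text: runways plus planes give 520000 sq.m, buildings double that to 1040000 sq.m, and the built-up area is taken as 60% of the whole airport.

```python
def airport_area_sq_m(
    runways: int = 5,
    runway_length_m: int = 2000,
    runway_width_m: int = 50,
    planes: int = 10,
    plane_length_m: int = 50,
    plane_wingspan_m: int = 40,
    empty_share: float = 0.40,
) -> float:
    """Estimate total airport area from runways, planes, buildings and empty space."""
    runway_area = runways * runway_length_m * runway_width_m    # 500,000 sq.m
    plane_area = planes * plane_length_m * plane_wingspan_m     # 20,000 sq.m
    built_up = 2 * (runway_area + plane_area)   # buildings assumed equal to the rest
    return built_up / (1 - empty_share)         # built-up area is 60% of the total


if __name__ == "__main__":
    print(round(airport_area_sq_m()))
```

The unrounded result is about 1.73 million sq.m, which the text rounds to 1700000 sq.m.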

You can think of different methods to calculate the same; this is just one of the quick ways to estimate the area of an airport.

Guesstimate 5 : How much is the Surf Excel detergent usage in a day in India?

Approach:  
India has a population of approx 1.2B People.

About 20% are BPL and would therefore not use Surf Excel. Remaining population: 0.8*1.2B = 0.96B people.

Assuming a family of 4 people, that is 0.24B families.

Rural:Urban = 70:30 (0.168B:0.072B)

Assuming only about 10% of families use Surf Excel in rural areas, due to the availability of other cheaper options, that will be about 16M families.

Due to competition and the availability of substitutes in urban areas, assuming Surf Excel has a market share of 40%, that will be about 28M families.

Total user base: 44M Families.

Daily usage per family must be at least 10 grams, so total usage = 44M*10 g = 440 million grams (about 440 tonnes) of Surf Excel every day.
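
Here is a hedged Python sketch of the same funnel. The function name is illustrative, and the rural/urban split is assumed to be 168M rural and 72M urban families, which is the split that reproduces the roughly 16M and 28M user figures above. Without rounding each step down, the total comes out slightly higher than 44M families.

```python
def surf_excel_daily_grams(population: int = 1_200_000_000) -> dict:
    """Daily Surf Excel usage in grams, following the text's assumptions."""
    above_bpl = population * 80 // 100     # 20% are BPL and excluded
    families = above_bpl // 4              # 4 people per family
    rural_families = families * 70 // 100  # assumed 70% rural
    urban_families = families - rural_families
    rural_users = rural_families * 10 // 100   # 10% usage in rural areas
    urban_users = urban_families * 40 // 100   # 40% market share in urban areas
    total_users = rural_users + urban_users
    grams_per_family_per_day = 10              # minimum daily usage (assumption)
    return {
        "rural_users": rural_users,
        "urban_users": urban_users,
        "total_users": total_users,
        "daily_grams": total_users * grams_per_family_per_day,
    }


if __name__ == "__main__":
    print(surf_excel_daily_grams())
```

The unrounded user base is 16.8M + 28.8M = 45.6M families, or 456 million grams a day, which is within a few percent of the rounded figure in the text.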

Guesstimate 4 : Estimate the total length of roads in your city

Approach:

Let’s take Mumbai and a courier company, say Blue Dart. Assume it takes 1 delivery truck from the regional distribution centre to serve a region like Andheri.

Andheri is assumed as approximately 25km^2 in area.

A delivery truck driver would spend 7 hours a day driving, at a speed of 30–40 kmph (Andheri being congested).

So, taking the lower speed, the kilometers he clocks in a day: 30*7 = 210 km.

Now, the delivery schedule is generally planned in such a way that all deliveries happen in the most efficient way. (The operations guys at the firm would be getting paid lakhs for ensuring this, but then technology rules!)

So, assuming an up and down journey, I would take the length of the roadways in the Andheri area as 210/2 = 105 km.
But allowing for some redundancy during the travel, I would take a 70% uniqueness factor.
That gives us ~70 km.
Some of the areas would also be covered by bike delivery guys: say 30% of what the truck covers (as bikers cover short distances) = ~20 km.
Also 10–15 km for postmen/walking delivery personnel: ~10 km.
So total road length for Andheri ~ 100 km.
Now, Mumbai city area: 600 km^2.
So the number of Andheri-like regions = 600 km^2 / 25 km^2 = 24.
Assuming a coverage ratio of the courier service to be 90%,
regions covered: ~22.
So total road length ~ 22*100 = 2200 km.
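
The steps above can be sketched in Python as follows. The function and parameter names are hypothetical, and the defaults are the assumptions from the text. This version does not round the intermediate values, so it lands near 2280 km rather than the rounded ~2200 km.

```python
def mumbai_road_length_km(
    speed_kmph: float = 30,
    driving_hours: float = 7,
    uniqueness: float = 0.70,
    bike_share: float = 0.30,
    walker_km: float = 10,
    city_area_km2: float = 600,
    region_area_km2: float = 25,
    coverage: float = 0.90,
) -> float:
    """Estimate Mumbai's road length from one courier truck's daily route."""
    truck_km = speed_kmph * driving_hours          # 210 km clocked per day
    one_way_km = truck_km / 2                      # up and down journey
    truck_unique_km = one_way_km * uniqueness      # remove repeated stretches
    bike_km = truck_unique_km * bike_share         # roads only bikes cover
    region_km = truck_unique_km + bike_km + walker_km   # roads in one region
    regions = city_area_km2 / region_area_km2      # Andheri-like regions
    covered_regions = regions * coverage           # regions the courier serves
    return region_km * covered_regions


if __name__ == "__main__":
    print(round(mumbai_road_length_km()))
```

Because every assumption is a named parameter, it is easy to show an interviewer how sensitive the answer is, for example by switching the speed from 30 to 40 kmph.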
