BookMyShow Data Scientist Interview Question

Company Name – BookMyShow
Location – Bangalore
Position – Data Scientist


Number of Rounds – 4
Round 1 – Written open-book SQL and R/Python round
Round 2 – Case Study
Round 3 – Statistics and Project Discussion
Round 4 – HR Round

Round 1 – SQL and R open book written test

1. Calculate the total salary for each department number that has more than 2 employees.
2. How can you retrieve all records of emp1 that are not present in emp2?
3. How do you fetch only the common records from two tables, emp and emp1?
4. How do you get the nth highest salary?
5. How do you get the 3 lowest salaries?
6. Select all customers who purchased at least two items on two separate days.
7. Given a table with a combination of flight paths, how would you identify unique flights if you don’t care which city is the departure location and which is the arrival location?
8. If you have two SQL database tables that are not joined together, how would you create another table to join them?
9. There were plotting questions in R and Python: the basic ggplot syntax in R, and the seaborn package in Python to create a countplot and a catplot. Go through the visualizations in R/Python.
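
Several of the SQL questions above can be tried out with Python's built-in sqlite3 module. The following is only a minimal sketch: the table names emp1 and emp2 come from the questions, but the columns, the flights table, and all sample rows are hypothetical, and other databases may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema and rows, invented purely for illustration.
cur.executescript("""
CREATE TABLE emp1 (emp_id INTEGER, dept_no INTEGER, salary INTEGER);
CREATE TABLE emp2 (emp_id INTEGER, dept_no INTEGER, salary INTEGER);
CREATE TABLE flights (source TEXT, destination TEXT);

INSERT INTO emp1 VALUES (1, 10, 100), (2, 10, 200), (3, 10, 300),
                        (4, 20, 400), (5, 20, 500);
INSERT INTO emp2 VALUES (1, 10, 100), (4, 20, 400);
INSERT INTO flights VALUES ('Delhi', 'Mumbai'), ('Mumbai', 'Delhi'),
                           ('Delhi', 'Pune');
""")

# Q1: total salary per department, keeping departments with more than 2 employees.
dept_totals = cur.execute("""
    SELECT dept_no, SUM(salary)
    FROM emp1
    GROUP BY dept_no
    HAVING COUNT(*) > 2
""").fetchall()

# Q2: rows of emp1 that do not appear in emp2 (whole-row comparison).
only_in_emp1 = cur.execute("""
    SELECT * FROM emp1
    EXCEPT
    SELECT * FROM emp2
    ORDER BY emp_id
""").fetchall()

# Q3: rows common to both tables.
common_rows = cur.execute("""
    SELECT * FROM emp1
    INTERSECT
    SELECT * FROM emp2
    ORDER BY emp_id
""").fetchall()

# Q4: nth highest distinct salary (here n = 2, so skip n - 1 = 1 row).
n = 2
nth_highest = cur.execute(
    "SELECT DISTINCT salary FROM emp1 ORDER BY salary DESC LIMIT 1 OFFSET ?",
    (n - 1,),
).fetchone()[0]

# Q5: the 3 lowest distinct salaries.
three_lowest = [row[0] for row in cur.execute(
    "SELECT DISTINCT salary FROM emp1 ORDER BY salary LIMIT 3"
)]

# Q7: treat A->B and B->A as the same flight by ordering each city pair.
# Multi-argument MIN/MAX are SQLite scalar functions; many other databases
# offer LEAST/GREATEST for the same purpose.
unique_flights = cur.execute("""
    SELECT DISTINCT MIN(source, destination), MAX(source, destination)
    FROM flights
    ORDER BY 1, 2
""").fetchall()

print(dept_totals, only_in_emp1, common_rows, nth_highest, three_lowest, unique_flights)
```

EXCEPT and INTERSECT compare entire rows; when only a key column should be compared, a LEFT JOIN with an IS NULL filter or a NOT EXISTS subquery is the usual alternative.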


Round 2 – Case Study

Case Study 1 – A client has a Diwali-themed e-commerce shop that sells five items. What are some potential problems you foresee with their revenue streams?

Case Study 2 – Taj Group of Hotels is planning to start a new branch. What parameters should it consider to find an appropriate location?

Round 3 – Statistics and Project Discussion
My project was on Natural Language Processing, so the questions were mostly around the same topic.

a. Give an example of a normal distribution from daily life.
b. Why do we have N-1 as the denominator when calculating sample variance and N when calculating population variance?
c. How do you remove your own list of stop words from the line of text given below?
‘Book My Show is the best website to book a show’
d. What is the difference between stemming and lemmatization?
e. Which packages did you use in this project?
f. Suppose there is a column in a text file with lots of text, and you have to keep only the words and exclude special characters and numbers. How would you do it?
g. What are the steps involved in a typical text analytics project?
h. How many bi-grams can be generated from the given sentence?
“Sachin Tendulkar is the best batsman in the World”
i. What is the significance of TF-IDF?
j. What is normalization in text, or text normalization?
k. What kinds of NLP features can be used to improve the accuracy of a classification model?
l. How does a sentiment analysis algorithm for customer reviews work?
m. Then how do you counter sarcasm?
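
Questions c, f, and h lend themselves to a short standard-library Python sketch. The stop-word list and the sample text for question f are assumptions made up for illustration; the two sentences are the ones quoted in the questions. Tokenization here is a plain whitespace split, which is simpler than what an NLP library would do.

```python
import re

# Question c: remove a custom list of stop words.
# The stop-word list is an assumption; the interviewee would supply their own.
sentence = "Book My Show is the best website to book a show"
custom_stop_words = {"is", "the", "to", "a", "my"}
kept = [w for w in sentence.split() if w.lower() not in custom_stop_words]
cleaned_sentence = " ".join(kept)

# Question f: keep only alphabetic words, dropping digits and special characters.
# The sample text is hypothetical.
raw_text = "Order #123 shipped!! Call 555-0100 @ noon."
words_only = re.findall(r"[A-Za-z]+", raw_text)

# Question h: bigrams are consecutive token pairs, so n tokens give n - 1 bigrams.
bigram_sentence = "Sachin Tendulkar is the best batsman in the World"
tokens = bigram_sentence.split()
bigrams = list(zip(tokens, tokens[1:]))

print(cleaned_sentence)
print(words_only)
print(len(tokens), len(bigrams))
```

With a whitespace tokenizer the bigram sentence has 9 tokens and therefore 8 bigrams; a different tokenizer or added sentence-boundary markers would change the count.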

Round 4 – HR Round
Basic HR Questions

This was it 


The full set of interview questions from these rounds is available in our book, What do they ask in Top Data Science Interview Part 2: Amazon, Accenture, Sapient, Deloitte, and BookMyShow.

You can get your hands on our ebooks:

1. The Monk who knew Linear Regression (Python): Understand, Learn and Crack Data Science Interview
2. 100 Python Questions to crack Data Science/Analyst Interview
3. Complete Linear Regression and ARIMA Forecasting project using R
4. 100 Hadoop Questions to crack data science interview: Hadoop Cheat Sheet
5. 100 Questions to Crack Data Science Interview
6. 100 Puzzles and Case Studies To Crack Data Science Interview
7. 100 Questions To Crack Big Data Interview
8. 100 Questions to Learn R in 6 Hours
9. Complete Analytical Project before Data Science interview
10. 112 Questions To Crack Business Analyst Interview Using SQL
11. 100 Questions To Crack Business Analyst Interview
12. A to Z of Machine Learning in 6 hours
13. In 2 Hours Create your first Azure ML in 23 Steps
14. How to Start A Career in Business Analysis
15. Web Analytics – The Way we do it
16. Write better SQL queries + SQL Interview Questions
17. How To Start a Career in Data Science
18. Top Interview Questions And All About Adobe Analytics
19. Business Analyst and MBA Aspirant’s Complete Guide to Case Study – Case Study Cheatsheet
20. 125 Must have Python questions before Data Science interview
21. 100 Questions To Understand Natural Language Processing in Python
22. 100 Questions to master forecasting in R: Learn Linear Regression, ARIMA, and ARIMAX
23. What do they ask in Top Data Science Interviews
24. What do they ask in Top Data Science Interviews: Part 1

Keep Learning 🙂

Amazon Interview Questions

Company Name – Amazon
Location – Bangalore
Position – Business Analyst

Number of Rounds –  3
Round 1 – Telephonic Round on Technical Capabilities
Round 2 – SQL, Excel, and Statistics
Round 3 – Project discussion and HR


The hiring manager will ask about your technical proficiency in different tools and technologies. Once you meet their team's requirements, you will be asked further questions.
My proficiency levels were:

SQL – 8/10
Tableau – 8/10
Python – 9/10
Statistics – 8/10

I was asked questions mainly on SQL, covering joins, GROUP BY, and so on.
A few of the questions follow:

1. What is the difference between the HAVING and WHERE conditions?
2. How do you calculate the mode, median, and mean of a given set of numbers?
3. What is the relation between the mean, mode, and median in a normal distribution?
4. What percentage of values lies within one standard deviation of the mean (on both the positive and negative side)?
5. Give the relation between the mean, median, and mode in a positively skewed distribution.
6. And in a negatively skewed distribution?
7. What is the sum of squared deviations?
8. Why do we need to square the terms?
9. What is the order of execution of a SQL query?
10. How do you find the third highest salary in the Employee table using a self-join?
11. Why does data cleaning play a vital role in analysis?
12. Classification vs regression?
13. What are the predictor and target variables?
14. How does KNN work?

Other than these, there were simple questions on the use of the ORDER BY, BETWEEN, and LIKE commands.
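
Some of the statistics questions, and the self-join question, can be checked with a few lines of standard-library Python. This is a rough sketch under stated assumptions: the sample values, the Employee columns, and the rows are all hypothetical, and the self-join shown is only one of several ways to write that query.

```python
import sqlite3
import statistics

# Hypothetical sample, chosen so the arithmetic is easy to follow.
values = [2, 4, 4, 5, 7, 8]

mean = statistics.mean(values)      # 30 / 6
median = statistics.median(values)  # average of the two middle values, 4 and 5
mode = statistics.mode(values)      # most frequent value

# Plain deviations from the mean always cancel out to zero,
# which is one reason the terms are squared before summing.
sum_of_deviations = sum(x - mean for x in values)
sum_of_squared_deviations = sum((x - mean) ** 2 for x in values)

# Population variance divides by N, sample variance by N - 1.
population_variance = statistics.pvariance(values)
sample_variance = statistics.variance(values)

# Share of a normal distribution within one standard deviation of the mean.
std_normal = statistics.NormalDist()
within_one_sd = std_normal.cdf(1) - std_normal.cdf(-1)

# Third highest salary via a self-join: for each employee, count the distinct
# salaries that are greater than or equal to their own; the third highest
# salary is the one with exactly 3 such values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (emp_id INTEGER, salary INTEGER);
INSERT INTO Employee VALUES (1, 100), (2, 200), (3, 300), (4, 400), (5, 400);
""")
third_highest = conn.execute("""
    SELECT DISTINCT e1.salary
    FROM Employee e1
    JOIN Employee e2 ON e2.salary >= e1.salary
    GROUP BY e1.emp_id, e1.salary
    HAVING COUNT(DISTINCT e2.salary) = 3
""").fetchall()

print(mean, median, mode, sum_of_squared_deviations, round(within_one_sd, 4), third_highest)
```

Counting distinct salaries means tied values (the two 400s here) share one rank; dropping DISTINCT from the HAVING clause would rank by row position instead.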

Round 2 – SQL, Excel, and Statistics

1. What is the difference between COUNT, COUNTA, COUNTIF, and COUNTBLANK in MS Excel?
2. What is the order of precedence of mathematical operations in Excel?
3. What is the syntax of VLOOKUP? How does VLOOKUP work?
4. What is the difference between a heap table and a temporary table?
5. What is the usage of regular expressions in MySQL?
6. What is the difference between a primary key and a candidate key?
7. What are the absolute measures of dispersion?
8. What are the measures of spread?
9. What is the use of kurtosis?
10. What are leptokurtic and platykurtic distributions?
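
For the kurtosis questions, a small numeric illustration may help. The sketch below uses the population definition of excess kurtosis (fourth central moment divided by the squared variance, minus 3); the helper name and both data sets are hypothetical. Positive values indicate a leptokurtic shape and negative values a platykurtic one.

```python
def excess_kurtosis(data):
    """Population excess kurtosis: fourth central moment / variance^2, minus 3."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance (second central moment)
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


# Evenly spread values: light tails, so the result is negative (platykurtic).
flat = [1, 2, 3, 4, 5]

# Mostly identical values with two extreme points: heavy tails relative to
# the variance, so the result is positive (leptokurtic).
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

flat_kurtosis = excess_kurtosis(flat)
peaked_kurtosis = excess_kurtosis(peaked)

print(flat_kurtosis, peaked_kurtosis)
```

Statistical packages often apply a small-sample bias correction, so their output can differ slightly from this plain population formula.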


Round 3 – Project Discussion and HR

The last round was a mixture of project discussion and HR.

My project was on analyzing store and product performance for a leading multinational retailer. A few categories of clothes were performing below par, and some stores were not performing well in terms of revenue. We were analyzing the worst-performing products and stores.

A few products were not performing well in some stores but were performing well in others.

A few of the questions asked in the interview were:
1. What is shelf life?
2. What is supply chain analytics?
3. Project description
4. What was the final recommendation?
5. Suppose you were the owner of the company. Would you directly shut down these stores and discontinue these products? Is that fair?
6. What is footfall?
7. If you have to recommend a product to a customer who has already filled their cart, what data will you look at? Basically, how will you recommend a product to an e-commerce customer?

This was it 🙂



Sapient Interview Question


Company Name – Sapient
Location – Bangalore
Position – Assistant Manager Analytics
Salary – 12.5 LPA

Number of Rounds – 4
Round 1 – Telephonic (Python)
Round 2 – Case Study and Guesstimate
Round 3 – Project Discussion
Round 4 – HR Round

Round 1 – Telephonic Round on Python

The questions were mostly around basic SQL syntax, ranking functions, and query optimization. They were as follows:

1. There is a column named Title which contains two values, Mr. and Mrs. Create a new column that encodes Mr. as 0 and Mrs. as 1.

2. Now use this column to count the frequency of each title in the data set.

3. Plot the same in a histogram or bar graph.

4. Write a function to extract the title from “Singh, Mr. Rahul”.

5. Name a few functions that you use to find the overall dimensions, summary, etc. of the data set.

6. How do you get the number of unique values in a column? Say the column contains a letter for the associated tag.

7. There are mainly three types of titles, i.e. Mr, Miss, and Mrs. Take the column and put all the other titles in an “Others” category. This will contain Dr, Col, Sir, and Lady.

8. A table has a column on Salary. This column contains a few NAs. How do you replace them with the median?

9. Now there is one more column, Tag, which holds the following values – A, S, D, F, and NAs. Replace the NAs with the most frequently occurring value.

10. Now you want to fill the NAs present in the Tag column with a particular value, say ‘X’.
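
Most of these data-cleaning questions can be sketched in plain Python. In practice a library such as pandas would likely be used; the version below sticks to the standard library so that it runs anywhere. All names, salaries, and tags other than “Singh, Mr. Rahul” are hypothetical sample data, and None stands in for NA.

```python
import re
import statistics
from collections import Counter

# Hypothetical sample data; only the first name comes from the questions.
names = ["Singh, Mr. Rahul", "Sharma, Mrs. Anita", "Rao, Dr. Vikram",
         "Khan, Miss. Sara", "Das, Mr. Amit"]
salaries = [100, None, 300, 200, None]
tags = ["A", "S", None, "A", "D", None, "F"]


def extract_title(full_name):
    """Return the word between the comma and the first period, e.g. 'Mr'."""
    match = re.search(r",\s*([A-Za-z]+)\.", full_name)
    return match.group(1) if match else None


# Q4: extract the title from each name.
titles = [extract_title(name) for name in names]

# Q1: encode Mr as 0 and Mrs as 1 (other titles map to None here).
title_code = {"Mr": 0, "Mrs": 1}
encoded = [title_code.get(title) for title in titles]

# Q7: keep Mr, Miss, and Mrs; group every other title under "Others".
main_titles = {"Mr", "Miss", "Mrs"}
grouped = [t if t in main_titles else "Others" for t in titles]

# Q2: frequency of each title.
title_counts = Counter(grouped)

# Q6: number of unique non-missing values in the Tag column.
unique_tag_count = len({t for t in tags if t is not None})

# Q8: replace missing salaries with the median of the known salaries.
median_salary = statistics.median([s for s in salaries if s is not None])
salaries_filled = [median_salary if s is None else s for s in salaries]

# Q9: replace missing tags with the most frequent tag.
most_common_tag = Counter(t for t in tags if t is not None).most_common(1)[0][0]
tags_filled_mode = [most_common_tag if t is None else t for t in tags]

# Q10: replace missing tags with a fixed value instead.
tags_filled_x = ["X" if t is None else t for t in tags]

print(titles, encoded, dict(title_counts))
print(salaries_filled, tags_filled_mode, tags_filled_x, unique_tag_count)
```

The regular expression assumes every name follows the “Surname, Title. First name” pattern; names without a comma or a period yield None rather than raising an error.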

Round 2 – Case Study and Guesstimate

Guesstimate – Number of Maggi sold in a day in India

Case Study

Topic – Which would you prefer: a company that makes money, or a company that serves humanity but makes very little money?

Round 3 – Project Discussion

My project was on logistic regression, where I was supposed to predict which customers were likely to encash their insurance based on 50+ attributes of the customer.
The full set of interview questions from this round is available in our book, What do they ask in Top Data Science Interview Part 2: Amazon, Accenture, Sapient, Deloitte, and BookMyShow.

Round 4 – HR Round

Not much. There were a few questions on the interview process, the team you have previously worked with, the type of work environment in the previous organization, etc.

I got a call from HR after 5 days about salary negotiation.

The complete interview, with answers to each question, is given in our ebook published on Amazon:
What do they ask in Top Data Science Interview Part 2: Amazon, Accenture, Sapient, Deloitte, and BookMyShow

Keep Learning