Dunzo Data Science Interview: Most Asked Questions and Expert Tips

We know that each domain requires a different type of preparation, so we have divided our books in the same way:

Our best seller:
✅Become a Full Stack Analytics Professional with The Data Monk’s master e-book with 2200+ interview questions covering 23 topics – 2200 Most Asked Interview Questions

Machine Learning e-book
Data Scientist and Machine Learning Engineer ->23 e-books covering all the ML Algorithms Interview Questions

Domain wise interview e-books
Data Analyst and Product Analyst Interview Preparation ->1100+ Most Asked Interview Questions
Business Analyst Interview Preparation ->1250+ Most Asked Interview Questions

We are a group of 30+ people with ~8 years of analytics experience in product-based companies. We conduct interviews daily for our organizations, so we know well what gets asked.
Other skill-enhancement websites charge 2 lakh + GST for courses lasting 10 to 15 months.

Our only focus is helping you clear interviews with ease. We have released Become a Full Stack Analytics Professional for anyone from the 2nd year of graduation to 8-10 YOE. The book covers 23 topics, and each topic has 50, 100, 200, or 250 questions and answers. Pick up the book, read it thrice, learn it, and appear for the interview.

Company: Dunzo
Designation: Data Scientist
Year of Experience Required: Not Mentioned
Technical Expertise: SQL, Python/R, Statistics, Machine Learning, Case Studies
Number of Rounds: 5

Dunzo is a leading Indian company providing hyperlocal delivery services across major cities like Bengaluru, Delhi, Gurugram, Pune, Chennai, Jaipur, Mumbai, and Hyderabad. Founded in 2015 and headquartered in Bengaluru, Dunzo has gained significant traction, including funding from Google in 2017. The company also operates a bike taxi service in Gurugram.

Dunzo Data Science Interview Questions

1) What is the difference between VARCHAR and CHAR data types in MySQL?

2) What is the purpose of the PRIMARY KEY constraint in MySQL?

3) What is the difference between DELETE and TRUNCATE statements in MySQL?

4) What is the purpose of the WHERE clause in a SELECT statement?

5) What is the purpose of the JOIN clause in MySQL?
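A few of the SQL ideas above (PRIMARY KEY, WHERE, JOIN) can be tried out without a MySQL server. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in, so the behaviour is SQLite's, not MySQL's. TRUNCATE and CHAR padding are left out because SQLite does not have them. The table names and sample rows are made up.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name VARCHAR(50))")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

# Made-up sample data, for illustration only.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 250.0), (11, 1, 90.0), (12, 2, 400.0)],
)

# PRIMARY KEY: a second row with the same id is rejected.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# JOIN combines rows from both tables; WHERE filters the joined rows.
rows = conn.execute(
    "SELECT c.name, o.amount "
    "FROM customers AS c "
    "JOIN orders AS o ON o.customer_id = c.id "
    "WHERE o.amount > 100 "
    "ORDER BY o.amount"
).fetchall()

print(duplicate_rejected)
print(rows)
```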

1) Write a Python list comprehension that creates a list containing the squares of all even numbers between 1 and 10 (inclusive).
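One possible answer, as a minimal sketch. The variable name even_squares is just a placeholder.

```python
# Squares of the even numbers from 1 to 10, inclusive.
# range(1, 11) stops before 11, so 10 is included.
even_squares = [n ** 2 for n in range(1, 11) if n % 2 == 0]

print(even_squares)  # [4, 16, 36, 64, 100]
```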

2) Write a Python function that takes two dictionaries as input and returns a new dictionary that contains all the key-value pairs from both dictionaries. If there are duplicate keys, the values from the second dictionary should take precedence.
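One way to approach this, sketched under the assumption that the inputs should not be modified. The function name merge_dicts is a placeholder.

```python
def merge_dicts(first, second):
    """Return a new dict with all pairs from both inputs.

    On duplicate keys, the value from `second` wins.
    """
    merged = dict(first)   # shallow copy, so `first` is left untouched
    merged.update(second)  # later values overwrite earlier ones
    return merged


print(merge_dicts({"a": 1, "b": 2}, {"b": 20, "c": 30}))
# {'a': 1, 'b': 20, 'c': 30}
```

On Python 3.9 and later, first | second gives the same result; {**first, **second} works on older versions too.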

3) Write a Python function that reads a text file and returns the number of lines in the file.
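A possible answer, as a sketch. The name count_lines and the UTF-8 encoding are assumptions; iterating over the file object avoids loading the whole file into memory.

```python
def count_lines(path):
    """Return the number of lines in the text file at `path`."""
    # The context manager closes the file even if reading fails.
    with open(path, encoding="utf-8") as handle:
        return sum(1 for _ in handle)
```

An interviewer may also ask what happens when the file does not exist; here the FileNotFoundError is simply allowed to propagate to the caller.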

4) Given the variables name = "Alice" and age = 30, create a formatted string that says “Alice is 30 years old.” using f-strings.
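A short sketch of the expected answer. The variable name message is a placeholder.

```python
name = "Alice"
age = 30

# Expressions inside the braces are evaluated and converted to text.
message = f"{name} is {age} years old."

print(message)  # Alice is 30 years old.
```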

5) Write a Python function that divides two numbers and handles the ZeroDivisionError if the second number is zero.
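One possible answer. Returning None on division by zero is a design choice made for this sketch; the question does not say what the function should return, so a default value or a re-raised custom error would be equally valid.

```python
def safe_divide(numerator, denominator):
    """Divide two numbers; return None if the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Assumption: signal the failure with None instead of crashing.
        return None


print(safe_divide(10, 4))  # 2.5
print(safe_divide(10, 0))  # None
```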

1) Is it better to spend 5 days developing a 90% accurate solution, or 10 days for 100% accuracy? Which one would you prefer?

It depends on the context of the problem. If the solution is time-sensitive and the 10% inaccuracy has minimal impact, a 90% accurate solution in 5 days is preferable. However, if the problem is critical (e.g., healthcare or finance), investing 10 days for 100% accuracy is justified. In most business scenarios, a 90% solution that can be iteratively improved is often the better choice.

2) How do data management procedures like missing data handling make selection bias worse?

Missing data handling techniques like listwise deletion or mean imputation can introduce selection bias when the data is not missing at random. For example, if values are missing mostly for a specific subgroup (e.g., high-income customers), dropping those rows or filling them with the overall mean under-represents that subgroup, which skews the analysis and leads to incorrect conclusions. To mitigate this, analyze the missing data pattern before handling it and use techniques like Multiple Imputation.
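The effect can be seen in a small simulation. Everything in the sketch below is made up for illustration: the group sizes, the spend distributions, and the missingness rates are assumptions, not Dunzo data. High-income customers spend more and are also more likely to have a missing value, so both listwise deletion and mean imputation underestimate the average spend.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

rows = []
for _ in range(10_000):
    # Assumed population: 20% high-income customers who spend more.
    high_income = random.random() < 0.2
    spend = random.gauss(1000, 100) if high_income else random.gauss(300, 50)
    # Assumed missingness: far more likely for high-income customers.
    p_missing = 0.7 if high_income else 0.1
    observed = None if random.random() < p_missing else spend
    rows.append((spend, observed))

true_mean = mean(spend for spend, _ in rows)

# Listwise deletion: keep only rows where the value was observed.
complete_cases = [obs for _, obs in rows if obs is not None]
listwise_mean = mean(complete_cases)

# Mean imputation: fill the gaps with the mean of the observed values.
imputed = [obs if obs is not None else listwise_mean for _, obs in rows]
imputed_mean = mean(imputed)

print(f"true mean:     {true_mean:.1f}")
print(f"listwise mean: {listwise_mean:.1f}")
print(f"imputed mean:  {imputed_mean:.1f}")
```

Mean imputation leaves the estimate where listwise deletion put it, because the filled-in values equal the already biased observed mean.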

3) When does regularization become necessary in Machine Learning? Explain with example situations.

Regularization is necessary when a model is overfitting, i.e., performing well on training data but poorly on unseen data. For example:

In linear regression, adding L1 (Lasso) or L2 (Ridge) regularization helps reduce overfitting by penalizing large coefficients.

In deep learning, dropout regularization is used to prevent overfitting in neural networks.
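To make "penalizing large coefficients" concrete, here is a minimal sketch of L2 (Ridge) regularization for the simplest possible case: one feature and no intercept. Minimizing sum((y - w*x)^2) + lam * w^2 gives the closed form w = sum(x*y) / (sum(x^2) + lam). The function name ridge_slope and the data points are made up for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Ridge estimate of w for the model y = w * x (no intercept).

    lam = 0 gives ordinary least squares; larger lam shrinks w toward 0.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Made-up data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

for lam in (0.0, 1.0, 10.0):
    print(lam, round(ridge_slope(xs, ys, lam), 3))
```

The slope gets smaller as lam grows, which is the shrinkage that makes the fitted model less sensitive to noise in the training data.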

4) How would you optimize a web crawler to run much faster, extract better information, and summarize data better to produce cleaner databases?

Faster Execution: Use asynchronous programming or parallel processing to handle multiple requests simultaneously.
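The asynchronous approach can be sketched with Python's asyncio. This is only an illustration: fetch is a hypothetical stand-in that sleeps instead of making a real HTTP request, the example.com URLs are placeholders, and the concurrency limit of 5 is an arbitrary assumption.

```python
import asyncio
import time


async def fetch(url, delay=0.1):
    # Hypothetical stand-in for a real HTTP request: just waits, then
    # returns fake page content.
    await asyncio.sleep(delay)
    return f"<html>{url}</html>"


async def crawl(urls, max_concurrency=5):
    # The semaphore caps how many requests are in flight at once,
    # which keeps the crawler polite toward the target site.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        async with semaphore:
            return await fetch(url)

    # gather runs the coroutines concurrently and keeps input order.
    return await asyncio.gather(*(bounded_fetch(url) for url in urls))


urls = [f"https://example.com/page/{i}" for i in range(10)]
start = time.perf_counter()
pages = asyncio.run(crawl(urls))
elapsed = time.perf_counter() - start

print(f"Fetched {len(pages)} pages in {elapsed:.2f}s")
```

Fetching the same ten pages one after another would take about ten times the per-request delay; with five in flight at a time, the waiting overlaps.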

Better Information Extraction: Implement advanced parsing techniques (e.g., regex, XPath) and machine learning models to identify relevant data.

Cleaner Databases: Use data validation rules, deduplication, and normalization techniques to ensure data quality.
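The three cleaning steps can be sketched in a few lines. The record layout (name and email fields), the validation rule, and the choice of email as the deduplication key are all assumptions made for this example.

```python
def clean_records(records):
    """Validate, normalize, and deduplicate scraped contact records."""
    seen_emails = set()
    cleaned = []
    for record in records:
        # Normalization: collapse whitespace, unify letter case.
        name = " ".join((record.get("name") or "").split()).title()
        email = (record.get("email") or "").strip().lower()

        # Validation: a deliberately simple rule for illustration.
        if not name or "@" not in email:
            continue

        # Deduplication: keep the first record seen for each email.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


raw = [
    {"name": "  alice   smith ", "email": "Alice@Example.com "},
    {"name": "Alice Smith", "email": "alice@example.com"},
    {"name": "Bob", "email": "not-an-email"},
]
print(clean_records(raw))
# [{'name': 'Alice Smith', 'email': 'alice@example.com'}]
```

Normalizing before deduplicating matters here: the first two records only look like duplicates once case and whitespace are unified.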

5) You are about to send one million emails (marketing campaign). How do you optimize delivery? How do you optimize response?

Optimize Delivery: Use a reliable email service provider (ESP), segment your email list, and ensure your emails comply with anti-spam regulations.

Optimize Response: Personalize emails, use A/B testing to refine subject lines and content, and include clear calls-to-action (CTAs).
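For the A/B testing part, one common way to check whether one subject line really performs better is a two-proportion z-test on the open (or click) rates. The sketch below is one such approach, not the only one; the function name and the counts in the example are made up.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference in two rates."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants are equal.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: subject line A opened 200 of 5,000 times, B 260 of 5,000.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone, so the better variant can be sent to the rest of the list.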

Dunzo wants to improve its delivery time prediction model to provide customers with more accurate estimated delivery times (ETAs). Your task as a data scientist is to analyze delivery data, identify key factors affecting delays, and suggest strategies to enhance delivery time accuracy.

You have access to a dataset containing past delivery records. The dataset includes the following attributes:

1. What are the main factors influencing delivery delays?

2. How can we improve ETA predictions?

3. How can Dunzo optimize its delivery operations?

1. Identifying and Addressing Delay Causes

2. Enhancing ETA Predictions Using Real-Time Data

3. Optimizing Delivery Partner Assignments

4. Improving Customer Communication and Satisfaction

You can practice many more case studies and other statistics topics here:
https://thedatamonk.com/data-science-resources/

For any information related to courses or e-books, please send an email to [email protected]

About TheDataMonk

I am the Co-Founder of The Data Monk. I have a total of 6+ years of analytics experience: 3+ years at Mu Sigma, 2 years at OYO, and 1 year and counting at The Data Monk. I am an active trader and a logically sarcastic idiot :)
