Moonfrog Labs Data Science Interview Question
Moonfrog Labs is a Bangalore-based start-up that makes mobile-first games for the masses. It was established in 2013 with
Location – Bangalore
Job Title – Data Scientist
Experience Required – 2-3 Years
Number of Rounds – 4
This was one of those companies where the interview was to the point and statistics-heavy.
Round 1 – Open book SQL and/or R test
There were 10 SQL questions to answer in 30 minutes, plus a separate R section that was not mandatory for the role but counted as an advantage in the later rounds.
Consider the following database schema for an e-commerce website
Event table (no
event_name [VARCHAR],
user_id [VARCHAR],
time [TIMESTAMP]
Transaction table (no index / key) – contains transaction data for
item_id [VARCHAR],
quantity [INTEGER],
price [NUMERIC],
user_id [VARCHAR],
time [TIMESTAMP]
Provide detailed queries to retrieve the answers to the following questions:-
– Average time between first visit and first purchase across users. [consider event name ‘visit’]
– Median and 80th percentile time between first visit and first purchase across users. [consider event name ‘visit’]
–
– Median and 80th percentile time between first visit and purchase of the item, per item_id.
– Given an item_id [consider item_id ‘shoe_01’], retrieve an ordered list of the items most likely to be purchased by a user who purchased that item.
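A minimal sketch of how the first two questions and the last one above might be approached, run through Python's sqlite3 module so that it is self-contained. The table names events and transactions, the sample rows, and the nearest-rank percentile helper are all assumptions made for illustration; the test did not prescribe a SQL dialect. SQLite has no built-in percentile function, so the median and 80th percentile are computed in Python from the per-user gaps.

```python
import math
import sqlite3
from statistics import median

# Assumed table names and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_name VARCHAR, user_id VARCHAR, time TIMESTAMP);
CREATE TABLE transactions (item_id VARCHAR, quantity INTEGER, price NUMERIC,
                           user_id VARCHAR, time TIMESTAMP);
INSERT INTO events VALUES
  ('visit', 'u1', '2020-01-01 10:00:00'),
  ('visit', 'u1', '2020-01-02 10:00:00'),
  ('visit', 'u2', '2020-01-01 12:00:00'),
  ('visit', 'u3', '2020-01-01 09:00:00');
INSERT INTO transactions VALUES
  ('shoe_01', 1, 50, 'u1', '2020-01-01 11:00:00'),
  ('sock_01', 2,  5, 'u1', '2020-01-03 11:00:00'),
  ('shoe_01', 1, 50, 'u2', '2020-01-01 15:00:00'),
  ('hat_01',  1, 20, 'u2', '2020-01-02 15:00:00'),
  ('sock_01', 1,  5, 'u2', '2020-01-02 16:00:00');
""")

# Seconds between each user's first visit and first purchase.
# Users who never purchased (u3 here) drop out through the inner join.
GAP_SQL = """
SELECT v.user_id,
       CAST(strftime('%s', p.first_purchase) AS INTEGER)
     - CAST(strftime('%s', v.first_visit) AS INTEGER) AS gap_seconds
FROM (SELECT user_id, MIN(time) AS first_visit
      FROM events WHERE event_name = 'visit' GROUP BY user_id) AS v
JOIN (SELECT user_id, MIN(time) AS first_purchase
      FROM transactions GROUP BY user_id) AS p
  ON p.user_id = v.user_id
"""

# Question 1: the average can be taken directly in SQL.
avg_seconds = conn.execute(
    f"SELECT AVG(gap_seconds) FROM ({GAP_SQL}) AS g"
).fetchone()[0]

# Question 2: pull the gaps out and compute the percentiles in Python.
gaps = sorted(row[1] for row in conn.execute(GAP_SQL))

def percentile_nearest_rank(sorted_values, pct):
    """Nearest-rank percentile; one of several common definitions."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]

median_seconds = median(gaps)
p80_seconds = percentile_nearest_rank(gaps, 80)

# Last question: rank other items by how many buyers of the given
# item also bought them (a simple co-purchase count).
CO_PURCHASE_SQL = """
SELECT o.item_id, COUNT(DISTINCT o.user_id) AS buyers
FROM transactions AS s
JOIN transactions AS o
  ON o.user_id = s.user_id AND o.item_id <> s.item_id
WHERE s.item_id = ?
GROUP BY o.item_id
ORDER BY buyers DESC, o.item_id
"""
co_purchased = conn.execute(CO_PURCHASE_SQL, ("shoe_01",)).fetchall()

print(avg_seconds, median_seconds, p80_seconds, co_purchased)
```

On a database with native percentile support, the Python step could likely be replaced by the engine's own percentile function. The co-purchase count is only one possible reading of "most likely to be purchased"; a share of buyers or a lift-style ratio would be reasonable alternatives.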
There were 4 questions on R with the following topics:-
1.
2. Linear Regression implementation
3 and 4 – Table and Question below
Round 2 – Case Study
The case study round was more about solving any
“How will you change the UI of the game to increase the number of people buying coins from the shop section”
This was a business case study. We discussed the following points:-
1. Project the offer directly on the home page instead of clicking on the shop button
2. Put the price to buy elixir and gold near the collector (the pink and yellow images)
3. After every attack resulting in a loss, give an option to buy back the points
4. Get the country-wise data and see the most engaging time of the player and project the offers accordingly
5. Give players a variable discount based on their day-to-day performance, to entice them into buying coins on the days they performed well
It was a non-elimination, open-ended discussion. The points above, which received good feedback from the interviewer, were forwarded to the next round. You can think of many more such points.
Round 3 – Technical Interview
This was a tough round. It lasted 45 minutes and leaned heavily towards conceptual statistics questions and confusing SQL queries. The following are the questions asked (from memory):
1. Explain Normal Distribution
2. Give a
3. What are the tests you are familiar with in statistics?
4. What is p-value?
5. How to generate row number in a table without using ROWNUM()
Ans – SELECT name, sal, (SELECT COUNT(*) FROM EMPLOYEE i WHERE o.name >= i.name) AS row_num FROM EMPLOYEE o ORDER BY row_num
6. What are the differences among ROWNUM, RANK
7. What is the probability that the product of the two numbers is even when two dice are thrown?
Ans – 3/4 (The product is odd only when both numbers are odd, which happens with probability 1/2 × 1/2 = 1/4, so the product is even with probability 1 − 1/4 = 3/4)
8. What is the probability that all three cards drawn are red, when there are 5 black, 6 red and 7 blue cards?
9. How to increase the marketing of online games?
A. Always start with the most obvious points and then move towards the unorthodox solution. We started the discussion with digital marketing, moved to offline marketing and then discounts to different categories of players, etc.
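The two probability questions above (7 and 8) can be checked with a few lines of Python. This is only a sketch: for question 8 it assumes the three cards are drawn without replacement from the 18 cards, which the question as remembered does not state explicitly.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Question 7: enumerate all 36 ordered outcomes of two fair six-sided dice
# and count those whose product is even.
rolls = list(product(range(1, 7), repeat=2))
even_products = sum(1 for a, b in rolls if (a * b) % 2 == 0)
p_even_product = Fraction(even_products, len(rolls))

# Question 8 (assumption: 3 cards drawn without replacement from
# 5 black + 6 red + 7 blue = 18 cards): choose 3 of the 6 red cards
# out of all ways to choose 3 of 18.
p_three_red = Fraction(comb(6, 3), comb(5 + 6 + 7, 3))

print(p_even_product)  # 3/4
print(p_three_red)     # 5/204
```

If the cards were instead drawn with replacement, the answer to question 8 would be (6/18)^3 = 1/27, so it is worth confirming the setup with the interviewer before answering.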
Round 4 – HR and Leadership interview
This was also a tough round to crack; some candidates were rejected after two hours of this intensive final round. It focused mostly on the mistakes you made in the three previous rounds, along with some solid questions on Supervised Learning using either R or Python. You can go through the most frequently asked Supervised Learning questions at the link below
Supervised Learning interview questions
The HR round was cool, nothing fancy 🙂