Nykaa Data Analyst Interview Questions | Day 9

Name – Nykaa
Designation – Senior Data Analyst
Location – Gurgaon
Salary – 22 LPA (including 10% variable)
Level of questions – 7/10
For the Senior Data Analyst position there were 4 rounds:

Round 1 – Technical Screening (SQL heavy)
Round 2 – Project, Case Study, and SQL
Round 3 – SQL, Python, and Guesstimate/Case Study
Round 4 – Cultural fit with the Hiring manager

Below are some of the questions and related concepts asked across the complete recruitment process. The candidate had some experience in the Natural Language Processing domain, so he was asked a few questions on that front:

  1. What is the use of the NVL function in Oracle?
    The NVL function replaces a null value with another value: NVL(expr1, expr2) returns expr2 when expr1 is null, and expr1 otherwise.
    select NVL(null, 'Amit') from dual;
    This returns Amit.
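The same null-replacement behaviour can be tried without an Oracle instance. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module and the standard SQL function COALESCE, because SQLite does not provide NVL itself.

```python
import sqlite3

# In-memory SQLite database; SQLite is only a convenient stand-in for Oracle here.
conn = sqlite3.connect(":memory:")

# COALESCE returns its first non-null argument, which matches NVL for two arguments.
fallback = conn.execute("SELECT COALESCE(NULL, 'Amit')").fetchone()[0]
unchanged = conn.execute("SELECT COALESCE('Rahul', 'Amit')").fetchone()[0]
conn.close()

print(fallback)   # the replacement value, because the first argument is null
print(unchanged)  # the first argument, because it is not null
```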
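  2. What is the result of the following query?
    select case when null = null then 'Amit' else 'Rahul' end as Case_check
    from Table_Name;

    The comparison null = null never evaluates to true; it evaluates to unknown (null), so the CASE falls through to the ELSE branch. The answer to this query is Rahul, returned once for every row in Table_Name. To test for null, use IS NULL instead.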
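A quick way to check how null comparisons behave is to run them against an in-memory database. This is a minimal sketch using Python's sqlite3 module as a stand-in for Oracle; the FROM clause is dropped because SQLite allows a SELECT without a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CASE expression from the question, evaluated once without a table.
case_check = conn.execute(
    "SELECT CASE WHEN NULL = NULL THEN 'Amit' ELSE 'Rahul' END"
).fetchone()[0]

# The raw comparison yields NULL (None in Python), not true or false.
comparison = conn.execute("SELECT NULL = NULL").fetchone()[0]

# IS NULL is the correct test for null and yields 1 (true).
is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]
conn.close()

print(case_check, comparison, is_null)
```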
  3. What is a parser?

    Once a SQL statement has been written and submitted, the first step in processing it is parsing. The SQL parser checks whether the statement is valid before it is executed.
    The parser has 2 functions:
    1. Syntax analysis: checks that the statement follows the grammar of SQL
    2. Semantic analysis: checks that the statement is meaningful, for example that the referenced tables and columns exist
  4. What is lapply and sapply?

    Both are R functions. lapply applies a function to each element of a list and returns the results as a list. sapply applies a function to each element of a list and simplifies the result to a vector (or a matrix) where possible.
  5. Guesstimate – What is the size of the market for disposable diapers in India?

    1.2 B people x 60% of childbearing age = 0.72 B people
    0.72 B people x 1/2 are women = 0.36 B women of childbearing age
    0.36 B women x 2/3 have children = 0.24 B women with children
    0.24 B women x 1.5 children each = 0.36 B children
    0.36 B children x 1/10 under age 2 = 36 million children in the diaper-wearing age group
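The chain of multiplications above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the factors are the ones stated in the answer, and exact fractions are used so that 2/3 does not introduce rounding error. The answer stops at the number of children, so the sketch does too; turning that into a market value would need usage and price assumptions that are not given here.

```python
from fractions import Fraction

population = 1_200_000_000  # starting figure used in the answer above

# (label, factor) pairs, applied in the same order as the answer
steps = [
    ("of childbearing age", Fraction(60, 100)),
    ("are women", Fraction(1, 2)),
    ("have children", Fraction(2, 3)),
    ("children per woman", Fraction(3, 2)),
    ("under age 2", Fraction(1, 10)),
]

estimate = Fraction(population)
for label, factor in steps:
    estimate *= factor
    print(f"{label:<20} -> {float(estimate) / 1e6:,.0f} million")

children_under_two = int(estimate)
```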
  6. Find the total salary for each department number where more than 2 employees exist.

    SELECT deptno, SUM(sal) AS totalsal
    FROM emp
    GROUP BY deptno
    HAVING COUNT(empno) > 2;
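To see the GROUP BY / HAVING query in action, it can be run against a small in-memory table. The sketch below uses Python's sqlite3 module; the emp rows are made-up sample data, and only the column names (empno, deptno, sal) come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, deptno INTEGER, sal INTEGER)")

# Hypothetical sample data: dept 10 has 3 employees, dept 20 has 2, dept 30 has 4.
rows = [
    (1, 10, 1000), (2, 10, 2000), (3, 10, 3000),
    (4, 20, 1500), (5, 20, 2500),
    (6, 30, 800), (7, 30, 900), (8, 30, 950), (9, 30, 1100),
]
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)

# HAVING filters groups after aggregation, so dept 20 (only 2 employees) is dropped.
totals = conn.execute(
    """
    SELECT deptno, SUM(sal) AS totalsal
    FROM emp
    GROUP BY deptno
    HAVING COUNT(empno) > 2
    ORDER BY deptno
    """
).fetchall()
conn.close()

print(totals)
```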
  7. How to retrieve the 3 minimum salaries?

    SELECT DISTINCT sal
    FROM emp a
    WHERE 3 >= (SELECT COUNT(DISTINCT sal) FROM emp b WHERE a.sal >= b.sal);
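The correlated subquery counts, for each row, how many distinct salaries are less than or equal to that row's salary; a salary is among the 3 lowest when that count is at most 3. Below is a minimal sketch that runs the idea on made-up data with Python's sqlite3 module. The SELECT DISTINCT sal projection and the ORDER BY are assumptions added to make the result easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER PRIMARY KEY, sal INTEGER)")

# Hypothetical salaries, including a duplicate of the lowest value.
salaries = [800, 800, 950, 1100, 1500, 3000]
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    list(enumerate(salaries, start=1)),
)

lowest_three = conn.execute(
    """
    SELECT DISTINCT sal
    FROM emp a
    WHERE 3 >= (SELECT COUNT(DISTINCT sal) FROM emp b WHERE a.sal >= b.sal)
    ORDER BY sal
    """
).fetchall()
conn.close()

print(lowest_three)
```

Because the subquery counts distinct values, the duplicated 800 is treated as a single salary level.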
  8. Case Study 1 – A client has a Diwali-themed e-commerce shop that sells five items. What are some potential problems you foresee with their revenue streams?

    a. The immediate issue with the client's revenue stream is that it will take a severe hit once the holiday season is over.
    b. How to generate revenue outside of the holiday season would be a key point to address with the client.
    c. The other concern is that the shop offers only five items.
    d. This severely limits the client's opportunity to generate revenue.
    e. With so few items, a couple of bad reviews could cause them a lot of problems.
    f. These products are mostly lighting and crackers, which have a brief shelf life and a higher-than-usual defect rate.
    g. Competitor issue – Since these are themed products released once a year, a competitor might offer a sub-standard product at a lower cost to kill the competition.
  9. How do you remove your own list of stop words from the line of text given below? 'Book My Show is the best website to book a show'

    # Custom stop-word list (a set gives fast membership checks)
    stop_words = {"is", "the", "and", "are", "you", "to", "here", "this", "we", "This", "a", "best"}

    def stopy(text):
        words = text.split()
        # Keep only the words that are not in the stop-word list
        no_noise = [word for word in words if word not in stop_words]
        final = " ".join(no_noise)
        return final

    x = stopy("Book My Show is the best website to book a show")
    # x is "Book My Show website book show"
  10. What are the steps involved in a typical Text-Analytics project?

    We mostly follow the steps below:
    -Get the raw data
    -Convert the text into tokens, then remove special characters and punctuation
    -Remove stop words. These are common words that occur frequently in text but carry little meaning
    -Apply stemming and lemmatization to reduce words to their root forms and remove noise from the filtered data
    -Compute TF-IDF to find the important words
    -Generate n-grams to see which words occur together
    -Word correlation

    -After this point, it mostly depends on the requirements of the project. There are multiple techniques that we have used at different points in time:
    *Part of Speech Tagging
    *Named Entity Recognition
    *Text Classification
    *Sentiment Analysis
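The early steps of this pipeline (tokenising, stripping punctuation, removing stop words, scoring with TF-IDF) can be sketched with the Python standard library alone. Everything below is illustrative: the stop-word list and the two sample documents are made up, stemming and lemmatization are skipped, and the TF-IDF formula used is the plain term-frequency times log(N / document-frequency) variant, which is one of several in common use.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"is", "the", "a", "to", "and"}  # hypothetical stop-word list


def preprocess(text):
    """Lowercase, keep only alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: score} dict per document."""
    tokenised = [preprocess(d) for d in docs]
    n_docs = len(tokenised)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenised:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "The delivery was fast, and the packaging was great!",
    "The delivery was late.",
]
scores = tf_idf(docs)
print(scores[0])
```

Terms that appear in every document, such as "delivery" here, score zero, while terms unique to one document score higher. That is the sense in which TF-IDF surfaces the important words.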

    -How many bi-grams can be generated from the given sentence?
    "Sachin Tendulkar is the best batsman in the World"
    The sentence has 9 words, so it yields 9 - 1 = 8 bi-grams:
    Sachin Tendulkar, Tendulkar is, is the, the best, best batsman, batsman in, in the, the World
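The bi-gram list can be generated by sliding a window of two tokens over the sentence. The helper below is a small sketch; the function name ngrams is illustrative, and tokens are split on whitespace only.

```python
def ngrams(text, n=2):
    """Return all runs of n consecutive whitespace-separated tokens."""
    tokens = text.split()
    # A sentence with k tokens yields k - n + 1 n-grams.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


sentence = "Sachin Tendulkar is the best batsman in the World"
bigrams = ngrams(sentence)

print(len(bigrams))
print(bigrams)
```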

