Flipkart Interview Question | What are the basic checks you do for cleaning the data?

Question

TheDataMonk 55 years 4 Answers 2723 views Grand Master 0

Answers ( 4 )

  1. Before cleaning the data, there are a few checks one should run.
    1. First look at the shape of the data, i.e. the number of rows and columns in the dataframe.
    Code for the above (df.shape)
    2. Check the info of the dataframe, which shows the datatype and non-null count of each feature.
    Code for the above (df.info())
    3. Check the summary statistics of the dataframe; for numeric columns this covers the five-point summary (min, quartiles, max) plus count, mean and standard deviation.
    Code for the above (df.describe())
    4. We can also check for skewness, kurtosis and outliers.
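
The checks in this answer can be sketched roughly as follows. This is a minimal example, not a definitive recipe: the toy dataframe, the column names and the `iqr_outliers` helper are made up for illustration, and the 1.5 * IQR rule is just one common convention for flagging outliers.

```python
import pandas as pd

# Hypothetical toy data; replace with your own dataframe.
df = pd.DataFrame({
    "price": [10.0, 12.0, 11.0, 13.0, 12.0, 250.0],
    "qty": [1, 2, 2, 3, 1, 2],
})

print(df.shape)       # attribute, not a method: (rows, columns)
df.info()             # prints dtypes and non-null counts
print(df.describe())  # count, mean, std, min, quartiles, max

# Skewness and kurtosis of the numeric columns
skew = df.skew(numeric_only=True)
kurt = df.kurt(numeric_only=True)

def iqr_outliers(s: pd.Series) -> pd.Series:
    """Return values outside 1.5 * IQR of the quartiles (illustrative helper)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s[(s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)]

outliers = iqr_outliers(df["price"])
print(outliers)
```

A strongly positive skew on a column, as with `price` here, is often a hint that a few large values deserve a closer look before any treatment is chosen.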

  2. 1) Missing values per column
    2) Presence of outliers and the reason for their existence.
    3) Treatment for missing values.
    4) Converting datatypes of certain features.
    5) Merging and restructuring datasets.
    6) Performing one-hot encoding if required.
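
A rough sketch of points 1, 3, 4 and 6 from this answer is shown below. The data and column names are hypothetical, and filling with the median or an "Unknown" label is only one possible treatment; the right choice depends on why the values are missing.

```python
import pandas as pd

# Hypothetical toy data with missing values and a numeric column stored as text.
df = pd.DataFrame({
    "age": ["25", "31", None, "40"],
    "city": ["Delhi", "Pune", "Delhi", None],
})

# 1) Missing values per column
missing = df.isnull().sum()
print(missing)

# 4) Convert datatypes: text to numeric, unparseable entries become NaN
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# 3) Treat missing values (assumed strategy: median for numeric, label for categorical)
df["age"] = df["age"].fillna(df["age"].median())
df["city"] = df["city"].fillna("Unknown")

# 6) One-hot encode the categorical column
encoded = pd.get_dummies(df, columns=["city"])
print(encoded)
```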

    Best answer
  3. Basic checks:
    df.head()
    df.shape
    df.describe()
    df.info()
    df.isnull().sum()

  4. 1) missing values per column
    2) outliers
    3) standardisation issues
    4) impractical values
    5) incorrect datatypes
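
For points 3 to 5, a sketch along these lines may help. The column names, the sample values and the 0 to 120 age range are assumptions chosen for illustration; what counts as an impractical value depends on the dataset.

```python
import pandas as pd

# Hypothetical toy data with inconsistent labels, impossible ages and a bad date.
df = pd.DataFrame({
    "city": [" delhi", "Delhi", "DELHI ", "Pune"],
    "age": [25, -3, 140, 31],
    "joined": ["2021-01-05", "2021-02-10", "not a date", "2021-03-15"],
})

# Standardisation issues: the same category written in different ways
df["city"] = df["city"].str.strip().str.title()
print(df["city"].value_counts())

# Impractical values: rows outside an assumed valid range
bad_age = df[(df["age"] < 0) | (df["age"] > 120)]
print(bad_age)

# Incorrect datatypes: dates stored as text; unparseable entries become NaT
df["joined"] = pd.to_datetime(df["joined"], errors="coerce")
print(df.dtypes)
```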
