Answers ( 4 )
Data needs to be transformed into a format that machines can make sense of, so that
we can get the best out of the ML models. For example, when building a linear model
it is necessary to perform one-hot encoding to convert the categorical variables into
numeric features, so that the ML model can make sense of them.
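To make the one-hot encoding step concrete, here is a minimal sketch in plain Python. The helper name `one_hot_encode` and the sample `colours` column are made up for illustration; in practice you would more likely use `pandas.get_dummies` or scikit-learn's `OneHotEncoder`.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row is a list of 0/1 indicators."""
    # Sort the distinct categories so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


# Hypothetical categorical column.
colours = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colours)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Each category becomes its own 0/1 column, so the model never reads an artificial order into the labels.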
Also, other techniques such as treating missing values and dimensionality reduction help to get
our data into better shape for applying ML algorithms. So, yes, data cleaning is important.
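As one possible way of treating missing values, the sketch below fills gaps with the column mean. It assumes missing entries are stored as `None`; the function name `impute_mean` and the `ages` data are hypothetical, and mean imputation is only one of several reasonable strategies.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical numeric column with two missing entries.
ages = [25.0, None, 35.0, 30.0, None]
print(impute_mean(ages))  # [25.0, 30.0, 35.0, 30.0, 30.0]
```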
Data cleaning plays a vital role in building a model. Converting the data into a machine-understandable form is imperative for training a model that produces accurate results. This includes changing the data types of features according to the demands of the model, handling missing values in the data, encoding categorical variables with an ordinal encoder, and scaling or normalising numerical variables with a scaler or a normalisation method.
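A rough sketch of the encoding and scaling steps mentioned above, in plain Python. The helper names and sample data are assumptions for illustration; scikit-learn offers `OrdinalEncoder`, `MinMaxScaler` and `StandardScaler` for the same jobs.

```python
from statistics import mean, pstdev


def ordinal_encode(values, order):
    """Map each category to its position in an explicit, ordered list."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if lo == hi:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical ordered categorical column.
sizes = ["small", "large", "medium", "small"]
print(ordinal_encode(sizes, ["small", "medium", "large"]))  # [0, 2, 1, 0]

# Hypothetical numeric column.
incomes = [20.0, 40.0, 60.0]
print(min_max_scale(incomes))  # [0.0, 0.5, 1.0]
print(standardise(incomes))
```

Ordinal encoding only makes sense when the categories have a natural order; for unordered categories, one-hot encoding is usually the safer choice.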
Some parts of your answer come under data preprocessing, not exactly data cleaning, don't they?
Importance of data cleaning:
ML models can understand data only in machine-readable form (mostly numbers). Data cleaning is needed:
1) to ensure the data we are using is within a practical and useful range for each feature or variable
2) to ensure each data point we are using is logically and practically correct and free from errors, based on the business scenario it is going to be used for
3) to ensure it is in the format required by, or suitable for, the specific ML model it is going to be run on
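The three checks above could be sketched roughly as follows. The field names, the allowed ranges and the function `find_invalid_rows` are all hypothetical; real limits would come from the business scenario.

```python
def find_invalid_rows(rows, valid_ranges):
    """Return (row_index, field, value) for every missing, non-numeric
    or out-of-range value."""
    problems = []
    for i, row in enumerate(rows):
        for field, (lo, hi) in valid_ranges.items():
            value = row.get(field)
            # Format check first, then range check.
            if not isinstance(value, (int, float)) or not lo <= value <= hi:
                problems.append((i, field, value))
    return problems


# Hypothetical records and business rules.
records = [
    {"age": 34, "discount_pct": 10},
    {"age": -5, "discount_pct": 20},    # negative age: data-entry error
    {"age": 41, "discount_pct": 150},   # discount above 100%
    {"age": "41", "discount_pct": 5},   # wrong format: age stored as text
]
ranges = {"age": (0, 120), "discount_pct": (0, 100)}

print(find_invalid_rows(records, ranges))
# [(1, 'age', -5), (2, 'discount_pct', 150), (3, 'age', '41')]
```

Whether a flagged row is then dropped, corrected or imputed depends on the use case.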