Tiger Analytics Interview Question | Data Cleaning

Question

What are the steps for wrangling and cleaning data before applying machine learning algorithms?

Asked by Dhruv2301 (Great Grand Master) in Machine Learning · 4 years ago · 1 Answer · 626 views

Answer ( 1 )

Priyabrata017 Master · Answer 1 · July 29, 2020

Data wrangling is the process of converting and mapping data from its raw form into a format that is more useful for analysis and modeling.
The important steps in data wrangling are:
1. Acquiring data: This is a tedious step and often takes the most time. Sources for data collection: data is publicly available on websites such as kaggle.com, data.gov, the World Bank, AWS Datasets and Google Datasets.

2. Data cleaning: Data cleaning is an essential component of data wrangling and requires a lot of patience. To make the job easier, first format the data so that it is readable for humans. The essentials are: find outliers (data points that do not match the rest of the dataset), and find missing values and remove them from the dataset (many algorithms cannot handle missing values, so a model trained without this step can fail or give unreliable results).

3. Data computation: At times, your machine may not have enough resources to run your algorithm, e.g. you might not have a GPU. In these cases, you can use publicly available APIs or hosted platforms to run your algorithm. These are endpoints on the web that let you use remote computing power and process data without relying on your own system. An example is the Google Colab platform.
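
To make step 2 a bit more concrete, here is a minimal sketch in plain Python (standard library only). It is not part of the original answer. The column names, the sample values and the helper functions (load_rows, drop_missing, iqr_bounds, split_outliers) are all made up for illustration, and the 1.5 x IQR rule is just one common way to flag outliers, not the only one.

```python
import csv
import io
import statistics

# Hypothetical raw data standing in for a downloaded CSV file.
RAW_CSV = """id,age,income
1,25,48000
2,31,52000
3,,50000
4,29,51000
5,35,49000
6,28,1000000
7,33,53000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric fields.

    Empty strings are treated as missing values and stored as None.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "id": int(record["id"]),
            "age": float(record["age"]) if record["age"] else None,
            "income": float(record["income"]) if record["income"] else None,
        })
    return rows


def drop_missing(rows):
    """Keep only rows in which every field has a value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def iqr_bounds(values, k=1.5):
    """Return (low, high) limits using the interquartile range rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def split_outliers(rows, column):
    """Split rows into (inliers, outliers) for one numeric column."""
    low, high = iqr_bounds([r[column] for r in rows])
    inliers = [r for r in rows if low <= r[column] <= high]
    outliers = [r for r in rows if not (low <= r[column] <= high)]
    return inliers, outliers


rows = load_rows(RAW_CSV)
complete = drop_missing(rows)
clean, outliers = split_outliers(complete, "income")

print("rows loaded:", len(rows))
print("rows after dropping missing values:", len(complete))
print("outlier ids:", [r["id"] for r in outliers])
```

In practice you would look at the flagged rows before deleting anything, since an extreme value can be a genuine observation rather than an error. Dropping rows with missing values is the simplest option and matches the answer above; whether it is appropriate depends on how much data is lost.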
