Snapdeal Interview Question | Machine Learning

Question

If you have 3 GB of RAM on your machine and you want to train your model on an 8 GB dataset, how would you go about this problem?


Answers (3)

  1. 1. Preprocess the data and do feature engineering to reduce its size.
    2. Distribute the ML algorithm using Apache Spark or Hadoop (see the sketch below).
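
    A minimal PySpark sketch of point 2, assuming the 8 GB dataset already lives on HDFS as Parquet with numeric feature columns and a label column (the path and column names are illustrative):

```python
# Illustrative sketch: the dataset never has to fit into the machine's 3 GB
# of RAM, because Spark keeps it partitioned and distributes the training work.
# The path, feature columns, and label column are assumptions.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("train-8gb-dataset").getOrCreate()

df = spark.read.parquet("hdfs:///data/train.parquet")   # stays distributed

assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
model = lr.fit(assembler.transform(df))
print("coefficients:", model.coefficients)
```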

  2. Some of the techniques that can be applied to process a large dataset (points 1 and 3 are sketched below):

    1. Change the data format
    2. Use smaller samples of the data
    3. Stream the data or use progressive loading
    4. Use a relational database
    5. Use a big data platform
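
    A small sketch combining points 1 and 3 with pandas: narrow dtypes shrink each row, and chunked reading keeps only about a million rows in memory at a time (the file name, columns, and dtypes are assumptions):

```python
# Illustrative sketch: progressive loading of an 8 GB CSV with narrow dtypes,
# so only ~1M rows occupy RAM at any moment. File name, column names, and
# dtypes are assumptions; replace them with the real schema.
import pandas as pd

dtypes = {"feature_a": "float32", "feature_b": "int32", "label": "int8"}

row_count, positives = 0, 0
for chunk in pd.read_csv("train.csv", dtype=dtypes, chunksize=1_000_000):
    row_count += len(chunk)
    positives += int(chunk["label"].sum())

print("rows:", row_count, "positive rate:", positives / row_count)
```

    The same loop can feed each chunk to an incremental learner instead of computing statistics, which is essentially the batch-training idea described in answer 3.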

  3. Most datasets contain a large number of observations that add nothing to your model. Simply sampling your data will often give you an equally good model in a fraction of the time.
    There are a few scenarios where you truly do need that extra data in the model. Many algorithms support processing training data in batches to achieve this.
    In case your data is non-sequential, just randomly pick an appropriately sized subset of your data and train on it for a short while. Then choose another subset from the rest of the unused data and repeat. After you exhaust all training samples, repeat the whole cycle from the start. In the case of sequential data, breaking it into multiple consecutive overlapping chunks should work.

    The best chunk/subset size, the number of training iterations per chunk, and the amount of overlap depend on the characteristics of the data and may require some fine-tuning. Do not forget to reserve an adequate portion of your data for a test set and a validation set, especially when you have an abundance of data.
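
    A rough sketch of this batch-wise training using scikit-learn's out-of-core partial_fit API; the file name, label column, chunk size, and number of passes are assumptions:

```python
# Illustrative sketch: train incrementally on chunks of an 8 GB CSV so that
# only one chunk is ever held in the 3 GB of RAM. Shuffling within each chunk
# is a cheap approximation of picking random subsets of the full dataset.
# File name, label column, chunk size, and epoch count are assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import SGDClassifier

model = SGDClassifier()
classes = np.array([0, 1])              # must be declared on the first partial_fit

for epoch in range(3):                  # "repeat the whole cycle from the start"
    for chunk in pd.read_csv("train.csv", chunksize=200_000):
        chunk = chunk.sample(frac=1.0)  # shuffle rows within the chunk
        X = chunk.drop(columns="label").to_numpy()
        y = chunk["label"].to_numpy()
        model.partial_fit(X, y, classes=classes)
```

    Holding out a fixed test/validation portion before this loop (for example, a separate file or a reserved fraction of rows) keeps the evaluation honest, as the answer notes.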
