Question

Oracle Interview Questions | Prediction

What is one way that you would handle an imbalanced data set that’s being used for prediction?

Statistics Dhruv2301 55 years 3 Answers 1411 views Great Grand Master 0

Answers ( 3 )

swap007 Grand Master · Answer 1 · July 11, 2020

You can use techniques like oversampling and undersampling to deal with imbalanced data sets.
Oversampling increases the proportion of the class with fewer observations, and undersampling
decreases the proportion of the class with more observations. Techniques like SMOTE, available in
libraries such as imbalanced-learn, can help you achieve that. Also, using evaluation metrics other
than just accuracy, such as precision, recall, or F1 score, can help you evaluate your model more
reliably.
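
To illustrate why accuracy alone can mislead on imbalanced data, here is a minimal sketch in plain Python. The labels are made up for illustration, and the helper names `confusion_counts` and `evaluate` are hypothetical, not from any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 95 majority (0) and 5 minority (1) examples.
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the majority class.
always_majority = [0] * 100

scores = evaluate(y_true, always_majority)
print(scores)
```

The always-majority predictor gets high accuracy here while its recall and F1 on the minority class are zero, which is the kind of failure accuracy hides.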

Priyabrata017 Master · Answer 2 · July 18, 2020

To handle an imbalanced dataset, we can use undersampling, where instances of the majority class are deleted. We can also use oversampling through SMOTE (Synthetic Minority Oversampling Technique) to add synthetic instances of the minority class.
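
The core idea of SMOTE, creating new minority points by interpolating between a minority point and one of its nearest minority neighbours, can be sketched roughly as follows. This is a simplified illustration in plain Python, not the reference algorithm or any library's implementation; the function name `smote_sketch`, its parameters, and the data points are all assumptions made for the example.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority-class neighbours.
    Assumes `minority` holds at least two points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Made-up 2-D minority-class points, for illustration only.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.7)]
new_points = smote_sketch(minority, n_new=10)
print(len(minority) + len(new_points))  # minority class size after oversampling
```

In practice you would more likely use an existing implementation, such as the SMOTE class in imbalanced-learn, rather than hand-rolling this.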

Ramya Mamidipaka Member · Answer 3 · July 20, 2020

One approach to addressing the problem of class imbalance is to randomly resample the training dataset. There are two main approaches to random resampling for imbalanced classification: undersampling, which deletes examples from the majority class, and oversampling, which duplicates examples from the minority class.

Random Oversampling: Randomly duplicate examples in the minority class.
Random Undersampling: Randomly delete examples in the majority class.

Combining Random Oversampling and Undersampling: Applying both techniques can result in improved overall performance compared to performing either technique in isolation.

For example, if we had a dataset with a 1:100 class distribution, we might first apply oversampling to increase the ratio to 1:10 by duplicating examples from the minority class, then apply undersampling to further improve the ratio to 1:2 by deleting examples from the majority class.
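
The 1:100 to 1:10 to 1:2 example above can be sketched in plain Python as follows. The data is made up, and the helpers `random_oversample` and `random_undersample` are hypothetical names written for this illustration, not library functions.

```python
import random
from collections import Counter


def random_oversample(rows, labels, target_label, new_count, rng):
    """Duplicate randomly chosen rows of target_label until it has new_count rows."""
    idx = [i for i, y in enumerate(labels) if y == target_label]
    extra = [rng.choice(idx) for _ in range(new_count - len(idx))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def random_undersample(rows, labels, target_label, new_count, rng):
    """Randomly delete rows of target_label until only new_count remain."""
    idx = [i for i, y in enumerate(labels) if y == target_label]
    kept = set(rng.sample(idx, new_count))
    keep = [i for i, y in enumerate(labels) if y != target_label or i in kept]
    return [rows[i] for i in keep], [labels[i] for i in keep]


rng = random.Random(42)

# Made-up data with a 1:100 class distribution: 10 minority (1), 1000 majority (0).
labels = [1] * 10 + [0] * 1000
rows = [[rng.random()] for _ in labels]

# Step 1: oversample the minority class to reach 1:10 (100 minority rows).
rows, labels = random_oversample(rows, labels, target_label=1, new_count=100, rng=rng)
# Step 2: undersample the majority class to reach 1:2 (200 majority rows).
rows, labels = random_undersample(rows, labels, target_label=0, new_count=200, rng=rng)

counts = Counter(labels)
print(counts[1], counts[0])
```

As the answer notes, resampling applies to the training dataset, so a sketch like this would normally run after the train/test split and only on the training portion.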
