While working on a data set, how do you select important variables?
Question
Answers (2)
There are various methods of feature selection. Variance thresholding, correlation, and importance taken from the model itself are a few of them to name. You can refer to the following article: https://towardsdatascience.com/feature-selection-techniques-in-machine-learning-with-python-f24e7da3f36e
To explain:
1. Variance: You can eliminate features that have very little variance, as they give hardly any insight into the prediction and can contribute to overfitting the model (a variance-filtering sketch follows this list).
2. Correlated features: There may be features that are highly correlated with one another. We can keep a few and eliminate the others, since the kept set is enough to explain the variability captured by the eliminated ones, and keeping all of the correlated features would increase the cost of dimensionality (see the correlation sketch after this list).
3. From the model itself: This method of feature selection is more time-consuming, but it gives us the important features as judged by the models themselves. Here we can fit two or three models (any larger number, depending on you) on the training set, ask each model to produce, say, the n most important features, and eliminate the others. The feature_selection module of sklearn provides two utilities, RFE and RFECV, which we can use for this (see the RFE sketch after this list).
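A minimal sketch of variance-based filtering using scikit-learn's VarianceThreshold class; the toy array and the 0.01 threshold are only illustrative choices.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy matrix: the first column is constant, so it carries no information.
X = np.array([[0.0, 1.0, 2.1],
              [0.0, 1.5, 0.3],
              [0.0, 0.5, 1.7]])

# Drop every column whose variance is at or below the threshold.
selector = VarianceThreshold(threshold=0.01)
X_reduced = selector.fit_transform(X)

print(selector.get_support())  # [False  True  True] -> first column removed
print(X_reduced.shape)         # (3, 2)
```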
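A minimal sketch of correlation-based elimination, assuming the features sit in a pandas DataFrame; the synthetic data and the 0.9 cutoff are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=100)})
df["b"] = 2 * df["a"] + rng.normal(scale=0.01, size=100)  # nearly a duplicate of "a"
df["c"] = rng.normal(size=100)                            # independent feature

# Absolute pairwise correlations, upper triangle only so each pair is seen once.
corr = df.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop one feature from every pair whose correlation exceeds the cutoff.
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
df_reduced = df.drop(columns=to_drop)

print(to_drop)                    # ['b']
print(list(df_reduced.columns))   # ['a', 'c']
```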
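A minimal sketch of RFE and RFECV from sklearn.feature_selection, as mentioned in point 3; the synthetic dataset and the LogisticRegression estimator are placeholder assumptions, and any estimator that exposes coefficients or importances can be used.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, RFECV
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, n_informative=4, random_state=0)

# RFE: recursively drop the weakest feature until n_features_to_select remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=4)
rfe.fit(X, y)
print(rfe.support_)   # boolean mask of the selected features
print(rfe.ranking_)   # 1 = kept, larger values were eliminated earlier

# RFECV: same idea, but the number of features is chosen by cross-validation.
rfecv = RFECV(estimator=LogisticRegression(max_iter=1000), cv=5)
rfecv.fit(X, y)
print(rfecv.n_features_)  # number of features selected by cross-validation
```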
1) In R, you can use the step() function and pass your model as the parameter; step() internally builds various models and determines the predictors that go into building the best model.
2) In sklearn, feature importance is a built-in attribute of tree-based classifiers. You can plot the importance of all the features relative to the most important feature (a sketch follows this list).
3) You can also use the correlation matrix, which tells you the correlation of every variable with the target variable. You can eliminate the features that do not have a strong correlation with the target (see the correlation-with-target sketch after this list).
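A minimal sketch of plotting feature importances from a tree-based model relative to the most important feature, as described in 2); the RandomForestClassifier, the synthetic data, and the feature names are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, n_informative=3, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# feature_importances_ is exposed by tree-based estimators after fitting.
importances = pd.Series(model.feature_importances_,
                        index=[f"f{i}" for i in range(X.shape[1])])
relative = importances / importances.max()  # 1.0 for the most important feature

relative.sort_values().plot(kind="barh")
plt.xlabel("Importance relative to the top feature")
plt.tight_layout()
plt.show()
```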
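A minimal sketch of ranking features by their absolute correlation with the target, as in 3); the synthetic DataFrame and the 0.2 cutoff are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["x1", "x2", "x3", "x4"])
df["target"] = 3 * df["x1"] - 2 * df["x3"] + rng.normal(scale=0.5, size=200)

# Absolute correlation of every feature with the target, strongest first.
corr_with_target = df.corr()["target"].drop("target").abs().sort_values(ascending=False)
print(corr_with_target)  # x1 and x3 should dominate

# Keep only features whose correlation with the target clears the cutoff.
selected = corr_with_target[corr_with_target > 0.2].index.tolist()
print(selected)          # e.g. ['x1', 'x3']
```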