## Cross Validation and varImp in R

I was working on our next book, **Linear, Ridge, LASSO, and Elastic Net Algorithm Explained in Layman Terms with Code in R**, when we thought of covering a few simple concepts which are quite helpful while creating models.

Cross validation is one simple concept which gives you a more reliable estimate of how your model will perform on unseen data, and helps you choose better tuning parameters. A lot of you must already be using it in the form of k-fold cross validation.

Let’s quickly go through this relatively simple concept, and there is no better way than starting with code:

    library(caret)

    cv <- trainControl(method = "repeatedcv",
                       number = 10,
                       repeats = 5,
                       verboseIter = TRUE)

Here we are creating a variable, ‘cv’, which holds the resampling settings. Whenever ‘cv’ is passed to a model definition, it asks the model to divide the dataset into 10 roughly equal parts (folds), train on 9 of them and test on the remaining one, rotating until every fold has been used for testing once, i.e. train on k-1 folds and test on the one left out.

repeats = 5 means the whole 10-fold process is repeated 5 times, each time with a fresh random split, so every candidate model is trained and tested 10 x 5 = 50 times.
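To make the splitting concrete, here is a minimal base R sketch of 10-fold cross validation repeated 5 times. It is only an illustration, not caret's internal code (caret's own fold creation also tries to balance the outcome across folds). The names `n`, `make_folds` and `splits` are hypothetical, and the 100-row dataset is made up.

```r
set.seed(42)      # make the random splits reproducible

n       <- 100    # hypothetical number of rows in the dataset
k       <- 10     # number of folds
repeats <- 5      # how many times the k-fold split is redone

# Shuffle the row indices and deal them into k roughly equal groups
make_folds <- function(n, k) {
  split(sample(n), rep(seq_len(k), length.out = n))
}

splits <- list()
for (r in seq_len(repeats)) {
  folds <- make_folds(n, k)                       # fresh random split per repeat
  for (f in seq_len(k)) {
    test_idx  <- folds[[f]]                       # 1 fold held out for testing
    train_idx <- setdiff(seq_len(n), test_idx)    # the other k-1 folds for training
    splits[[length(splits) + 1]] <- list(train = train_idx, test = test_idx)
  }
}

length(splits)    # number of train/test rounds: k * repeats
```

Each element of `splits` holds the row numbers to train on and the row numbers to test on for one round.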

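What do you get out of this repeated training?

For every round, the Root Mean Square Error (RMSE), R-squared and Mean Absolute Error (MAE) are computed on the held-out fold. These are averaged for each candidate value of the tuning parameter, and the best model is then picked (for regression, caret uses the lowest RMSE by default).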
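The three metrics are easy to compute by hand. Below is a small base R sketch; `regression_metrics`, `obs` and `pred` are hypothetical names with made-up numbers. R-squared is taken here as the squared correlation between observed and predicted values, which to my understanding is also what caret reports by default.

```r
# Compute RMSE, R-squared and MAE for one held-out fold
regression_metrics <- function(obs, pred) {
  c(RMSE     = sqrt(mean((obs - pred)^2)),   # root of the mean squared error
    Rsquared = cor(obs, pred)^2,             # squared correlation (an assumption, see above)
    MAE      = mean(abs(obs - pred)))        # mean absolute error
}

obs  <- c(3, 5, 7, 9)   # made-up observed values
pred <- c(2, 5, 8, 9)   # made-up predictions

m <- regression_metrics(obs, pred)
print(m)
```

In cross validation you would call this once per held-out fold and then average the results.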

And this is how we use it in a Ridge model:

    ridge <- train(medv ~ ., data = BD, method = "glmnet",
                   tuneGrid = expand.grid(alpha = 0,
                                          lambda = seq(0.0001, 1, length = 10)),
                   trControl = cv)

So, here we are creating a Ridge Regression model which predicts the value of **medv** from all the other columns of the dataset **BD**. The method is glmnet, and the tuning grid tells the model that it is a ridge model (alpha = 0) and gives it 10 equally spaced values of lambda, ranging from 0.0001 to 1, to try.

Finally, the trControl parameter tells the model to use the cross validation settings we stored in ‘cv’.
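Once the model is trained, you can check which lambda won and what the cross-validated metrics look like. The sketch below is self-contained and makes one loud assumption: the article never says where **BD** comes from, so the `Boston` data from the MASS package (which has a `medv` column) stands in for it. It needs the caret, glmnet and MASS packages installed.

```r
library(caret)
library(MASS)    # provides Boston, ASSUMED here to stand in for the article's BD

BD <- Boston     # assumption: any data frame with a numeric medv column would do

set.seed(123)    # reproducible folds
cv <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

ridge <- train(medv ~ ., data = BD, method = "glmnet",
               tuneGrid = expand.grid(alpha = 0,
                                      lambda = seq(0.0001, 1, length = 10)),
               trControl = cv)

ridge$bestTune   # the alpha/lambda pair selected by cross validation
ridge$results    # averaged RMSE, Rsquared and MAE for each lambda tried
```

`ridge$results` has one row per lambda in the grid, so you can see how much the error changes across the 10 candidates.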

The next function which I love while creating models is varImp. This is a simple function from the caret package which ranks the variables in a model by how important they are. Lasso here is a previously trained caret model (for example, the same train() call as above with alpha = 1):

    varImp(Lasso, scale = FALSE)

In our output, at least 3 and at most 4 variables stand out as important enough to consider in the model. You can also plot the same using the function below:

    plot(varImp(Lasso, scale = FALSE))
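If you want to try varImp end to end, here is a minimal sketch. As before, the MASS `Boston` data is only an assumed stand-in for **BD**, and the `Lasso` model is built with the same train() call as the ridge model but with alpha = 1. To my understanding, for glmnet models caret bases the importance on the absolute size of the fitted coefficients, so treat the ranking as a guide rather than a verdict.

```r
library(caret)
library(MASS)    # provides Boston, ASSUMED here to stand in for the article's BD

BD <- Boston

set.seed(123)
cv <- trainControl(method = "repeatedcv", number = 10, repeats = 5)

# Same call as the ridge model, but alpha = 1 makes it a Lasso model
Lasso <- train(medv ~ ., data = BD, method = "glmnet",
               tuneGrid = expand.grid(alpha = 1,
                                      lambda = seq(0.0001, 1, length = 10)),
               trControl = cv)

imp <- varImp(Lasso, scale = FALSE)   # scale = FALSE keeps the raw importance values
imp$importance                        # one row per predictor
plot(imp)                             # dot plot of the same ranking
```

With scale = TRUE (the default) the importance values are rescaled to a 0 to 100 range, which is easier to read but hides the raw magnitudes.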

Just a short article covering a couple of concepts.

Keep Learning 🙂

The Data Monk
