Random Forest and Decision Tree Interview Questions
Topic - NLP Interview Questions. Welcome to the 2200 questions series from The Data Monk. In this series we will cover all the topics in a Question-Answer mode ...
HDFS and YARN command cheat sheet
Topic - HDFS and YARN command cheat sheet. Let's look into some of the top HDFS commands that you might require while working in a Big Data infrastructure.
Feature selection methods in Model Building
When we deal with Big Data, there is a high chance of having a plethora of columns and rows in our dataset. Selecting a group of columns is the crux of ...
Installing XGBoost and pandas_profiling in MacOS
Why are we writing this article to install a simple library? It's because there is a high probability that you will face issues while importing xgboost in your Jupyter notebook on MacOS. You ...
Best Fit Line in Linear Regression
The best fit line in linear regression is the one that minimizes the residual sum of squares. It is the line that is supposed to give the best predictions on the unseen ...
Regression Line vs Line of Best Fit
Understand the difference between the two concepts of Linear Regression. The regression line (curve) consists of the expected values of a variable ...
Data Science Model with high accuracy in training dataset but low in testing dataset
What do I mean when I say "The model has high accuracy in the training dataset but low in the testing dataset"? Data Science model interview question. Answer by Swapnil
Missing Value Treatment by mean, mode, median, and KNN Imputation | Day 5
One of the most important techniques in any Data Science model is to replace missing values with some numbers/values. We can't afford to remove the rows with missing values as ...
P-value in Linear Regression
Suppose a child in the family goes to school daily and one day, his teacher writes to his mother in the school diary that your son is very naughty and he was found fighting ...
How to solve data science Hackathon – PIMA Diabetes
We at The Data Monk always believe in learning while coding and practicing. In this process we come across multiple things, and one such learning tool is the Hackathon. There are different websites like ...