GRE Verbal | Barron’s 800 Destroyed | Day 10

We are already good with 220 words; let’s pass that 250 mark.
All the words given below are directly from Barron’s 800 most frequent words.

251. abeyance – temporary suspension

If you have ever created an Angle Priya profile, there is a high chance that the profile has fallen into abeyance 😛

252. accretion – growth in size

What is acceleration ?
Growth in speed, it sounds like accretion

253. aggrandize – to make larger or greater

Can you spot the word GRAND in aggrandize? It’s there because it grows in size

254. allure – the power to attract by charm

Easy

255. amalgamate – to combine into one unit

This was last used in Chemistry, where you amalgamate two elements to make an alloy or something.

You also amalgamate ideas

256. ambiguous – unclear

Easy

257. ambivalence – the state of having conflicting emotional attitudes

Okay, so ambi means both; for example, ambidextrous means a person who uses both hands equally well

So, ambi means two equal things and valence is emotion; when you have two emotions, they clash

258. amenity – something that increases comfort

That’s all fine, but what amenities are there in the hotel?

259. ardor – great emotion or passion

Ardor 2.1 is an awesome pub in Gurgaon (India), and what is it famous for? Its awesome whiskey, which will fill your spirit with emotion and passion

260. argot – a specialised vocabulary used by a specific group

Heyy..whatsapp man? I ain’t doing that thing no more bro

This is the teenage argot

261. beneficent – doing good; generous

A beneficent landowner or a beneficent democracy

262. burgeon – to grow and flourish

Berger is a paint company which helps in flourishing your apartment

263. burnish – to polish

burnish sounds like furnish which means to polish

Highly burnished armour

264. castigation – punishment

If you don’t castigate him when he actually makes the mistake, he will get away with it every time and never improve himself

265. catalyst – something which ameliorates a reaction

you know the meaning of catalyst, now with this you know ameliorate as well..Yeah, I am awesome 🙂

266. chasten – to correct by castigation/punishment

Try to use the words which you are learning on the way.
chasten is to correct something by giving punishment, example Mohammad Asif (match-fixing) 😛

267. chicanery – trickery or fraud

The word itself looks like a tricky one !!
Back in other days, a horse trade was often tinged with fraud and chicanery

268. cozen – To mislead by trick

chicanery and cozen are brothers
Please learn and remember all the synonyms

269. craven – cowardly

braven is someone who is brave, thus craven is someone who is a coward

270. defame – to malign (the Malinga example), to harm someone’s reputation

A certain set of people tried to defame Sachin Tendulkar by dragging his name in match-fixing

271. demur – to express doubt

After some demur, Nitin accepted the food offered to him by the fellow passenger in the train.

Don’t be like Nitin, you should accept the cold drink as well. It goes well with the food

272. denizen – a regular visitor; inhabitant

Remember, every Den has a denizen

Easyyyy

273. Desiccate – to dry completely

In summer our lips desiccate, so keep yourself hydrated

274. discrete – distinct

Please mention the discrete responsibilities of the UN

275. doggerel – poor verse

I will try to make one, you should also comment your doggerel

When life gives you aata
Don’t make a lachcha paratha

276. dross – waste; worthless matter

An alchemist tries to create gold from dross

277. enhance – to increase; improve

Work on the enhancement of the model

278. exculpate – to clear of blame; exonerate

See, We will always get questions where we need to fill one blank with two synonyms. So, please keep the synonyms in mind

279. exigency – crisis; urgent requirement

Dude, exigency even sounds like emergency 😛

280. filibuster – use of obstructive tactics in a legislature to block passage of law

This is too specific. Couldn’t find a good way to learn it, so I learnt it 😛

Keep Learning 🙂
Target 330


5 Most Important SQL questions before you appear for your Data Science Interview

SQL is the bread and butter of an analyst. You can’t survive in the Data Science industry without a grip on this ‘easy-looking’ query language. I have been interviewed by more than 30 companies in the past 3-4 years. SQL rounds are mostly rapid-fire rounds where you either keep answering all the questions or start missing them after a threshold.

This is one of those rounds in which you can impress the interviewer. Recently, I have been taking interviews and I can assure you that most of the logic asked in interviews is repeated; not just to some extent, but completely repeated 😛

Following is the point charter:-
5 correct – SQL God, You are going to nail 9/10 SQL interviews
4 correct – Really Good, clearing the SQL rounds should not be a problem
3 correct – Ummm..Dicey, you should be able to crack a few rounds
2 correct – You need at least 2-3 weeks before you start applying
1 correct – Go study, kid; don’t even think about applying
0 correct – Try an MBA/MS in Analytics/SBI PO/UPSC

Please comment your answer or send it directly to me over Linkedin

Q1. Suppose there is a Movie Theatre with 26 rows(A,B,C..Z) and in each row you have 6 seats. The structure of table is given below

Date       Row_No  Seat_No  Occupied  Name
04-Apr-20  A       A1       Yes       Kuchi Bhi
04-Apr-20  A       A2       Yes       Kuchi Bhi
04-Apr-20  A       A3       No
04-Apr-20  A       A4       No
04-Apr-20  A       A5       No
04-Apr-20  A       A6       No
04-Apr-20  B       B1       Yes       Kuchi Bhi
04-Apr-20  B       B2       No
04-Apr-20  B       B3       No
04-Apr-20  B       B4       Yes       Kuch Bhi
04-Apr-20  B       B5       No
04-Apr-20  B       B6       Yes       Kuchi Bhi
Table Name – PVR

First, let me know all the starting seats where the number of consecutive vacant seats is 2 (B2 here)
Secondly, write a generalised approach to solve for any number of vacant seats. Basically, you need to create a table with two columns,
1. Seat_No
2. Number of consecutive vacant seats

Q2. There are multiple ways to get the 3rd highest salary, write down at least three. This question is important because the moment you tell the interviewer the first way, he/she will ask to solve the same in any other way

Hint –
1. Naive Approach
2. Inner Query
3. Ranking


Q3. I don’t remember any interview which doesn’t have this question

Table A   Table B
1         1
1         1
1         1
1
1

There are two tables, column name in Table A is X and in Table B is Y

How many rows will the resultant have, if you do:-
a. inner join
b. left join
c. right join
d. outer join
e. cross join

Justify your answer

Q4. I own 5 restaurants in Bangalore. Following is the table of business; get me the date on which each of these restaurants crossed a total revenue of Rs.10,000

Restaurant          Date      Revenue
Bangalore Mandrian  01/04/20  5000
Bangalore Mandrian  02/04/20  4000
Bangalore Mandrian  03/04/20  3000
Bangalore Mandrian  04/04/20  6000
Boondock Bistro     01/04/20  7000
Boondock Bistro     02/04/20  3000
Boondock Bistro     03/04/20  5000
Oliver’s            02/04/20  11000
Oliver’s            03/04/20  8000
Oliver’s            04/04/20  9000
Whitefiled Social   02/04/20  10000
Punjabi By Nature   03/04/20  8000

Q5. There is a table with employee and manager information, the schema is given below. Write a query to get the EmployeeName and ManagerName.

Hint – Consider the edge cases, i.e. your query should also handle the Manager Name of the CEO

You can access – 300+ Data Science Interview Questions covering SQL,R,Python,Case Studies, Guesstimates, Statistics, and Machine Learning Questions in the embedded link.

Keep Learning 🙂

The Data Monk

Cross Validation and varImp in R

I was working on our next book – Linear, Ridge, LASSO, and Elastic Net Algorithm explained in layman terms with code in R, when we thought of covering the simple concepts which are quite helpful while creating models.

Cross Validation is one simple concept which definitely improves the performance of your model. A lot of you must be using this to create a k-fold cross validation

Let’s quickly go through this relatively simple concept and there is no better way than starting with code

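library(caret)  # trainControl(), train() and varImp() come from caret

cv <- trainControl(method = "repeatedcv",  # repeated k-fold cross validation
                   number = 10,            # k = 10 folds
                   repeats = 5,            # redo the whole k-fold procedure 5 times
                   verboseIter = TRUE      # print progress while training
)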

Here we are creating a variable that holds the resampling settings, i.e. whenever ‘cv’ is passed to a model definition, the dataset is divided into 10 roughly equal parts (folds); the model is trained on 9 parts and tested on the remaining one, and each part takes its turn as the test part, i.e. train on k-1 folds.

repeats = 5 means the whole 10-fold procedure is repeated 5 times, each time with a different random split.

What would you do with this repeated training?
We will compute different Root Mean Square Error, R Square and Mean Absolute Error, and will then decide the best model.
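
To make the 9-1 split and the repeats concrete, here is a minimal base R sketch of the same idea done by hand. It is only an illustration, not what caret runs internally: the toy data frame, the plain lm() model, and using squared correlation for R Square are all assumptions made for this example.

```r
# Hypothetical toy data: y depends linearly on x plus noise
set.seed(42)
n <- 100
toy <- data.frame(x = runif(n))
toy$y <- 3 * toy$x + rnorm(n, sd = 0.5)

k <- 10        # number of folds (number = 10)
repeats <- 5   # how many times the whole k-fold procedure is redone

results <- data.frame()
for (r in seq_len(repeats)) {
  # Randomly assign each row to one of the k folds (10 rows per fold here)
  fold <- sample(rep(seq_len(k), length.out = n))
  for (f in seq_len(k)) {
    train_set <- toy[fold != f, ]   # 9 parts for training
    test_set  <- toy[fold == f, ]   # 1 part held out for testing
    fit  <- lm(y ~ x, data = train_set)
    pred <- predict(fit, newdata = test_set)
    err  <- pred - test_set$y
    results <- rbind(results, data.frame(
      rep  = r,
      fold = f,
      RMSE = sqrt(mean(err^2)),
      MAE  = mean(abs(err)),
      R2   = cor(pred, test_set$y)^2
    ))
  }
}

# One row per held-out fold: 10 folds x 5 repeats = 50 rows
nrow(results)
# Average the three metrics over all held-out folds
colMeans(results[, c("RMSE", "MAE", "R2")])
```

Averaging over all 50 held-out folds is what gives a steadier error estimate than a single train/test split.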

And this is how we use it in a Ridge model

ridge <- train(medv~.,
              BD,
              method = 'glmnet',
              tuneGrid=expand.grid(alpha=0,lambda=seq(0.0001,1,length=10)),
              trControl=cv
              )

So, here we are creating a Ridge Regression model, predicting the value of medv on the dataset BD, and the method is glmnet. The tuning grid tells the model that it’s a ridge model (alpha=0) and supplies a total of 10 equally spaced lambda values ranging from 0.0001 to 1

After all this we specify the model to use the cross validation with trControl parameter
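
If you want to see what the tuning grid actually contains before handing it to train(), you can build it on its own. This is a small base R sketch; the variable name grid is just for illustration.

```r
# Build the same tuning grid used in the ridge call above
grid <- expand.grid(alpha = 0, lambda = seq(0.0001, 1, length = 10))

nrow(grid)          # one row per candidate lambda
unique(grid$alpha)  # alpha is fixed at 0 for every candidate
range(grid$lambda)  # smallest and largest lambda tried
diff(grid$lambda)   # constant step, i.e. the values are equally spaced
```

Each row is one (alpha, lambda) combination, so cross validation scores 10 candidate models here.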

The next function which I love while creating models is varImp. This is a simple function which finds out the most important variables in a set of variables. It is part of the caret package.

varImp(Lasso, scale = F)

Here we have at least 3 and at max 4 important variables to consider in the model. You can also plot the same using the below function

plot(varImp(Lasso, scale = F))

Just a short article covering a couple of concepts.

Keep Learning 🙂

The Data Monk

Ridge vs LASSO vs Elastic Net Regression

Ridge and LASSO are two important regression models which come in handy when Linear Regression fails to work.

This topic needs a separate mention, because it’s important to understand the COST function and the way it’s calculated for Ridge, LASSO, and any other model.

Let’s first understand the cost function

Cost function is the amount of damage you are going to incur if your prediction goes wrong.

In the layman’s term, suppose you run a pizza shop and you are predicting some values for the number of pizzas sold in the coming 12 months. There would definitely be a delta between the actual and predicted value in your ‘Testing data set’, right?
This is denoted by

Sum of Squared Errors (SSE) = sum of (predicted - actual)^2 over all data points

i.e. there is 0 loss when you hit the correct prediction, but there is always a loss whenever the prediction deviates from the actual value.

This is your basic definition of cost function.
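
Here is a tiny sketch of that calculation in R for the pizza shop. The sales numbers are made up purely for illustration.

```r
# Hypothetical monthly pizza sales: actual vs predicted (made-up numbers)
actual    <- c(120, 135, 150, 160)
predicted <- c(118, 140, 150, 155)

errors <- predicted - actual   # the delta for each month: -2, 5, 0, -5
sse    <- sum(errors^2)        # square each delta and add them up
sse                            # 4 + 25 + 0 + 25 = 54
```

The third month was predicted exactly, so it adds nothing to the cost; every other month adds the square of its miss.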

Linear, LASSO, Ridge, xyz, every algorithm tries to reduce the penalty i.e. Cost function score

When we talk about Ridge regression, it involves one more point in the above mentioned cost function


Here SSE is the Sum of Squared Errors, i.e. the sum of (predicted - actual)^2

Ridge regression C.F. = SSE + lambda * sum(Beta^2)
The lambda * sum(Beta^2) part represents L2 Regularization
LASSO Regression C.F. = SSE + lambda * sum(|Beta|)
The lambda * sum(|Beta|) part represents L1 Regularization
Elastic Net Regression C.F. = SSE + lambda * [(1-alpha) * sum(Beta^2) + alpha * sum(|Beta|)]

when alpha = 0, the Elastic Net model reduces to Ridge, and when it’s 1, the model becomes LASSO, other than these values the model behaves in a hybrid manner.
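
The three cost functions can be sketched as plain R functions. This is a simplified illustration, assuming lambda multiplies the whole penalty; the function names are made up here, and glmnet itself scales the terms a little differently (it divides the squared error by 2n and halves the ridge part).

```r
sse <- function(actual, predicted) sum((predicted - actual)^2)

# beta: vector of slope coefficients (the intercept is not penalised)
ridge_cost <- function(actual, predicted, beta, lambda) {
  sse(actual, predicted) + lambda * sum(beta^2)          # L2 penalty
}
lasso_cost <- function(actual, predicted, beta, lambda) {
  sse(actual, predicted) + lambda * sum(abs(beta))       # L1 penalty
}
elastic_net_cost <- function(actual, predicted, beta, lambda, alpha) {
  sse(actual, predicted) +
    lambda * ((1 - alpha) * sum(beta^2) + alpha * sum(abs(beta)))
}

# Made-up numbers, just to exercise the functions
actual    <- c(1, 2, 3)
predicted <- c(1.5, 2, 2.5)
beta      <- c(2, -1)

ridge_cost(actual, predicted, beta, lambda = 0.5)   # 0.5 + 0.5 * 5 = 3
lasso_cost(actual, predicted, beta, lambda = 0.5)   # 0.5 + 0.5 * 3 = 2
# alpha = 0 gives the ridge cost, alpha = 1 gives the LASSO cost
elastic_net_cost(actual, predicted, beta, lambda = 0.5, alpha = 0)
elastic_net_cost(actual, predicted, beta, lambda = 0.5, alpha = 1)
```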
V.V.I. Lines of wisdom below

The lambda * Beta^2 (or lambda * |Beta|) part is called the penalty term, and lambda determines how severe the penalty is. Beta is nothing but the slope of the linear regression line.
So you can see that we are increasing the cost by adding a penalty term to the SSE; this way we are making the present model worse on the training data ourselves 😛

The only difference between L1 and L2 Regularisation, or LASSO and Ridge Regression, is the cost function. And the difference itself is quite evident, i.e. |Beta| vs (Beta)^2

You already know what alpha is, right? It is the mixing weight between the Ridge and LASSO penalties.

Now lambda is the multiplier that sets how severe the penalty is.

LASSO – Least Absolute Shrinkage and Selection Operator

Why do we need any other regression model?

Say you have two points on a co-ordinate plane (assume these two points are your training dataset, i.e. only two data points in your training dataset); you can easily draw a line passing through these two points.
A linear regression does the same, but now suppose you have to test this LR with 7 data points in your test dataset. Take a look at the diagram below

In the above pic, the two circles represent the two data points in the training dataset for your LR model. This model has perfect accuracy on the training dataset, but in the testing dataset you have 7 different data points on which your model will suffer from a large prediction error.
Prediction error is nothing but the vertical distance between the predicted and the actual value.

In this case, other regression models come to the rescue by changing the cost function.

Remember, till now cost function was just the Sum of Square of the difference between predicted and actual, correct?

Now we modify the line of regression in such a way that it is less accurate on training dataset but gives a better result in test dataset. Basically we compromised with the accuracy in the training dataset.

Now the line looks something like the one below

We compromise on the training part but nail the testing part 😛

Now we know that we need to reduce the model’s accuracy on the training data, but how do we do that?
By shrinking the coefficient values of the features learnt while creating the model. Reiterating what was mentioned above:
The lambda * Beta^2 (or lambda * |Beta|) part is called the penalty term, and lambda determines how severe the penalty is. Beta is nothing but the slope of the linear regression line.
So you can see that we are increasing the cost by adding a penalty term to the SSE
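
A quick way to watch the slope shrink is the one-predictor case, where the ridge slope has a simple closed form. The sketch below assumes a single centred predictor with no intercept and a cost of SSE + lambda * Beta^2; the two training points are made up.

```r
# Two hypothetical training points (centred, so no intercept is needed)
x <- c(-1, 1)
y <- c(-2, 2)

# Minimising sum((y - b*x)^2) + lambda * b^2 gives
# b = sum(x*y) / (sum(x^2) + lambda)
ridge_slope <- function(lambda) sum(x * y) / (sum(x^2) + lambda)

lambdas   <- c(0, 1, 2, 6)
slopes    <- sapply(lambdas, ridge_slope)
train_sse <- sapply(slopes, function(b) sum((y - b * x)^2))

# lambda = 0 is plain linear regression: slope 2 and zero training error.
# As lambda grows the slope shrinks and the training error goes up.
data.frame(lambda = lambdas, slope = slopes, train_sse = train_sse)
```

With lambda = 0 the line passes through both training points; a larger lambda flattens the line, which is exactly the compromise on training accuracy described above.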

The key difference between these techniques is that Lasso shrinks the coefficients of the less important features all the way to zero, thus removing some features altogether. So, this works well for feature selection in case we have a huge number of features.
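
Why does Lasso hit exactly zero while Ridge doesn't? In the special case of uncorrelated, standardised features, the Lasso solution is the ordinary coefficient pulled towards zero by a fixed amount (soft thresholding), while Ridge divides every coefficient by the same factor. The sketch below only covers that special case; the starting coefficients, the threshold and the lambda are made-up numbers.

```r
# Soft thresholding: move b towards zero by t, and stop at zero
soft_threshold <- function(b, t) sign(b) * pmax(abs(b) - t, 0)

# Hypothetical least-squares coefficients for three features
ols_beta <- c(big = 3.0, medium = 1.0, small = 0.2)

lasso_like <- soft_threshold(ols_beta, t = 0.5)  # fixed pull of 0.5
ridge_like <- ols_beta / (1 + 0.5)               # proportional shrink, lambda = 0.5

rbind(ols = ols_beta, lasso = lasso_like, ridge = ridge_like)
```

The small coefficient becomes exactly 0 under the Lasso-style shrinkage (the feature is dropped), whereas under Ridge it only gets smaller.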

The Lasso method on its own does not find which features to shrink. Instead, it is a combination of Lasso and Cross Validation (CV) which allows us to determine the best Lasso parameter.

These regressions help reduce variance by shrinking the parameters, which makes our predictions less sensitive to them.

Remember, when you have few data points, Linear Regression might show good accuracy on your training dataset but poor predictions on the testing dataset. In that case do try Ridge, LASSO, and Elastic Net regression.

We will soon be publishing an article containing complete code covering all these algorithms via a Hackathon solution or on an open source dataset.

Post your questions, if you have any

Keep Learning 🙂
The Data Monk

GRE Verbal | Barron’s 800 Destroyed | Day 9

Barron’s 800 most frequent words is one of the best consolidated lists of words asked/referred to in GRE/GMAT.
Also, this is a personal practice. You can DEFINITELY comment your way of learning a particular word 🙂

Let’s start Day 9 with some easy words and we will

221. celestial – concerned with sky

It’s simple: celestial bodies are stars, moons, etc.

222. cartography – science of making maps

What could be an easier word where the word itself says graphy i.e. a subset of geography which has to do with maps, and I am awesome

223. carnal – related to sexual desires

Have you ever tried Omegle.com ??
It’s a website where people with carnal desires meet.

224. Cardinal – of utmost importance

I personally have more trouble remembering words like these.
Even Google doesn’t say much about “important” when it comes to cardinal.

Do comment, anyone; maybe I can learn

225. clairvoyant – one who can predict the future

Today is the 10th day of lock-down due to Corona virus. Friends in my facebook group were commenting on the old pics of each other

There was one guy who had posted a picture of Dhoni and Sangakara holding the WorldCup trophy together, a day before the Finale. My friend put a caption ‘Sangakara holding the trophy for the last time’ meaning that India will win the WC

Someone today commented,’Clairvoyant’

226. causal – involving a cause

Easy, much like cartography

227. commensurate – proportional or similar

One suggestion – When you are planning to learn 800 words in a span of 15-20 days, then you need to have an eye for the patterns in the words.

Here, I can see common hidden in the word commensurate, by common I will refer to something which is common in measurement i.e. proportional

This is all you have to do, brothers, and the blessings will keep coming 😛

228. dupe – to trick

What happens when you are doped? You are vulnerable to be tricked

229. disinterested – not biased or neutral

230. uninterested – bored

Look for the difference in the above words, you would always want to have a disinterested judge in your case
You are uninterested in a Maths class

231. doctrinaire – dogmatic; orthodox

The doctrinaire economic policy

232. efficacy – efficiency

Prove your efficacy by learning all the words today. Look close to the word, eye-to-eye, you can see efficiency hiding under the hood of efficacy.

233. flourish – an embellishment or ornamentation

To flourish is to be very good at something,
embellishment is also there in Barron 800 which means to decorate something.
can you see the word ‘bell’? Now imagine a christmas tree where you hang big bells, that is embellishment

234. harangue – long, angry speech; tirade

They were subjected to a ten-minute harangue by two policemen when they were caught roaming outside during Corona lock-down

235. gustatory – affecting the sense of taste

You want to know what a gustatory person looks like?

This cutie-pie is happy while having a lemon..Ewwww!!

236. futile – worthless; useless

The idea of cooking Pizza on a pan proved to be a futile activity.
The idea of opening the innings by Hardik Pandya was a futile idea

237. indolent – habitually lazy; idle

Nitin was so lazy that he did not even wish to move without a purpose

Ambani kids are indolent to the pleasure of life

238. incorporate – to add something into another thing which already exists

IT Employee?
Please incorporate the changes discussed over the call

239. implode – explode, but in inner direction

Opposite of explode

240. penchant – inclination

If you are an avid reader then you would have come across this word multiple times. And you have a penchant for reading

241. oscillate – to move back and forth

Easyy fizzy, also remember vacillate also means and sounds the same

242. ostentatious – showy; pretentious

Ostentatious is such a difficult word that if I hear someone using it, I would definitely say that he himself is ostentatious

243. paean – song of joy

Then what was the song of sorrow?
Damn, even I don’t remember..let me think…eulogy?
I looked into the sheet to find that I was close but not correct, eulogy means toasting something about the dead.

Dirge is the song of the funeral

I swear I did not look into the sheet, Good work there, Nitin 🙂

244. paucity – scarcity

What was the use of having two words in English with the same meaning and almost the same pronunciation. Paucity and scarcity, vacillate and oscillate, Ram and Shyam and Babu Bhaiya

245. Placate – Soothing;Pacify

Dude, there are at least a few more words which are similar in meaning, like mollify and alleviate (make less severe)

His presence placated the tense atmosphere. Woooo that was subtle 😛

246. permeable – penetrable

Easy being a science student

247. Placid – Calm

This is one deadly word, placid looks nothing like calm 😛

Placid is like an extremely chill person; whatever happens, that person will look placid
Example – Dhoni

248. porous – Full of holes

Too easy to discuss

249. plutocracy – society ruled by the wealthy

Pluto is the planet of wealth, cracy comes from democracy i.e. the ruler
The above is a work of imagination, just learn the word 😛

250. quagmire – difficult situation

We are in a quagmire against Corona 🙁

Keep Learning 🙂

Target 330


Linear, LASSO, Elastic Net, and Ridge Regression in Layman terms (R) with complete code – Part 1

Linear,LASSO, Elastic Net, and Ridge Regression are the four regression techniques which are helpful to predict or extrapolate the prediction using the historic data.

Linear regression has no penalty term, so it doesn’t depend on the value of alpha or lambda.

LASSO takes alpha as 1 and Ridge takes it as 0; Elastic Net is the middle way, where the value of alpha varies between 0 and 1.

In this article We will try to help you understand how to build different models from scratch with ready to use code. You don’t even have to download any dataset as the data is already available in R.

The data is called Boston Housing Data and the aim is to predict the price of House in Boston using the following parameters

CRIM: Per capita crime rate by town
ZN: Proportion of residential land zoned for lots over 25,000 sq. ft
INDUS: Proportion of non-retail business acres per town
CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
NOX: Nitric oxide concentration (parts per 10 million)
RM: Average number of rooms per dwelling
AGE: Proportion of owner-occupied units built prior to 1940
DIS: Weighted distances to five Boston employment centers
RAD: Index of accessibility to radial highways
TAX: Full-value property tax rate per $10,000
PTRATIO: Pupil-teacher ratio by town
B: 1000(Bk — 0.63)², where Bk is the proportion of [people of African American descent] by town
LSTAT: Percentage of lower status of the population
MEDV: Median value of owner-occupied homes in $1000s

Access the data i.e. store the data in your local and then explore the basic of the dataset. I always try 5-6 commands to get a gist of the dataset
?DataSet – To know the column definitions (only in open source dataset)
head(dataset) – To see the first 6 rows of all the columns
str(dataset) – To get data type and first few values
summary(dataset) – To get the mean median percentile max min of each columns, basically you understand the range of numerical data

Before Loading Boston Housing Data, I personally import a few libraries which might or might not help in the analysis..I am Lazy as fuck !!

install.packages("mlbench")
install.packages("psych")
library(caret)
library(dplyr)
library(xgboost)
library(Matrix)
library(glmnet)
library(psych)
library(mlbench)

Understand the basics of the dataset, but first import the data set

data("BostonHousing")
BD <- BostonHousing

Now BD has the complete data set; you can explore the dataset’s column definitions with the following code

?BostonHousing

Let's look at the head of the data set
head(BD)

While exploring multiple things, I came across one of the packages in R which has an awesome correlation function pairs.panels(dataset[])
Correlation requires only numeric variables

pairs.panels(BD[c(-4,-14)])
The above code will get you all the correlation and scatter plot which will help you understand the distribution as well as correlation between variables. The matrix looks something like the one below

Do try this visualisation, this might look a bit cluttered, but it’s actually gold

If you are not comfortable with the above plot and are more into conventional form of looking at correlation then try the cor() function

cor(BD[c(-4,-14)])

Eliminate collinearity, but why?
Okay, say you want to predict the salary of employees and there is a high correlation between the age and the number of working years in the dataset. In this case having both variables in the model does not make sense, as both represent the same thing.

High Correlation leads to multicollinearity and thus overfitting
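
Here is a small sketch of the age vs working years situation on made-up data. The numbers and the 0.9 cutoff are assumptions for illustration only, not a rule.

```r
# Hypothetical employees: working years is almost fully determined by age
set.seed(1)
n <- 200
age <- round(runif(n, 22, 60))
years_worked <- pmax(age - 22 + round(rnorm(n, sd = 1)), 0)
predictors <- data.frame(age, years_worked)

# Correlation matrix of the predictors
round(cor(predictors), 3)

# A correlation this close to 1 means the two columns carry nearly the same information
r <- cor(predictors$age, predictors$years_worked)
r > 0.9
```

When a pair like this shows up, keeping just one of the two variables is usually enough.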

Now, let’s start with Linear Regression Model. The complete code is provided at the end of the tutorial

sam = sample, i.e. the random row labels used for the split
The train and test commands create a 70:30 division of the data into train and test
Always create a Cross Validation parameter. Here I am creating one with 10 parts and 5 repeats.
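
The split itself can be sketched in base R as below. The seed and the stand-in data frame are assumptions for this example, so the exact train/test counts will differ from the ones quoted in this article; with real data you would use BD in place of df.

```r
set.seed(222)   # arbitrary seed so the split is reproducible

# Stand-in for BD: any data frame works; 506 rows mirrors Boston Housing
df <- data.frame(id = 1:506, value = rnorm(506))

# Label every row 1 (train) or 2 (test) with probabilities 0.7 and 0.3
sam <- sample(2, nrow(df), replace = TRUE, prob = c(0.7, 0.3))

train <- df[sam == 1, ]
test  <- df[sam == 2, ]

# Roughly 70:30, not exactly, because each row is labelled at random
c(train = nrow(train), test = nrow(test))
```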

#We have 387 observations in train and 119 observations in Test
#Create Cross Validation parameter: in CV the training data is split into n
#parts; the model is built on n-1 parts and the error is estimated on the
#remaining part, and this is repeated x times. You can use verboseIter to
#monitor the progress while the code is running. verboseIter is optional

cv <- trainControl(method="repeatedcv",
                    number=10,
                    repeats = 5,
                    verboseIter = T
                    )

In short, you are creating a parameter to divide a dataset into 10 parts and keep 9 to train and 1 to test it and you are doing it 5 times to eliminate the chances of random bias.
verboseIter = T gives a good experience when you see your code doing some fancy stuff. Take a slow-mo and put it on Instagram 😛

set.seed(34)
linear <- train(medv ~ .,
                BD,
                method = 'lm',
                trControl = cv)
linear$results
linear
summary(linear)

We will do all the EDAs in some other tutorial. In this article we are only focusing on covering the explanation and code of each Regression types

This was the basic Linear Regression, we will evaluate all the models at the end of the series. First let’s create all the models

Next is Ridge Regression

set.seed(123)
ridge <- train(medv~.,
BD,
method = 'glmnet',
tuneGrid = expand.grid(alpha=0,
lambda = seq(0.0001,1,length=10)),
trControl=cv)


We will cover only Linear and Ridge Regression here.
In the next article we will cover LASSO and Elastic Net.
The third article will have the complete evaluation, picking up the best model, and predicting the test cases

GRE Verbal | Barron’s 800 Destroyed | Day 8

Today we will cross the 25% mark and you would have already covered 200+ most frequently used words from Barron’s 800

191. abdicate – to give up a position

It sounds like vacate, right?
The govt. was accused of abdicating its responsibility

192. vivisection – dissection, surgery or painful experiments performed on a living animal

The way I learnt it: vivisection is done on a plant, where we cut down the stem or something and then plant the pieces separately.

I write these stories from my memory and it has nothing to do with the reality. It’s just a way to remember things

193. viable – capable of working successfully , feasible

Rakesh, don’t give me a hypothetical solution, we need a viable answer to the scenario.

Creating a Spacecraft to reach Mars in 24 hours is not at all a viable prototype

194. vexation – irritation, annoyance

Whether you are a girl or a boy, waxing is always irritating and thus waxation is irritation..Ooopsss !! It’s vexation, ignore the first two letters.

If you like to learn this way, then you are my Boi 😛

195. vacuous – empty

Inspired from vacuum. Easy

196. travesty – parody, exaggerated imitation

This one might be hard to learn, so comments are more than welcome

Have you watched TVF? The first video they created was on Roadies and it was titled as Rowdies. Link below
https://www.youtube.com/watch?v=8sgYXNwNLXg

TraVesty has the two letters of TV, so you can guess that it has something to do with TVF i.e. parody.

197. Transient – Temporary

Transient is also a man who moves from place to place i.e. he/she is temporary.

It’s hard for me to remember the word’s meaning because every time I look at this word, I feel like it has something to do with ‘important’ 🙁

The way I learn the words is transient in nature, but I got to do this

198. tirade – Long, Violent speech

A tirade of abuse.

Remember the Bollywood roast by All India Bakchod, that was a tirade of abuse to the Bollywood stars

199. Tenuous – Weak

It definitely looks like something which deals with tension, but NO!!
Tenuous means weak

200. Tautology – Unnecessary repetition

I will either get paid or i will not get paid

I am not sure how to make you understand in English, but in Hindi or particularly in my regional language we say “Kya taey-taey bol rhe ho” as in why are you repeating so much

Though it definitely sounds like a branch of science or may be its a type of science(Tautology in tautology)

201. Talisman – Charm to bring good luck

Simple meaning: something that brings good luck, generally associated with some pendant or necklace we wear to shoo away bad spirits


202. Syllogism – a form of deductive reasoning with a major premise, minor premise and a conclusion.

Okay, so if you have ever prepared for GMAT then you would understand the pain of syllogism. In fact in GRE also we have some questions on the same line in the essays

In syllogism you have a major argument, a minor supporting argument and a conclusion.

203. supposition – the fact of assuming to be true

This one is easy, don’t waste time

204. subside – to settle down or to grow quiet

The storm subsides
The floods subside as quickly as they arise

This one is easy too

205. subpoena – notice someone to appear in court

This one you will either remember right away, or never 😛

In general a subpoena is done to make it compulsory to appear for a hearing. Mostly, the court has to ask the MLAs/MPs to appear as a part of subpoena

206. stratified – arranged in layers

It helps if you are from a Data Science background. Stratified sampling is the sampling in which say suppose you have 20 rows of data, then you might divide it into 4 groups with first 5 rows as A, then B, etc. i.e. you divide it into layers

207. sporadic – irregular

spora means sowing in Greek; have you ever watched people sowing in a field?
It’s ALWAYS irregular. This is the root of sporadic

Another example of sporadic is the spread of any virus like Corona which spreads in a very irregular manner

208. spendthrift – extravagant; one who spends too much

thrift is speed and spend is spend
Nitin was a spendthrift and a heavy gambler

209. singular – unique, extraordinary; odd
singular, as such, points to one person, one specific person, and if someone is termed singular then he/she is extraordinary

Plural is crowd, singular is special

210. sextant – navigation tool that determines latitude and longitude

What did you think? Something related to sex??
Nope !!

This is sextant

211. saturate – soak thoroughly

To get saturated, as in to get completely exhausted.
The exercise was so saturating

212. rubric – protocol; a set of instruction

A Rubik’s cube is solved with a set of instructions, a simple set of steps. In the same way, a rubric is a set of instructions. It’s mostly written as a heading in a document.

Ever tried to install any game?
There you will find a list of pre-requisites, these are sort of rubric

213. reticent – Silent, reserved

Manmohan Singh was reticent. Reticent sounds like remaining silent

Manmohan Singh was a saintly man, but he always remained silent; he was reticent


214. resolve – determination, firmness of purpose

I always get confused by these common terms. My way of learning it
He solved, so he was smart
He resolved because he was determined

You can skip this if you already know the meaning

215. resolution – determination

Same as resolve

216. reprobate – morally unprincipled person

My way of learning.
Nitin was on probation, he bantered(passing comments which could or couldn’t have decent meaning) with his colleagues, some times way too much

Nitin was fired in his probation due to his reprobate behaviour

217. pristine – untouched; pure

Easy. That’s a pristine white shirt
Pristine copy of the old magazine

218. provident – providing for future need

Have you ever heard of a provident fund?
Now it makes much more sense to me at least

219. pusillanimous – cowardly

Remember the word puissant(B800)?
It meant not a pussy i.e. brave
pusillanimous is self explanatory i.e. cowardly

220. precarious – uncertain

a ladder can be precarious? when? When it’s broken 😛
Much like everyone else’s, my choice of career is precarious i.e. uncertain. I am a Data Scientist, I will be writing UPSC this year, I am preparing for GRE just to get those 330 marks, I have also attempted GMAT, I have a small online book business, and I cook well enough to open a cafe.

So, my choice of life is_______

Keep Learning:)
Target 330



GRE Verbal | Barron’s 800 Destroyed | Day 7

It’s such a beautiful day, I already learnt 30 words and I would like to continue. Let’s learn Barron’s 800 words for GRE in the most layman and simple way

161. emollient – mollify; soothing

There are at least 4 words which mean soothing, including mollify, alleviate (to lessen), and eMOLLIent

You gotta remember it this way

162. endemic – belonging to a particular area

There are a few diseases which pertain to a particular area. Maybe Ebola was one of them

163. Pandemic – Belonging to a range of countries

Corona is a pandemic i.e. a lot of countries are affected by this virus

164. entomology – Scientific study of insects

ent – ANT which is an insect

165. ephemeral – short lived

Ephemeral is short lived and perennial is long lived.
Ephemeral sounds like ‘abhi maral’ i.e. he died just now 😛

166. etymology – Origin or history of a word

Have you read the book Word Power Made Easy by Norman Lewis?
It has a very interesting way to learn words by teaching the origin of each word.
It’s definitely an awesome book to start with

167. euphoria – feeling of extreme happiness

Remember Palash Sen? The singer? Kabhi aana tu meri gali
His band was Euphoria and it used to make me very happy 🙂

168. Euthanasia – Mercy Killing

This is a very, very specific term. In Asia, Japan is the only country which allows, or basically has no law against, mercy killing

Japan is the only country to entertain Euthanasia in Asia

169. Exorcise – to expel evil spirits

The Exorcism of Emily Rose: to expel the evil spirit from Emily

170. Extrapolation – to estimate the projection for future

It’s an easy term to remember if you are from a Data Science background.
Linear Regression extrapolates values for the future by using historic data

171. fervor – warmth and intensity of emotion

fever – warmth of body, so is fervor

These animals will love and hate with equal fervor

172. flora – plants of a region

Flora and Fauna, easy

173. Flux – flowing; a continuous moving

If you are from a mechanical engineering background, then it’s easy for you
in-flux and out-flux

174. forestall – to prevent; delay

fore means ahead of time and stall means to delay, so to forestall is to prevent something ahead of time
He forestalled critics by offering a defense of the project.

175. fresco – a painting done on a wall

What a beautiful word, a very very specific word, much like Euthanasia (mercy killing)

Wall painting

176. fusion – union

Atomic fusion, easy
Don’t know why the words I am choosing are easy and frequently used !!

177. garrulous – very talkative

Though the word looks like it has something to do with anger, remember that it is a synonym of loquacious.

What is the antonym of loquacious ? Laconic

178. gerrymander – to divide an area into voting districts in a way that favours a political party

Another special word with a special meaning.

Gotta learn it by heart, just like this

179. gregarious – outgoing; sociable

Most of the extroverts are gregarious in nature

180. guileless – not cunning

So, guile means cunning. Guileless means not cunning and so does the word artless.

I met a girl at the airport who was artless 🙂

181. herbivorous – plant eater

No-brainer

182. homogeneous – uniform in composition

No-brainer

183. hyperbole – purposeful exaggeration for effect

My aunt is a bit of a drama queen, and she uses hyperbole in almost every sentence
Little children often speak almost exclusively in hyperbole.

Hyper – jyada (Hindi for too much)
Bola – spoke (Hindi)

Hyperbole – one who speaks too much

184. idolatry – idol worship

India follows idolatry

185. impair – to damage; injure

to pair is to grow strong, to impair is to damage or injure someone

186. impede – To block; arrest

Remember the story about arrest?
To arrest is to stop or block, and so is to impede, i.e. to stop something from proceeding

187. ingenuous – Naive; innocently simple

188. ingenious – Genius

The above two words look the same, the only difference is u vs i
i = Genius
u = Idiot

189. intangible – which can’t be touched

Human feelings are intangible

190. introspective – contemplating one’s own thoughts and feelings

John liked him because he was not introspective or self-critical and quite free from self-consciousness.

There were some easy words and some difficult.

Do keep on revising the words. It’s gonna be a long road, but the end product is definitely rewarding.

Keep Learning 🙂
Target 330


GRE Verbal | Barron’s 800 Destroyed | Day 6

Learn Barron’s 800 words for GRE in a simple, layman’s way

We will start with 133 and will learn a lot of words today

133. Abscission – Act of cutting; the natural separation of a leaf or other part of a plant

Take a pair of scissors and cut the leaf into two parts – It’s called ab’scission’

134. aesthetic – related to beauty or art

The aesthetic beauty of the place was mesmeric

135. amulet – ornament worn as a charm against evil spirit

If you are a Bollywood movie fan then you can connect amulet with Taabeez, or what the actors wore in the movie Jaani Dushman 😛

136. analgesic – medicine that reduces pain; pain-relieving

An anaesthetic is used to create a partial loss of feeling; in the same way, an analgesic is used to relieve or alleviate (Barron word) pain

Let’s talk about a few -pathy related words which are asked in GRE

137. empathy – understand and share feelings of others
138. apathy – Looks like sharing a feeling, or a good word, but it actually means LACK OF INTEREST. He showed apathy towards Mathematics from the beginning

139. antipathy – pathy means feeling, empathy means sharing the same feeling, apathy means LACK OF INTEREST, so what is left? Dislike!
Antipathy is a feeling of dislike

140. archeology – Study of material evidence of past Human Life

Easy peasy

141. arrest – to stop or block; IMPEDE (learn it, this one is coming up later)

Arrest is mostly used to show that someone is captured, but why would the police capture anyone?
Because the police wanted to stop that person

142. artifact – Items made by human craft

Whatever is handmade is an artifact (mostly of cultural importance)

This artifact was created to pour ‘Cutting-Chai’ or Masala Chai for Akbar

143. ascetic – practising self-denial

143 means I LOVE YOU in teenage argot (Barron 800; argot means a language specific to a group).

An ascetic is someone who practices self-denial. What is the name of this website?
The Data MONK – a monk practices an ascetic lifestyle

144. astringent – Harsh and severe

Whenever I had to write a letter to the principal to complain about a student, I always ended the letter like this

‘Sir, Please give stern and stringent punishment to Amit for his bad behaviour’

If you have a friend named Harsha, and if he is stern, then the word to describe him is astringent

145. asylum – Place of refuge

It’s easy

146. austere – very simple; stern

In this corona pandemic, everyone decided to celebrate an austere Christmas

147. bifurcate – to divide into two parts

Machine Learning algorithms are bifurcated into Supervised and Unsupervised algorithms

148. bovine – cow like

Though it’s unlikely that GRE uses this word, remember the first three letters

bov – b+1 = c, o stays o, v+1 = w, which gives cow, so bovine = cow like

149. broach – to mention for the first time

How do you talk to a person who is new to your team?
‘Hey Bro, let me coach you’
Broach – to mention for the first time

150. chivalry – bravery and good behaviour towards women

It’s easy

151. complement – something that completes or makes up whole

In a right-angled triangle, the two non-90-degree angles are complementary: together they add up to 90 degrees


152. Connoisseur – Expert in matters of taste

I would love to be a chocolate/wine connoisseur

153. Converge – To tend to meet; to move towards something

It’s easy, converge and diverge

154. Deride – To belittle someone or to mock

Below are two eggs deriding the middle one

155. Derivative – something that is derived; unoriginal

To derive from something i.e. It was not original. When you derive a formula, you start from something basic and then go ahead to prove some other formula/theory

156. discrepancy – difference

There is a discrepancy between the numbers which you are reporting and the numbers which Harish is reporting to the CEO

157. Egotistical – Excessively Self-centred

A whole lot of ego, meaning ego beyond all limits.
I myself am an egotistical human

158. Elegy – Poem or expression of grief

In the complete Barron 800 series, there are three words which are related to death – Dirge (song sung when someone dies), Eulogy (speech after someone dies), and Elegy (poem or song of grief)

Learn these

159. Elixir – A substance believed to have the power to cure ills

Have you ever played the game Clash of Clans?
It’s a well-known strategy mobile game, where we have normal blue elixir and dark elixir.
Blue elixir was used to heal normal troops
Dark elixir was used to heal heavy troops.

160. embellish – enhance
I added a lot of bells in order to embellish the Christmas tree

You can comment and add more things to help everyone understand

Keep Learning 🙂
Target 330

One Hot Encoding – Feature Engineering

So, I just started solving the latest Hackathon on Analytics Vidhya, Women in the loop. Be it a real-life Data Science problem or a Hackathon, one-hot encoding is one of the most important parts of your data preparation.

If you don’t know about it yet, then you are definitely missing out on something which can boost your rank.

One hot encoding is a representation of categorical variables as binary vectors. What this means is that we want to transform a categorical variable or variables to a format that works better with classification and regression algorithms.

This is how One Hot Encoding works

How not to encode a categorical variable?
Basically, say you have a column with Course Details like Data Science, Software Development, Testing, etc. and you want to use this categorical variable in your model. The tempting way is to replace each category with a number in the same column. So Data Science, Software Development, Testing, etc. become the values 0, 1, 2, etc.

Now the problem is that 2>1>0, and the model might treat the categories as ordered in this way. So, to get things sorted, you need to tell the model, ‘bro, these are all categories, and you dare not treat them as numbers’

What to do?
Create a new binary column for each category. So Data Science, Software Development, Testing, etc. each become a column holding only 0 or 1. This whole process is called One Hot Encoding.
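
To make the difference concrete, here is a rough sketch in base R on made-up data. The data frame `courses` and its values are assumptions for illustration only, not the hackathon data, and `model.matrix()` is used here instead of the caret function from the walkthrough so that the snippet needs no extra packages.

```r
# Toy data: a hypothetical Course_Domain column (not the hackathon data)
courses <- data.frame(
  Course_Domain = factor(c("Data Science", "Testing",
                           "Software Development", "Testing"))
)

# The "how not to" way: each category becomes an integer,
# which silently implies an order between the categories
label_encoded <- as.integer(courses$Course_Domain)

# One hot encoding: one 0/1 column per category.
# "- 1" drops the intercept so every category keeps its own column.
one_hot <- model.matrix(~ Course_Domain - 1, data = courses)

print(label_encoded)
print(colnames(one_hot))
print(rowSums(one_hot))  # every row contains exactly one 1
```

Each row ends up with a single 1 in the column of its own category, so the model no longer sees any ordering between the categories.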

Example below

There was some JSON error while directly posting the code, so pasting the screenshot

Sales is the name of the column which we need to predict. The sample is split 8:2 into train and test
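
The original code for this step was posted as a screenshot, so the following is only a sketch of how an 8:2 split could look in base R. The data frame `full_data` and its values are hypothetical stand-ins; only the `Sales` column name and the `train_data` name come from the post.

```r
# Hypothetical stand-in for the hackathon data; values are made up
full_data <- data.frame(
  ID = 1:10,
  Sales = c(81, 79, 75, 80, 41, 60, 55, 90, 66, 72)
)

set.seed(42)  # make the random split reproducible

# Pick 80% of the row indices at random for training
n_train <- round(0.8 * nrow(full_data))
train_idx <- sample(seq_len(nrow(full_data)), size = n_train)

train_data <- full_data[train_idx, ]   # the 8 parts
test_data  <- full_data[-train_idx, ]  # the remaining 2 parts

print(dim(train_data))
print(dim(test_data))
```

Splitting on row indices like this keeps every row in exactly one of the two sets.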
These are the initial column names. Here Course Domain and Course Type are the two columns which need One Hot Encoding treatment
ohe <- c("Course_Domain","Course_Type")
train_data = as.data.frame(train_data)

Put the names of the variables which need OHE treatment in one place and convert train_data into a data frame
# dummyVars() comes from the caret package
library(caret)
dummies_train = dummyVars(~ Course_Domain+Course_Type , data = train_data)

df_ohe = as.data.frame(predict(dummies_train, newdata = train_data))

Here we are creating the dummy variables and converting the two columns into them. Let's see how the columns are named in the data frame df_ohe
colnames(df_ohe)
[1] "Course_Domain.Business" "Course_Domain.Development"
[3] "Course_Domain.Finance & Accounting" "Course_Domain.Software Marketing"
[5] "Course_Type.Course" "Course_Type.Degree"
[7] "Course_Type.Program"

So, every category in the two columns was given its own new column, and each has the value 0 or 1. Awesome !!

df_train_ohe = cbind(train_data[,-c(which(colnames(train_data) %in% ohe))],df_ohe)
The new list of columns in your training data set is below
colnames(df_train_ohe)
[1] "ID" "Day_No"
[3] "Course_ID" "Short_Promotion"
[5] "Public_Holiday" "Long_Promotion"
[7] "User_Traffic" "Competition_Metric"
[9] "Sales" "Course_Domain.Business"
[11] "Course_Domain.Development" "Course_Domain.Finance & Accounting"
[13] "Course_Domain.Software Marketing" "Course_Type.Course"
[15] "Course_Type.Degree" "Course_Type.Program"

You started with 11 variables, and now you have 16 columns. Feed them into your XGB or Linear Regression. By the way, you still have 7 more days for the Hackathon. Try it 🙂

Keep Learning 🙂

The Data Monk