P-value in Linear Regression
Suppose a child goes to school every day, and one day his teacher writes to his mother in the school diary: your son is very naughty and was found fighting with another kid. This situation is quite common in schools and, like any mother, this child’s mother says: no, my son is not naughty, it must have been the other kid who provoked him. Simple…agreed?
Now, if you wish to run a hypothesis test of whether this child is really naughty, the claim you want to find evidence for is that he is naughty (that is your hypothesis of interest, the alternative hypothesis). You then formulate the opposite claim, that the child is not naughty, as your null hypothesis, and you treat it as true until the evidence says otherwise.
So your two hypotheses would be:
Null: The child is not naughty
Alternative: The child is naughty
Then you would look for evidence by collecting data and putting the null hypothesis to the test.
Coming back to the story from the child’s school…
After some days, there is another complaint from the school that this child has been fighting with another kid and, again, his mother does not accept that her child is at fault.
The situation repeats itself a third time, with a third kid, and now the mother begins to suspect that her child may really be naughty.
Interestingly, this third instance in our example plays the role of what statistics calls the threshold of significance (or the level of significance): the point beyond which the evidence can no longer be comfortably explained away.
When the same child is reported for fighting with yet another kid, the mother has no option but to accept that her child really is naughty and finds ways to fight with other kids. In statistical terms, she rejects the null hypothesis, and the evidence is said to be “statistically significant” because it has gone beyond the level of significance. The p-value measures exactly this: it is the probability of seeing evidence at least this strong if the null hypothesis (the child is not naughty) were true. The smaller the p-value, the harder it becomes to keep believing the null.
The above situation is purely hypothetical, but it could well be the subject of a statistical analysis somewhere.
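To put numbers on the story, here is a minimal Python sketch of how a p-value could be computed for it. Every number in it is an assumption made up for illustration and does not come from real data: we suppose the mother watches for 20 weeks, that a child who is not naughty still has a 5% chance of being reported for a fight in any given week, and that the level of significance is 0.05. The helper name binomial_p_value is also hypothetical.

```python
from math import comb


def binomial_p_value(k, n, p0):
    """One-sided p-value: P(X >= k) when X ~ Binomial(n, p0).

    This is the chance of seeing k or more complaints in n weeks
    if the null hypothesis (the child is not naughty) were true.
    """
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# All three values below are illustrative assumptions, not data from the story.
ALPHA = 0.05    # level of significance
N_WEEKS = 20    # number of weeks observed
P_NULL = 0.05   # weekly chance of a complaint for a child who is not naughty

for complaints in range(1, 5):
    p = binomial_p_value(complaints, N_WEEKS, P_NULL)
    verdict = "reject the null" if p < ALPHA else "fail to reject the null"
    print(f"{complaints} complaint(s): p-value = {p:.4f} -> {verdict}")
```

With these assumed numbers, the p-value is still above 0.05 after the third complaint and first drops below it at the fourth, which mirrors the mother's change of mind. Different assumptions about the weekly complaint rate or the observation period would move that tipping point.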