How would you explain the concept of a p-value to a layman?
If you are into Data Science, then you must have heard about the p-value.
I could have started with a textbook definition built around probability, significance, the null hypothesis, etc. But that’s already there on multiple blogs.
I want to simplify this term so that you “understand” it rather than memorize it.
We will start with the null hypothesis. What is a null hypothesis?
So, Nitin was the monitor of Class VIII B. He had one job, i.e. to write down the names of the classmates who made noise in the absence of the teacher.
One day he wrote Tahseen’s name on the blackboard. The teacher asked Tahseen whether he had been making any noise.
As usual, Tahseen denied it. Now, the teacher had to believe either the monitor or Tahseen.
He assumed that Tahseen did not make the noise. Why? Because it’s easier to disprove this.
See, it’s always easier to disprove something with an example than to prove it. For example, if the teacher catches Tahseen making noise, then the null hypothesis, i.e. “Tahseen did not make the noise”, is disproved.
But if we take the null hypothesis as “Tahseen made noise” and the teacher does not catch him making noise on one occasion, that does not disprove it; he may simply have been quiet while the teacher was watching.
Coming back to the question:
Null hypothesis: Tahseen did not make the noise
Alternate hypothesis: Tahseen made noise
The next day, Nitin again complained that Tahseen was making noise, which Tahseen again denied.
On the next three days, his name was on the blackboard as well. Now the teacher has reached a threshold where he can say with confidence, “Dude, you were making noise, because you have crossed my benchmark of complaints and the evidence is statistically significant enough to reject my null hypothesis. Thank you Nitin :)”
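The benchmark the teacher set before starting the experiment is called the significance level (usually written as alpha). The p-value is a different thing: it is the probability of seeing evidence at least as strong as what was actually observed (here, that many complaints) if the null hypothesis were true. The smaller the p-value, the harder it is to keep believing the null hypothesis.
In general, a p-value < 0.05 is treated as statistically significant. It means that, if the null hypothesis were true, evidence this extreme would show up less than 5% of the time, so we reject the null hypothesis. It does not mean there is a 95% chance that the null hypothesis is false.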
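To put numbers on the classroom story, here is a minimal sketch in Python. It rests on a made-up assumption that the story itself never states: if Tahseen is innocent, the monitor would still write his name on any given day with probability 0.5, independently of other days. The function name `binomial_p_value` and the constants are only for illustration. Under that assumption, the p-value after each complaint is the chance of getting that many complaints from an innocent Tahseen.

```python
from math import comb


def binomial_p_value(complaints, days, false_report_rate):
    """Probability of at least `complaints` reports in `days` days,
    assuming the null hypothesis (Tahseen is innocent) is true."""
    return sum(
        comb(days, k)
        * false_report_rate ** k
        * (1 - false_report_rate) ** (days - k)
        for k in range(complaints, days + 1)
    )


ALPHA = 0.05             # significance level, fixed before the experiment
FALSE_REPORT_RATE = 0.5  # hypothetical: chance of a wrong complaint per day

# Tahseen's name goes up every single day, so complaints == days.
for day in range(1, 6):
    p = binomial_p_value(day, day, FALSE_REPORT_RATE)
    verdict = "reject null" if p < ALPHA else "keep null"
    print(f"after {day} complaint(s): p-value = {p:.4f} -> {verdict}")
```

With these assumed numbers the p-value halves with every complaint and drops below 0.05 only on the fifth day, which is when the teacher stops believing Tahseen. A different false-report rate would move that threshold, which is exactly why the benchmark has to be fixed before looking at the data.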
I have appeared for a ton of interviews and it’s very hard to dodge this question.
100 Questions to Master Forecasting in R: Learn Linear Regression, ARIMA, and ARIMAX
What do they ask in top Data Science Interviews: 5 Complete Data Science Real Interviews Q and A
What do they ask in Top Data Science Interview Part 2: Amazon, Accenture, Sapient, Deloitte, and BookMyShow
Keep learning 🙂
The Data Monk