Explain p-value in simple terms
If you are into Data Science, then you must have heard about p-value.
I could have started with a textbook definition that wanders through probability, significance, the null hypothesis, and so on. But that is already there on multiple blogs.
We want to simplify this term so that you “understand” it rather than memorize it.
Let’s understand p-value in simple terms
We will start with the null hypothesis. What is a null hypothesis?
Nitin was the monitor of Class VIII B. He had one job: to write down the names of classmates who made noise while the teacher was away.
One day he wrote Tahseen’s name on the blackboard. The teacher asked Tahseen whether he had been making noise.
As usual, Tahseen denied it. Now the teacher had to believe either the monitor or Tahseen.
He assumed that Tahseen did not make noise. Why? Because that assumption is the easier one to disprove.
It is always easier to disprove something with an example than to prove it. For example, if the teacher catches Tahseen making noise, then the null hypothesis, i.e. “Tahseen did not make noise”, is disproved.
But if we take the null hypothesis to be “Tahseen made noise” and the teacher does not catch him making noise on one occasion, that does not prove the null hypothesis.
Coming back to the question
The teacher's null hypothesis: Tahseen did not make noise
Alternate hypothesis: Tahseen made noise
The next day Nitin again complained that Tahseen was making noise, and Tahseen again denied it.
On the next three days his name was on the blackboard as well. Now the teacher had reached a threshold where he could say with confidence: “Dude, you were making noise, because you have reached a benchmark of complaints, and that is statistically significant evidence that my null hypothesis was wrong. Thank you Nitin :)”
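The benchmark the teacher set before starting the experiment is called the significance level. The p-value is the number we compare against it: the probability of seeing evidence at least this strong (here, this many complaints) if the null hypothesis were true.
In general, a p-value < 0.05 is treated as statistically significant. It means that evidence this strong would turn up less than 5% of the time if the null hypothesis were true, so we reject the null hypothesis. It does not mean there is a 95% chance that the null hypothesis is false.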
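To put a number on the classroom story, here is a minimal sketch in Python. It rests on an assumption that is not part of the story: even if Tahseen is innocent, his name still ends up on the board by mistake on 30% of days. The function name binomial_p_value and the 0.3 figure are made up for illustration. The p-value is then the chance of seeing his name on the board on all 5 days if he really was not making noise.

```python
from math import comb


def binomial_p_value(n_days, n_complaints, p_false_complaint):
    """One-sided p-value: chance of at least n_complaints complaints in
    n_days when each day has an independent p_false_complaint chance of
    a complaint under the null hypothesis."""
    return sum(
        comb(n_days, k)
        * p_false_complaint ** k
        * (1 - p_false_complaint) ** (n_days - k)
        for k in range(n_complaints, n_days + 1)
    )


# Significance level: the benchmark fixed before looking at the data.
ALPHA = 0.05

# Hypothetical assumption: an innocent Tahseen is still named on 30% of days.
# In the story his name was written on 5 days out of 5.
p_value = binomial_p_value(n_days=5, n_complaints=5, p_false_complaint=0.3)
reject_null = p_value < ALPHA

print(f"p-value = {p_value:.5f}, reject null hypothesis: {reject_null}")
```

Under these assumed numbers the p-value works out to about 0.0024, which is below 0.05, so the teacher rejects "Tahseen did not make noise". With only 2 complaints in 5 days the same function gives roughly 0.47, and the teacher would have no grounds to reject it.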
I have appeared for a ton of interviews and it’s very hard to dodge this question.
Keep learning 🙂
The Data Monk