Statistics gyan – TheDataMonk


Statistics gyan

What is a confusion matrix?
A confusion matrix is a 2×2 matrix of counts built from the combinations True/False and Positive/Negative. It is typically used in the prediction world to understand how effective a classification algorithm is. In each cell name, the second word (Positive or Negative) is the class the model predicted, and the first word (True or False) says whether that prediction matched the actual value. In True-Positive, for example, Positive is the prediction and True means the prediction was correct, so the actual value was positive as well.

                      Predicted Positive    Predicted Negative
Prediction correct    True – Positive       True – Negative
Prediction wrong      False – Positive      False – Negative
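
To make the four blocks concrete, here is a minimal sketch in plain Python that counts them from a list of actual values and a list of predictions. The function name confusion_counts and the sample data are made up for illustration; True stands for "has the disease".

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (True = has the disease)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1      # predicted yes, actually yes
        elif not p and not a:
            counts["TN"] += 1      # predicted no, actually no
        elif p and not a:
            counts["FP"] += 1      # predicted yes, actually no (Type-1 error)
        else:
            counts["FN"] += 1      # predicted no, actually yes (Type-2 error)
    return counts


# Hypothetical data for six patients
actual    = [True, True,  False, False, True, False]
predicted = [True, False, False, True,  True, False]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```

The four counts always add up to the number of patients, which is a quick sanity check on any confusion matrix.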

Q.) What is True-Positive?
A.) This means the actual value is positive and the predicted value is also positive. For example, suppose we use a model to predict whether a disease is present in a patient. The 1st block then counts the cases for which we predicted yes and the patients actually were suffering from the disease.

For a predictive model or a classifier – This value should be high

Q.) What is True-Negative?
A.) This means the actual value is negative and the predicted value is also negative. For example, we predicted that a patient is not suffering from a disease, and he was indeed found not to be suffering from it.

For a predictive model or a classifier – This value should also be high

Q.) What is False-Positive?
A.) This is also known as a Type-1 error. Here we predicted yes, but the patient doesn't actually have the disease.

This indicates an error in your algorithm. Since we almost always deal with sensitive data, this value should be as low as possible. Suppose we predicted that a patient is suffering from diabetes, the doctor prescribed medication based on our algorithm, and it was later found that the patient was not suffering from diabetes. This would raise serious concern.
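
How costly false positives are can be put into numbers. A rough sketch, using made-up counts: the false positive rate is the share of healthy patients wrongly flagged, and precision is the share of "yes" predictions that were right. The helper names and the counts below are assumptions for illustration only.

```python
def false_positive_rate(fp, tn):
    """Share of actually-negative cases predicted positive: FP / (FP + TN)."""
    total_negative = fp + tn
    return fp / total_negative if total_negative else 0.0


def precision(tp, fp):
    """Share of positive predictions that were correct: TP / (TP + FP)."""
    total_predicted_positive = tp + fp
    return tp / total_predicted_positive if total_predicted_positive else 0.0


# Hypothetical counts: 40 true positives, 10 false positives, 90 true negatives
tp, fp, tn = 40, 10, 90

fpr = false_positive_rate(fp, tn)   # 10 / 100
prec = precision(tp, fp)            # 40 / 50
print(f"False positive rate: {fpr:.2f}, precision: {prec:.2f}")
```

With these counts, 10 of the 100 healthy patients are wrongly told they have the disease, and 8 out of 10 "yes" predictions are correct.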

Q.) What is False-Negative?
A.) This is also known as a Type-2 error. Here we predicted no, but the patient actually has the disease.

In a setting like disease detection, this is the major concern of an algorithm. No matter how accurate the model is overall, if the number of False-Negatives is high, the model should not be introduced.

This indicates an error in your algorithm. Suppose we predicted that a patient is not suffering from cancer and later found out that the patient was suffering from it. Then there is not much use for the algorithm.
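
The point that overall accuracy can hide a false-negative problem is easy to see with numbers. A minimal sketch with made-up counts: out of 1000 patients, 20 have cancer, and a hypothetical model catches only 5 of them. Recall (also called sensitivity) is the share of actual positives the model finds.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def recall(tp, fn):
    """Share of actually-positive cases the model caught: TP / (TP + FN)."""
    total_positive = tp + fn
    return tp / total_positive if total_positive else 0.0


# Hypothetical counts: 1000 patients, 20 of them actually have cancer
tp, fn = 5, 15     # only 5 of the 20 sick patients were caught
fp, tn = 5, 975    # 980 healthy patients, 5 of them wrongly flagged

acc = accuracy(tp, tn, fp, fn)   # 980 / 1000
rec = recall(tp, fn)             # 5 / 20
print(f"Accuracy: {acc:.2f}, recall: {rec:.2f}")
```

Here the model is right 98% of the time yet misses 75% of the cancer patients, which is why accuracy alone is not enough to judge it.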

 

Let me know if you need more examples to understand this. For more such questions, go here

