EXL Interview Questions | Data Subsets
Question
Say you have two subsets of a dataset for which you know their means and standard deviations. How do you calculate the blended mean and standard deviation of the total dataset? Can you extend it to K subsets?
Answer
n1 = No. of observations in ‘region 1’
n2 = No. of observations in ‘region 2’
X1 = mean of region 1.
X2 = mean of region 2.
S1^2 = variance of region 1.
S2^2 = variance of region 2.
Find the mean of the total group as
X = (n1*X1 + n2*X2)/(n1 + n2)
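Find the variance of the total group as
S^2 = [n1*(S1^2 + d1^2) + n2*(S2^2 + d2^2)]/(n1 + n2)
where d1 = X1 - X and d2 = X2 - X are the deviations of each region's mean from the total-group mean X. The blended standard deviation is the square root of this variance. This form assumes S1^2 and S2^2 are population variances (divisor n, not n - 1).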
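As a quick sanity check, the two-group formulas can be compared against a direct calculation on pooled raw data. The sketch below uses made-up values for the two regions and assumes population variances (divisor n); the variable names are illustrative only.

```python
import math
import statistics

# Hypothetical raw data for two regions (illustrative values only).
region1 = [2.0, 4.0, 6.0]
region2 = [1.0, 3.0, 5.0, 7.0, 9.0]

n1, n2 = len(region1), len(region2)
x1, x2 = statistics.fmean(region1), statistics.fmean(region2)
# Population variances (divisor n), which is what the formula assumes.
s1_sq, s2_sq = statistics.pvariance(region1), statistics.pvariance(region2)

# Blended mean of the total group.
x = (n1 * x1 + n2 * x2) / (n1 + n2)

# Deviation of each region's mean from the blended mean.
d1, d2 = x1 - x, x2 - x

# Blended variance and standard deviation of the total group.
var = (n1 * (s1_sq + d1**2) + n2 * (s2_sq + d2**2)) / (n1 + n2)
sd = math.sqrt(var)

# Direct computation on the pooled raw data, for comparison.
pooled = region1 + region2
direct_mean = statistics.fmean(pooled)
direct_var = statistics.pvariance(pooled)

print(x, direct_mean)
print(var, direct_var)
```

Both printed pairs should agree up to floating-point rounding, which is what makes the summary-only formula usable when the raw observations are not available.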
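Yes, we can extend it to k subsets. With ni, Xi and Si^2 the size, mean and variance of subset i:
X = (n1*X1 + ... + nk*Xk)/(n1 + ... + nk)
S^2 = [n1*(S1^2 + d1^2) + ... + nk*(Sk^2 + dk^2)]/(n1 + ... + nk)
where di = Xi - X.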
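One possible way to code the k-subset case is a small helper that takes only the summaries of each subset. This is a minimal sketch, not a definitive implementation: the function name `blend` and the (n, mean, sd) tuple layout are assumptions made for illustration, and each sd is assumed to be a population standard deviation (divisor n).

```python
import math
from typing import Iterable, Tuple


def blend(subsets: Iterable[Tuple[int, float, float]]) -> Tuple[float, float]:
    """Combine (n, mean, sd) summaries into the overall (mean, sd).

    Hypothetical helper; assumes every sd is a population sd (divisor n).
    """
    subsets = list(subsets)
    total_n = sum(n for n, _, _ in subsets)
    if total_n <= 0:
        raise ValueError("total number of observations must be positive")

    # Size-weighted mean of the subset means.
    mean = sum(n * m for n, m, _ in subsets) / total_n

    # Each subset contributes its own variance plus the squared
    # deviation of its mean from the overall mean.
    var = sum(n * (sd**2 + (m - mean) ** 2) for n, m, sd in subsets) / total_n
    return mean, math.sqrt(var)


# Three hypothetical subsets given as (n, mean, sd).
overall_mean, overall_sd = blend([(2, 1.0, 1.0), (2, 5.0, 1.0), (4, 3.0, 0.0)])
print(overall_mean, overall_sd)
```

For these three subsets the overall mean is 3.0 and the overall variance is 2.5, so the printed standard deviation is the square root of 2.5.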