EXL Interview Questions | Data Subsets

Question

Say you have two subsets of a dataset for which you know their means and standard deviations. How do you calculate the blended mean and standard deviation of the total dataset? Can you extend it to K subsets?

Dhruv2301 4 years 1 Answer 1595 views Great Grand Master 0

Answer ( 1 )

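  1. n1 = number of observations in subset 1
    n2 = number of observations in subset 2
    X1 = mean of subset 1
    X2 = mean of subset 2
    S1^2 = variance of subset 1
    S2^2 = variance of subset 2
    Find the mean of the total group as
    X = (n1*X1 + n2*X2)/(n1 + n2)
    Let d1 = X1 - X and d2 = X2 - X be the deviations of the subset means from the total mean.
    Find the variance of the total group as
    S^2 = [n1*(S1^2 + d1^2) + n2*(S2^2 + d2^2)]/(n1 + n2)
    The blended standard deviation is the square root of S^2. The formula is exact when S1^2 and S2^2 are population variances (divisor n rather than n - 1).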

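    Yes, this extends to K subsets. With ni, Xi and Si^2 the size, mean and variance of subset i, the total mean is
    X = (n1*X1 + ... + nK*XK)/(n1 + ... + nK)
    and the total variance is
    S^2 = [n1*(S1^2 + d1^2) + ... + nK*(SK^2 + dK^2)]/(n1 + ... + nK)
    where di = Xi - X.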
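
The K-subset case can be sketched in Python as below. This is a minimal illustration, not part of the original answer: the function name `blend_stats` is hypothetical, and it assumes the inputs are population standard deviations (divisor n), which is the case where the formula is exact.

```python
import math
import statistics


def blend_stats(counts, means, stds):
    """Combine per-subset sizes, means and population standard deviations.

    Returns (total_mean, total_std). Assumes each std was computed with
    divisor n (population form), not n - 1.
    """
    if not counts or not (len(counts) == len(means) == len(stds)):
        raise ValueError("counts, means and stds must be non-empty and equal in length")

    total = sum(counts)
    # Total mean: the size-weighted average of the subset means.
    mean = sum(n * m for n, m in zip(counts, means)) / total
    # Total variance: size-weighted average of (within-subset variance
    # + squared deviation of the subset mean from the total mean).
    var = sum(n * (s ** 2 + (m - mean) ** 2)
              for n, m, s in zip(counts, means, stds)) / total
    return mean, math.sqrt(var)


# Check against the statistics computed directly on the pooled data.
a = [1, 2, 3]
b = [4, 5, 6, 7]
mean, std = blend_stats(
    [len(a), len(b)],
    [statistics.fmean(a), statistics.fmean(b)],
    [statistics.pstdev(a), statistics.pstdev(b)],
)
print(mean, statistics.fmean(a + b))   # the two values should agree
print(std, statistics.pstdev(a + b))   # up to floating-point rounding
```

The same call works for any number of subsets, since the function only loops over the three lists. If the subset standard deviations were computed with divisor n - 1, they would need converting to the population form first; this sketch does not handle that.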
