SAP interview Questions | Various Variables

Question

How to find the correlation between the categorical variable and the continuous variable?

Dhruv2301 55 years 3 Answers 1008 views Great Grand Master 0

Answers ( 3 )

  1. The chi-squared test of independence indicates whether two categorical variables are associated. It measures the statistical significance of the association and says nothing about its strength; Cramér's V, computed from the chi-squared statistic, quantifies that strength on a scale from 0 to 1.

    For ordinal variables, use Spearman's rank correlation to check whether there is a monotonic association between them.

    Pearson’s correlation coefficient measures the strength of the linear relationship between two variables on a
    continuous scale.
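
    The three measures above can be sketched in Python as follows. This is a minimal illustration, assuming pandas, NumPy and SciPy are available; the column names and toy values are hypothetical, and Cramér's V is computed by hand from the chi-squared statistic without any bias correction.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency, pearsonr, spearmanr

# Hypothetical toy data: two categorical columns.
df = pd.DataFrame({
    "region": ["north", "north", "south", "south",
               "north", "south", "north", "south"],
    "plan": ["basic", "basic", "premium", "premium",
             "basic", "premium", "premium", "basic"],
})

# Chi-squared test of independence on the contingency table.
# correction=False disables Yates' continuity correction for 2x2 tables.
table = pd.crosstab(df["region"], df["plan"])
chi2, p_value, dof, expected = chi2_contingency(table, correction=False)

# Cramér's V: strength of association derived from the chi-squared statistic.
n = table.to_numpy().sum()
k = min(table.shape) - 1
cramers_v = np.sqrt(chi2 / (n * k))

# Spearman (monotonic, rank-based) versus Pearson (linear) on numeric data.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # y grows monotonically with x, but not linearly
rho, _ = spearmanr(x, y)
r, _ = pearsonr(x, y)

print(f"chi2={chi2:.3f}, p={p_value:.3f}, Cramér's V={cramers_v:.3f}")
print(f"Spearman rho={rho:.3f}, Pearson r={r:.3f}")
```

    On this toy data Spearman's coefficient reaches 1 because the relationship is perfectly monotonic, while Pearson's stays below 1 because it is not linear.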

  2. If there are only two variables, one continuous and one categorical, an ordinary correlation
    coefficient is not directly applicable, because correlation measures the linear dependency
    between two numeric variables: whether one tends to increase or decrease as the other increases.
    However, in a supervised learning setting where both variables are independent variables (features), you can
    one-hot encode the categorical variable and compute a correlation matrix.
    You can also use an ANOVA test, which determines whether a categorical variable has a significant effect
    on the mean of a continuous variable.
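
    Both options mentioned in this answer can be sketched as follows. This is an illustration only, assuming pandas and SciPy; the "city" and "price" columns and their values are made up for the example.

```python
import pandas as pd
from scipy.stats import f_oneway

# Hypothetical toy data: one categorical and one continuous column.
df = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "price": [10.0, 11.0, 12.0, 20.0, 21.0, 22.0, 30.0, 31.0, 32.0],
})

# Option 1: one-hot encode the categorical column, then build a
# correlation matrix of the continuous column against each dummy column.
dummies = pd.get_dummies(df["city"], prefix="city", dtype=float)
corr_matrix = pd.concat([df["price"], dummies], axis=1).corr()
print(corr_matrix["price"])

# Option 2: one-way ANOVA, testing whether the mean price differs by city.
groups = [g["price"].to_numpy() for _, g in df.groupby("city")]
f_stat, p_value = f_oneway(*groups)
print(f"F={f_stat:.1f}, p={p_value:.2e}")
```

    Note that each dummy column only compares one level against all the others, so a single level can show a near-zero correlation (city B here) even when the categorical variable as a whole clearly matters; the ANOVA test looks at all levels at once.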

  3. Case 1: When an Independent Variable Only Has Two Values

    Point Biserial Correlation:
    If a categorical variable has only two values (e.g. true/false), we can convert it to a numeric datatype (0 and 1). Once it is numeric, we can compute the correlation with the dataframe.corr() function; the Pearson correlation between a 0/1 variable and a continuous variable is the point-biserial correlation.

    Case 2: More Than Two Values: Use ANOVA (Analysis of Variance)
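
    Case 1 can be sketched like this, assuming pandas and SciPy. The "is_member" and "spend" columns are hypothetical; scipy.stats.pointbiserialr is used only to cross-check the value obtained from pandas.

```python
import pandas as pd
from scipy.stats import pointbiserialr

# Hypothetical toy data: a two-valued categorical and a continuous column.
df = pd.DataFrame({
    "is_member": [True, True, True, False, False, False],
    "spend": [50.0, 60.0, 70.0, 20.0, 30.0, 40.0],
})

# Convert the two-valued variable to 0/1 so it can be treated as numeric.
df["is_member_num"] = df["is_member"].astype(int)

# Pearson correlation on the 0/1 column is the point-biserial correlation.
r_pandas = df["is_member_num"].corr(df["spend"])

# Cross-check with SciPy, which also returns a p-value.
r_scipy, p_value = pointbiserialr(df["is_member_num"], df["spend"])

print(f"pandas r={r_pandas:.4f}, scipy r={r_scipy:.4f}, p={p_value:.4f}")
```

    For Case 2, a one-way ANOVA such as scipy.stats.f_oneway applied to the continuous values grouped by category is one possible choice.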
