## Statistics

Statistics Data Science Interview Questions

These Statistics Data Science Interview Questions will help you answer most of the statistics-related questions in an analytics interview. You just need to check whether you understand the concept. Go through the answers from other users and you will understand the logic much more clearly.

ANSWERED

- Find the coefficient of variation.
- According to the empirical rule, approximately what percent of the data should lie within μ±2σ?
- Which of the following describe the middle part of a group of numbers?
- Explain the concept of p-value in simple terms.
- Calculate entropy.
- Explain type 1 error in simplest terms.
- Explain type 2 error in simple terms.
- Define R squared error.
- How do you measure distribution?
- What is worse, type 1 error or type 2 error?
- Which test to use when you have less than 30 sampling units?
- Differentiate between sample and population variance on the basis of their formulae.
- What percentage of values lies between the mean and one standard deviation (both positive and negative)?
- What are the absolute measures of dispersion?
- Differentiate between chi-square, z-test and t-test.
- When is median a better measure than mean?
- What is the relation between Power and Size of a biased and unbiased test?
- When to use Mean and Median while doing missing value treatment?
- What is positive skewness and negative skewness?
- The probability that an item is at location A is 0.6, and 0.8 at location B. What is the probability that the item would be found on the JP Morgan website?
- What is the probability of drawing a white marble at least once?
- What is the probability of picking consecutive numbers if the sample is arranged in descending order?
- What is the range of values for the multiple linear correlation coefficient?
- How to deal with seasonality?
- In time series modelling how can we deal with multiple types of seasonality like weekly and yearly seasonality?
- What is the difference between Type-1 and Type-2 error?
- What will be your expected payout in the given scenario?
- Which box has higher probability of getting the same colour?
- What is Fourier transform?
- What is ANOVA test?
- What is the Central Limit Theorem?
- What is the probability that the same color marble is drawn twice?
- What is the ADF test in time series analysis?
- What is p-value? Explain it using some examples.
- Explain Holt winters in brief.
- How useful is the boxplot graph?
- Define ARIMA in simple terms.
- What’s the probability of rolling at least one 3?
- How many games do you expect to win?
- There are 6 marbles in a bag, 1 is white. You reach in the bag 100 times. After drawing a marble, it is placed back in the bag. What is the probability of drawing the white marble at least once?
- What is the middle value of an ordered array?
- What is AIC in ARIMA?
- What is the sum of deviations about the mean?
- What is the relationship between sample size and margin of error?
- How can you tell if a given coin is biased?
- What is Degree of Freedom in ARIMA?
- How do you assess the statistical significance of an insight?
- What is the probability that at most three people show up in a four-hour period?
- What is the probability that you picked a fair coin?
- What’s the probability that Aman wins?
- Given two fair dice, what is the probability of getting scores that sum to 4? To 8?
- Let’s say you’re playing a dice game. You have 3 dice. What’s the probability of rolling at least one 6?
- Explain a probability distribution that is not normal and how to apply it.
- Give an example of outlier values and how they can be treated.
- Is having more outlier values a good thing or a bad thing?
- Why is it important to have information about the bias-variance trade-off while modelling?
- What is the probability of getting all the three cards red, when there are 5 black, 6 red and 7 blue cards?
- What are statistics, under what circumstances do they go out of date, and how do you update them?
- What is the expected payout?
- What might be the benefits of running an A/A test, where you have two buckets who are exposed to the exact same product?
- What is Variance Inflation Factor? How is it used?
- Which test is used for more than two independent population location parameters?
- What is unbiasedness as a property of an estimator? Is this always a desirable property when performing inference?
- How would you go about investigating if a certain trend in distribution is due to an anomaly?
- Considering a positive test, what is the probability of having that condition?
- You have a deck and you take one card at random and guess what the card is. What is the probability you guess right?
- How would you build and test a metric to compare two users’ ranked lists of movie/TV show preferences?
- You are given N positive samples and (4N-2) negative samples. How will you calculate the entropy?
- Why does L1 regularization cause parameter sparsity whereas L2 regularization does not?
- What is the statistical test for data validation?
- What do you think about the fairness of the coin?
- What is the probability of getting a HTT combination before getting a TTH combination?
- A fair six-sided die is rolled twice. What is the probability of getting 1 on the first roll and not getting 6 on the second roll?
- What is unbiasedness as a property of an estimator?
- You have built a multiple regression model. Your model R² isn’t as good as you wanted. For improvement, you remove the intercept term, and your model R² becomes 0.8 from 0.3. Is it possible? How?

UNANSWERED

- How do you generate a uniform number using a non-uniform distributed function?
- What is ADF test?
- Why does L1 regularization cause parameter sparsity whereas L2 regularization does not?
- How many cards would you expect to draw from a standard deck before seeing the first ace?
- A coin is flipped 1000 times and 560 times heads show up. Do you think the coin is biased?
- You are given a coin, but you are not sure whether it is biased. What should be your approach to check the fairness of the coin?
- You want to run a regression to predict the probability of a flight delay, but there are flights with delays of up to 12 hours that are really messing up your model. How can you address this?
- What is cross entropy loss?
- Which type of error related to hypothesis testing could prove to be fatal or catastrophic?
- What is an example of a data set with a non-Gaussian distribution?
- Given three random variables independently and identically distributed from a uniform distribution of 0 to 4, what is the probability that the median is greater than 3?
- If we talk about Type-I and Type-II error, which of them will create critical mistakes in your data?
- What is the role of trial and error in data analysis?
- What do you understand by Maximum Likelihood Estimation?
- How do you predict “x” at time t-2?
- What is the exact test?
- How would you assess the validity of the result by AB test?
- What do you understand by statistical power?
- Let’s say you can play a coin flipping guessing game either once or as a 2-out-of-3 game. What is the best strategy for winning?
- How will you test that there is an increased probability of a user staying active after 6 months given that a user has more friends now?
- What is the probability that you roll at most a single six to win a total amount of $6X?
- How can you generate a random number between 1 – 7 with only a die?
- What are Bias and Variance?
- What is the expected value of a binomial random variable in a binomial distribution?
- How do you assess the validity of the result?
- Given draws from a normal distribution with known parameters, how can you simulate draws from a uniform distribution?
- What is the expected number of draws from a standard deck until you see an ace?
- For a sample size of N, the margin of error is 3. How many more samples do we need for the margin of error to hit 0.3?
- How do you assess the statistical significance of an insight?
- When is a nonparametric test used by a data scientist? Explain its advantages.
- What would be the hazards of letting users sneak a peek at the other bucket in an A/B test?
- Which algorithm should you use to tackle?

- Derive the expectation for a geometrically distributed random variable.
- What is the ADF test when checking time series data?
- What is the probability that those chords will intersect?
- Given a random Bernoulli trial generator, how do you return a value sampled from a normal distribution?
- You are given a coin that is calibrated to land on heads more than it would on tails. How will you ensure a fair coin toss?
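
A few of the questions above are fully specified, so their arithmetic can be checked directly. The following is a minimal Python sketch, not a set of model answers. The variable names are illustrative, the dice, coin, and deck are assumed to be fair and standard, and the coin question is treated with a two-sided normal-approximation z-test, which is only one of several reasonable approaches.

```python
from fractions import Fraction
from itertools import product
import math

# 6 marbles, 1 white, 100 draws with replacement:
# P(at least one white) = 1 - P(no white in 100 draws)
p_white_at_least_once = 1 - (5 / 6) ** 100

# Two fair dice: enumerate all 36 outcomes and count the sums
rolls = list(product(range(1, 7), repeat=2))
p_sum_4 = Fraction(sum(1 for a, b in rolls if a + b == 4), len(rolls))  # 1/12
p_sum_8 = Fraction(sum(1 for a, b in rolls if a + b == 8), len(rolls))  # 5/36

# Three fair dice: P(at least one 6) = 1 - P(no 6 on any die)
p_at_least_one_six = 1 - Fraction(5, 6) ** 3  # 91/216

# Standard 52-card deck with 4 aces: expected draw on which the first
# ace appears is (52 + 1) / (4 + 1); subtract 1 for cards drawn *before* it
expected_draw_of_first_ace = Fraction(52 + 1, 4 + 1)  # 53/5

# Three i.i.d. Uniform(0, 4) values: the median exceeds 3 when at least
# two of the three values exceed 3, each with probability 1/4
p = Fraction(1, 4)
p_median_gt_3 = 3 * p**2 * (1 - p) + p**3  # 5/32

# 560 heads in 1000 flips: z-statistic under the null of a fair coin,
# with a two-sided p-value from the normal approximation
n, heads = 1000, 560
z = (heads - n * 0.5) / math.sqrt(n * 0.25)
p_value = math.erfc(abs(z) / math.sqrt(2))

print(p_white_at_least_once, p_sum_4, p_sum_8, p_at_least_one_six)
print(expected_draw_of_first_ace, p_median_gt_3, z, p_value)
```

Exact fractions are used wherever the answer is a ratio of counts, so the results can be compared against a hand calculation without rounding noise.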

