Forecasting and Prediction are two different things. You always forecast weather, but never predict the weather. Forecasting is nothing but an extrapolation of the past. You have some historical data, and you ...
Explain p-value in simple terms
p-value in simple terms. If you are into Data Science, then you must have heard about p-value. I could have started with a very superficial definition strolling around probability, significance, the null hypothesis, etc. But that's already there ...
Confusion Matrix in Data Science, meaning and example
What is a Confusion Matrix? A Confusion Matrix is a performance measuring technique for an ML classification model. Why do we need a Confusion Matrix? Is measuring accuracy not enough? A Confusion Matrix shows the actual accuracy of your model. For example, suppose I want to ...
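As a rough illustration of why accuracy alone may not be enough, here is a minimal sketch on made-up labels. The data, the helper name `confusion_counts`, and the "predict everything as negative" model are assumptions for this example, not taken from the article.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical imbalanced data: 5 positives and 95 negatives.
actual = [1] * 5 + [0] * 95
# A lazy model that predicts "negative" for every row.
predicted = [0] * 100

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
# Recall: share of actual positives the model caught (guard against division by zero).
recall = tp / (tp + fn) if (tp + fn) else 0.0

print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("accuracy:", accuracy)
print("recall:", recall)
```

On this toy data the accuracy looks high while the model misses every positive case, which is the kind of gap the four cells of the matrix make visible.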
Data Science vs Big Data vs Data Analytics vs Business Analyst
We often come across a few terms which sound no different but are poles apart. The same goes for Data Science, Big Data, Data Analytics, and Business Analyst. So if you are confused about the role which an employer is ...
Machine Learning using SQL – Day 6/100
The article below is the intellectual property of Ashish Kohli. This is one such article which actually shows the power of SQL. Give it a read, guys. Yes, you read that one right! One ...
Affine Analytics Interview Questions | Day 17
Company - Affine Analytics
Location - Bangalore
Position - Senior Business Analyst
Experience - 3+ years
Compensation - Best in the industry
Affine Analytics Interview Questions
10 Questions, 10 Minutes – 5/100
1. What if you want to toggle case for a Python string? We have the swapcase() method from the str class to do just that.
>>> 'AyuShi'.swapcase()
'aYUsHI'
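The same call can be tried as a small script. This is only a sketch; the second sample string is made up for illustration.

```python
# swapcase() returns a new string; the original is left unchanged.
name = "AyuShi"
toggled = name.swapcase()
print(toggled)

# Illustrative extra sample (not from the article).
print("Data Monk".swapcase())

# For plain ASCII text, toggling twice gives back the original string.
print(toggled.swapcase() == name)
```

The round trip is not guaranteed for every Unicode character: for example, 'ß'.swapcase() gives 'SS', which toggles back to 'ss'.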
10 Questions, 10 Minutes – 4/100
1. How would you convert a string into an int in Python? If a string contains only numerical characters, you can convert it into an integer using the int() function.
>>> int('227') ...
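A minimal sketch of the conversion, including what happens when the string is not purely numeric. The helper name `to_int` and the sample inputs are hypothetical, added only for illustration.

```python
def to_int(text):
    """Return int(text), or None when the string is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as '12.5' or 'abc'.
        return None

print(to_int("227"))     # numeric characters only
print(to_int(" 42 "))    # surrounding whitespace is tolerated by int()
print(to_int("-7"))      # a leading sign is allowed
print(to_int("12.5"))    # not an integer literal
print(to_int("abc"))     # not numeric at all
```

Returning None is just one design choice; letting the ValueError propagate is equally reasonable when bad input should stop the program.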
10 Questions, 10 Minutes – SQL/R/Python – 3/100
1. What will the following code output?
>>> word='abcdefghij'
>>> word[:3]+word[3:]
The output is 'abcdefghij'. The first slice gives us 'abc', the next gives us 'defghij'. 2. How ...
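The slicing behaviour described above can be checked with a short sketch. The loop over several split points is an addition for illustration, not part of the original question.

```python
word = "abcdefghij"

head = word[:3]   # characters before index 3
tail = word[3:]   # characters from index 3 onwards
print(head, tail, head + tail)

# word[:i] + word[i:] rebuilds the string for any integer i,
# including negative and out-of-range values, because slices never raise IndexError.
for i in (0, 3, 10, 100, -2):
    assert word[:i] + word[i:] == word
```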
10 Questions, 10 Minutes – 2/100
This is something which has been on my mind for a long time. We will be picking 10 questions per day and would like to simplify them. We will make sure that the complete article is covered in 10 ...
10 Questions, 10 Minutes – 1/100
This is something which has been on my mind for a long time. We will be picking 10 questions per day and would like to simplify them. We will make sure that the complete article is covered in 10 ...
Top 100 Power BI Interview Questions – Part 1/2
Q1. What are the parts of Microsoft's self-service business intelligence solution? Microsoft has two parts for Self-Service BI. Excel BI Toolkit: It allows users ...
Statistics Interview Questions
Q1. What is a Sample? A. A data sample is a set of data collected and/or selected from a statistical population by a defined procedure. The elements of a sample are known as sample points, sampling units or observations. Q2. Define Population.
How much is the annual income of a beggar in Bangalore?
You can assume anything and everything under the Sun, just try to keep the assumptions close to reality. I always start with an equation, for this question the ...
Guesstimate 9 : The number of people wearing watches in Bangalore
Ans: Let’s assume the population of Bangalore is 10 million and that today is a working day for every age group. And age group wise population ...
Guesstimate 8 : How many red colour Swift cars are there in Delhi ?
Let’s start with the population of Delhi which is 2 Crores. We will divide this population into two groups- 1. Family(80%) = ...
Guesstimate 7 : How to estimate the number of ambulances on the road ?
Ans: Let’s start with the population of the country ~ 1.3 Billion (1300 million) Rural - 70% = 900 million Urban 400 million
Guesstimate 6 : How do we estimate the area of an Airport ?
Ans: Let’s assume the airport can accommodate 10 planes at once and handle 5 runways. Average length of runway = 2000m Width ...
Guesstimate 5 : How much is the surf excel detergent usage in a day in India?
Approach: India has a population of approx 1.2B people. About 20% are BPL and would therefore not use surf excel. Remaining population: 0.8*1.2B = 0.96B people.
Guesstimate 4 : Estimate the total length of roads in your city
Approach: If we take Mumbai and say Blue Dart, it will take 1 delivery truck for a region like Andheri from the regional distribution centre.
Guesstimate 3 – How many paan shops are there in India ?
Ans: We will use the method of Demand and Supply Total population of India = 1.2bn or 1200 Mn Male = 700 Mn ...
Guesstimate 1 – Number of Office chairs sold in India
Q1. Estimate the number of office chairs sold in India. Approach: To estimate the number of office chairs sold in India, ...
Guesstimate 2 – How many gmail users are there in India ?
Approach: The population of India is 1,300,000,000, i.e. 1.3B. Internet penetration in India is 30%. Assumed Population Distribution - ...
50 Statistics Questions
Q1. What is a Sample? A. A data sample is a set of data collected and/or selected from a statistical population by a defined procedure. The elements of a sample are known as sample points, sampling units or observations. Q2. Define Population.
20 Must have Data Science Questions
1. Differences between Supervised and Unsupervised Learning? Supervised learning is a type of machine learning where a function is inferred from labeled training data. The training data contains a set of training examples.
The Data Monk Booklist
Following are our books published on Amazon:
1. A complete Data Science interview with 100+ Questions
2. Learn Statistics in Python in simple language: Without in-built functions
3. Crack Your Next Data Science Interview with 300+ Questions: SQL,Statistics,Python,R,Aptitude,Project ...
Gradient Boosting in Python
I hope you are already done with the basic concepts of regression and have covered our previous post on Adaptive Boosting. If you haven't read it yet, go read it, buddy; it's easy. Remember, if you want to participate in a Hackathon or ...
Ada Boost Algorithm in Python
Gist of the Adaptive Boost Algorithm in layman's terms: if you want to improve the performance of a class, then you should concentrate on improving the average marks of the class. In order to increase the average marks ...
Complete path to master SQL before interview
We have interviewed a lot of candidates and found that SQL is still little explored by people who want to get deep into this domain. Remember: Data Science is not all about ...
10 Must have SQL questions
What are the different Analytic functions available in SQL Server? FIRST_VALUE(): Returns the first value in an ordered set of values. If the PARTITION BY clause is specified, then it returns the first value in ...
Top 5 Regression Techniques
How many regression techniques do you know? Linear Regression? Logistic Regression? These two are the building blocks of your Data Science career. There are many regression models, but we will try to cover the top 5 models which can ...
Practice Numpy
Numpy is one of the basic and most used packages in Python. It is mostly used for scientific computing. The most important data type in Numpy is the Array.
import numpy as np
And you have installed the numpy package in your ...
EDA in Python
The complete Machine Learning journey can be penned down in 4 steps:
1. Exploratory Data Analysis - This is the first thing you do when you get a dataset. Before jumping on to building models, you need to ...
Start with Python
Python is one of the most preferred languages for Data Science. It is needless to discuss the pros and cons of Python over any other language (R/SAS/Java/C). If you are already comfortable with any other language then it's good, but ...
Less asked SQL questions
We have already covered queries on joins, aggregate functions, sub-queries, running sum, etc. This article will concentrate on the theoretical parts of SQL which are asked less often but are important to know about. 1. What are the different types of statements ...
SQL advance concepts
Advanced concepts mostly involve knowledge of window functions, cleaning the data format, working with different date formats, etc. Before proceeding, you should go through the different ways of casting data types and other window functions. 1. Create the cumulative revenue ...
SQL Intermediate Questions
Let's get started with some intermediate level SQL queries. Try to frame an approach before hopping to the solution, because once you see the answer you will find it easy to understand. I think there will be around 15 ...
Basic Queries to get you started
I hope you have already read the introduction part where we discussed the execution flow of all the clauses in SQL. Today we will discuss around 15 questions which shall give you a good launching pad to ...
Introduction to SQL
Structured Query Language is the base of your Data Science career. You will always be surrounded by SQL queries and it's very important to understand the basics of the language. We will not waste time on defining terminologies associated ...
Supply Chain Analytics – Using PuLP in Python
There are three types of programming which we can do in Supply Chain:
a. Linear Programming – It involves creating a model on continuous variables
b. Integer Programming – It involves creating a model on only Discrete ...
Supply Chain Analytics
We all have a fair idea about the supply chain. In layman's terms, we can say that supply chain analytics helps in improving operational efficiency and effectiveness by providing "data-driven" decisions at the operational and ...
Data Science Terms which are often confused
Data Science = Maths + Code + Business Understanding
Many a time you come across different terminologies which sound confusing. We will try to make them easier for you to understand and to remember.
1. Data Scientist vs Data Analyst
Data Scientist ...
Linear Regression Part 3 – Evaluation of the model
Check out Part 1 and Part 2 of the series before going further:
Linear Regression Part 1 - Assumption and Basics of LR
Linear Regression Part 2 - Code and implementation of model
If you ...
Linear Regression Part 2 – Implementation of LR
We already know the assumptions of Linear Regression. We will quickly go through the implementation part of Linear Regression. We will be using R for this article (read: sometimes, I am more comfortable in R :P). Remember ...
Assumptions Of Linear Regression – Part 1
Here we will talk about the assumptions of Linear Regression which will let you understand LR and will help you tackle questions in your interview. I personally know a handful of candidates who have boasted about themselves as a ...
Guesstimate – Price of one Kilogram Potato in India ?
Let's try to guesstimate the price of one kg of potatoes in India. There could be multiple ways to do it, but your aim should be to keep the "scope of error" to the minimum.
The measure of Spread in layman terms
Data Science is a combination of Statistics and Technology. In this article, we will try to understand some basic terminologies in Layman's language. Suppose I run a chain of Pizza outlets across Bangalore and have around 500 delivery ...
100 Natural Language Processing Questions in Python
What is NLP? NLP stands for Natural Language Processing and it is a branch of data science that consists of systematic processes for analyzing, understanding, and deriving information from the text data in a smart ...
Kaggle Titanic Solution
Kaggle is a Data Science community which aims at providing Hackathons, both for practice and recruitment. You should at least try 5-10 hackathons before applying for a proper Data Science post. Here we are taking the most basic problem ...
Book List
List of the books we have on Amazon:
1. The Monk who knew Linear Regression (Python): Understand, Learn and Crack Data Science Interview
2. 100 Python Questions to crack Data Science/Analyst Interview Complete Linear Regression and ARIMA Forecasting ...