Pandas Interview Questions – Part 3

Pandas Interview Questions
This is the third part of the Pandas Interview Questions series. The first part touched on the basics of Pandas, and the second moved on to some more concepts.
The sole aim of this post is to cover the 50-60 most frequently asked Pandas questions in analytics interviews (Data Analyst, Business Analyst, Business Intelligence Engineer, Data Scientist, Product Analyst).


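21. If data is an ndarray, index must be the same length as data. True or False?
True. When a Series is built from an ndarray and an index is passed, the index must have the same length as the data; otherwise pandas raises a ValueError. If no index is passed, pandas creates a default one with the values 0 to len(data) - 1.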
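
A minimal sketch of this rule, using made-up values (the array and the labels are illustrative only):

```python
import numpy as np
import pandas as pd

data = np.array([10, 20, 30])

# Index length matches the data length: this works.
ser = pd.Series(data, index=["a", "b", "c"])

# Index length differs from the data length: pandas raises ValueError.
try:
    pd.Series(data, index=["a", "b"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# No index passed: a default index 0..len(data)-1 is created.
default_ser = pd.Series(data)
print(mismatch_raised, list(default_ser.index))
```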

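22. What is Pandas Index?
A Pandas Index is the immutable sequence of labels attached to an axis of a Series or DataFrame (the row labels and the column labels). It is what we use to select particular rows and columns, and its job is to organize the data, align it during operations, and provide fast access to it.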
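
As a small illustration, assuming a hypothetical DataFrame of city sales, the sketch below reads the row and column Index objects, does a label-based lookup, and shows that an Index cannot be modified in place:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {"city": ["Pune", "Delhi", "Mumbai"], "sales": [120, 340, 560]}
).set_index("city")

row_labels = df.index      # Index holding the row labels
col_labels = df.columns    # Index holding the column labels

# Label-based lookup goes through the index.
delhi_sales = df.loc["Delhi", "sales"]

# Index objects are immutable: item assignment raises TypeError.
try:
    df.index[0] = "Chennai"
    index_is_mutable = True
except TypeError:
    index_is_mutable = False

print(list(row_labels), list(col_labels), delhi_sales, index_is_mutable)
```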

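23. What is Multiple Indexing?
Multiple Indexing (also called hierarchical indexing, implemented by the MultiIndex class) lets a single axis carry two or more levels of labels. It is very useful for data analysis and manipulation, especially for working with high-dimensional data inside a one-dimensional Series or a two-dimensional DataFrame.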
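
A short sketch with a two-level index; the group and year labels and the values are invented for the example:

```python
import pandas as pd

# Two index levels: group and year (illustrative values).
index = pd.MultiIndex.from_tuples(
    [("A", 2019), ("A", 2020), ("B", 2019), ("B", 2020)],
    names=["group", "year"],
)
sales = pd.Series([10, 20, 30, 40], index=index)

# Selecting on the outer level returns a Series indexed by year.
group_a = sales.loc["A"]

# A full tuple selects a single value.
one_value = sales.loc[("B", 2020)]

# unstack() pivots the inner level into columns.
wide = sales.unstack("year")
print(wide)
```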

24. How to extract items at given positions from a series?

import pandas as pd
ser = pd.Series(list('abcdefghijklmnopqrstuvwxyz'))
pos = [0, 4, 8, 14, 20]

# Solution
ser.take(pos)
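
For comparison, positional selection with iloc should give the same result as take() here. A quick sketch on the same series:

```python
import pandas as pd

ser = pd.Series(list("abcdefghijklmnopqrstuvwxyz"))
pos = [0, 4, 8, 14, 20]

by_take = ser.take(pos)   # positional selection with take()
by_iloc = ser.iloc[pos]   # positional selection with iloc

print(by_take.tolist())
```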

25. How will you create an empty data frame in pandas?
A DataFrame is a widely used pandas data structure: a two-dimensional array with labeled axes (rows and columns). Calling pd.DataFrame() with no arguments creates an empty one.

import pandas as pd
info = pd.DataFrame()

print(info)
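
To confirm that a DataFrame is empty, the empty and shape attributes can be checked. The sketch below also creates an empty DataFrame that already has column names (the names are hypothetical):

```python
import pandas as pd

info = pd.DataFrame()
is_empty = info.empty      # True when there are no rows or no columns
dims = info.shape          # (rows, columns)

# An empty DataFrame can still declare its columns up front.
with_columns = pd.DataFrame(columns=["name", "age"])

print(is_empty, dims, with_columns.shape)
```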

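26. How will you add a new column to the Pandas DataFrame?

We can add a new column to an existing DataFrame by assigning to a new column label, using either a Series or an expression built from existing columns: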

import pandas as pd

info = {'one': pd.Series([2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e']),
        'two': pd.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f'])}
info = pd.DataFrame(info)

print("Add new column by passing series")
# Values are aligned on the index, so rows 'd', 'e' and 'f' get NaN
info['three'] = pd.Series([20, 40, 60], index=['a', 'b', 'c'])
print(info)

print("Add new column using existing DataFrame columns")
info['four'] = info['one'] + info['three']
print(info)
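
One detail worth knowing for interviews: when a Series is assigned as a new column, pandas aligns it on the index rather than on position. A smaller sketch with made-up values shows the effect:

```python
import pandas as pd

info = pd.DataFrame({"one": [2, 3, 4]}, index=["a", "b", "c"])

# The Series has no label 'c', so that row gets NaN in the new column.
info["two"] = pd.Series([20, 40], index=["a", "b"])

# A column derived from an existing column is computed row by row.
info["three"] = info["one"] * 10

print(info)
```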

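27. What is the query function in pandas?
We sometimes need to filter a DataFrame based on a condition or apply a mask to get certain values. The query() function does this by taking the condition as a string that refers to columns by name, and it returns the rows for which the condition is true.

Let's first create a simple DataFrame: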

import numpy as np
import pandas as pd
value_1 = np.random.randint(10, size=10)
value_2 = np.random.randint(10, size=10)

years = np.arange(2010,2020)
groups = ['A', 'G', 'B', 'K', 'B', 'B', 'C', 'A', 'C', 'C']

df = pd.DataFrame({'group': groups, 'year': years, 'value_1': value_1, 'value_2': value_2})

print(df)

The query function is very simple to use. We only need to write the condition as a string inside query().

df.query('value_1 < value_2')
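
Because the DataFrame above uses random values, its output changes on every run. The sketch below uses fixed, made-up values so the result is predictable, and compares query() with the equivalent boolean mask:

```python
import pandas as pd

# Fixed illustrative values instead of random ones.
df = pd.DataFrame({
    "group": ["A", "G", "B", "K"],
    "year": [2010, 2011, 2012, 2013],
    "value_1": [1, 7, 3, 9],
    "value_2": [5, 2, 8, 9],
})

# Filter with a string expression.
by_query = df.query("value_1 < value_2")

# The same filter written as a boolean mask.
by_mask = df[df["value_1"] < df["value_2"]]

# Conditions can be combined with 'and' / 'or'.
recent = df.query("value_1 < value_2 and year >= 2012")

print(by_query)
```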

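28. How will you delete rows from a pandas data frame?
To delete rows from a DataFrame, we can use the drop() method and pass it the index labels of the rows to remove. drop() returns a new DataFrame and leaves the original unchanged unless the result is assigned back or inplace=True is passed.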

Code :

import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': [4, 24, 31, 2, 3]}

df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

df.drop(['Cochice', 'Pima'])
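
A few related variations, sketched on a shortened version of the same data: drop() leaves the original DataFrame untouched, rows can also be removed by filtering on a condition, and the columns argument drops columns instead of rows.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Jason", "Molly", "Tina"], "reports": [4, 24, 31]},
    index=["Cochice", "Pima", "Santa Cruz"],
)

# drop() returns a new DataFrame; df itself still has 3 rows.
trimmed = df.drop(["Cochice", "Pima"])

# Removing rows by condition is usually done with a boolean filter.
at_least_ten = df[df["reports"] >= 10]

# The columns argument drops columns rather than rows.
without_reports = df.drop(columns=["reports"])

print(trimmed)
```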

29. How will you get the number of rows and columns of a Dataframe in pandas?
We can use the shape attribute to find the number of rows and columns in a data frame. It is an attribute, not a method, so it is written without parentheses and returns a (rows, columns) tuple.

import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'red', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data, columns=['name', 'age', 'favorite_color', 'grade'])

# get the row and column count of the df
# shape is an attribute, so no parentheses; here it is (4, 4)
print(df.shape)
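
A small sketch on a reduced, illustrative DataFrame: the tuple can be unpacked directly, len() gives the same counts, and calling shape with parentheses fails with a TypeError because a tuple is not callable.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Willard Morris", "Al Jennings"],
    "age": [20, 19],
    "grade": [88, 92],
})

# Unpack the (rows, columns) tuple.
n_rows, n_cols = df.shape

# Equivalent counts using len().
row_count = len(df)
col_count = len(df.columns)

# Calling shape like a method raises TypeError.
try:
    df.shape()
    call_failed = False
except TypeError:
    call_failed = True

print(n_rows, n_cols, call_failed)
```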

30. Why do we use the insert function in pandas?
By default, a new column is added at the end of a data frame. Pandas also lets us add a column at any position by using the insert() function.

We need to specify the position where we want to insert it, along with the column name and the values. Let's suppose we want to insert the column at position 2 (positions are zero-based, so it becomes the third column).

# df here is the 10-row DataFrame created in question 27
new_column = np.random.randn(10)
# insert the new column at position 2
df.insert(2, 'new_column', new_column)
print(df)
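
A self-contained sketch with fixed, made-up values shows where the column lands. Note that insert() modifies the DataFrame in place, and that inserting a column name that already exists raises a ValueError by default:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "B", "C"],
    "year": [2010, 2011, 2012],
    "value_1": [1, 2, 3],
})

new_column = np.array([0.5, 1.5, 2.5])

# Position 2 is zero-based, so the new column becomes the third column.
df.insert(2, "new_column", new_column)

# Inserting an existing column name raises ValueError by default.
try:
    df.insert(0, "year", [0, 0, 0])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(list(df.columns), duplicate_rejected)
```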

Keep Learning !!

Thanks,

Author: TheDataMonk

I am the Co-Founder of The Data Monk. I have a total of 6+ years of analytics experience: 3+ years at Mu Sigma, 2 years at OYO, and 1 year and counting at The Data Monk. I am an active trader and a logically sarcastic idiot :)