Pandas Interview Questions – Part 3

Pandas Interview Questions
This is the third part of the Pandas Interview Questions series. In the first part we touched on the basics of Pandas, and in the second we moved on to some more concepts.
The sole aim of this post is to cover the 50-60 most frequently asked Pandas questions in analytics interviews (Data Analyst, Business Analyst, Business Intelligence Engineer, Data Scientist, Product Analyst).

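21. If data is an ndarray, index must be the same length as data. True or False?
True. When a Series is created from an ndarray and an index is passed, the index must have the same length as the data; otherwise pandas raises a ValueError. If no index is passed, pandas creates a default one with the values 0 to len(data) - 1.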
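
A quick way to convince yourself is to try both cases. The sketch below uses made-up data and labels and only shows the length rule in action:

```python
import numpy as np
import pandas as pd

data = np.array([10, 20, 30])

# Index length matches the data length: this works.
ser = pd.Series(data, index=["a", "b", "c"])

# Index length differs from the data length: pandas raises ValueError.
try:
    pd.Series(data, index=["a", "b"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# With no index given, pandas builds a default 0..n-1 index.
default_ser = pd.Series(data)
print(list(default_ser.index))
```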

22. What is Pandas Index?
A Pandas Index is the object that holds the axis labels of a Series or DataFrame (the row labels and the column names). It identifies each row or column, keeps the data organized, and provides fast label-based access, selection, and alignment.
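
As a small illustration, here is a sketch with a made-up DataFrame (the city and sales values are hypothetical) showing where the index lives and how label-based lookup uses it:

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Pune", "Delhi", "Mumbai"], "sales": [120, 340, 560]},
    index=["x", "y", "z"],
)

print(df.index)    # row labels
print(df.columns)  # column labels (also an Index object)

# Label-based lookup goes through the index.
delhi_sales = df.loc["y", "sales"]

# set_index promotes a column to be the row index.
by_city = df.set_index("city")
mumbai_sales = by_city.loc["Mumbai", "sales"]
```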

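23. What is Multiple Indexing?
Multiple indexing (also called hierarchical indexing, implemented by the MultiIndex class) means having two or more index levels on a single axis. It is very useful for data analysis and manipulation because it lets us store and work with higher-dimensional data in a one-dimensional Series or a two-dimensional DataFrame.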
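
To make this concrete, the sketch below builds a two-level index over hypothetical sales figures keyed by (year, quarter); the numbers and level names are made up for illustration:

```python
import pandas as pd

# Hypothetical sales figures keyed by (year, quarter).
index = pd.MultiIndex.from_tuples(
    [(2020, "Q1"), (2020, "Q2"), (2021, "Q1"), (2021, "Q2")],
    names=["year", "quarter"],
)
sales = pd.Series([100, 150, 120, 180], index=index)

# Select everything under the outer level 2021.
sales_2021 = sales.loc[2021]

# Select a single value with a full (year, quarter) key.
q2_2020 = sales.loc[(2020, "Q2")]

# unstack pivots the inner level into columns, giving a 2-D table.
table = sales.unstack("quarter")
print(table)
```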

24. How to extract items at given positions from a series?

import pandas as pd
ser = pd.Series(list('abcdefghijklmnopqrstuvwxyz'))
pos = [0, 4, 8, 14, 20]

# Solution: take() selects by integer position (ser.iloc[pos] is equivalent)
print(ser.take(pos))

25. How will you create an empty data frame in pandas?
A DataFrame is a widely used pandas data structure: a two-dimensional array with labeled axes (rows and columns). Calling the constructor with no arguments creates an empty one.

import pandas as pd
info = pd.DataFrame()
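
If you want to check that a frame really is empty, or create one that already has column names, something like the following sketch works (the column names are just examples):

```python
import pandas as pd

empty = pd.DataFrame()
print(empty.empty)   # True
print(empty.shape)   # (0, 0)

# Column names can be fixed up front, still with zero rows.
with_cols = pd.DataFrame(columns=["name", "age"])
print(with_cols.shape)
```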


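26. How will you add a new column to a Pandas DataFrame?

We can add a new column to an existing DataFrame by assigning to a new column name, either a Series or an expression built from existing columns:

import pandas as pd
info = {'one': pd.Series([2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e']),
        'two': pd.Series([1, 2, 3, 4, 5, 6], index=['a', 'b', 'c', 'd', 'e', 'f'])}
info = pd.DataFrame(info)

# The column names 'three' and 'four' and their values are illustrative
print("Add new column by passing series")
info['three'] = pd.Series([20, 40, 60], index=['a', 'b', 'c'])
print(info)

print("Add new column using existing DataFrame columns")
info['four'] = info['one'] + info['three']
print(info)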

27. What is the query function in pandas?
We sometimes need to filter a DataFrame based on a condition or apply a mask to get certain values. The query function lets us write that condition as a string expression.

Let’s first create a simple DataFrame:

import numpy as np
import pandas as pd
value_1 = np.random.randint(10, size=10)
value_2 = np.random.randint(10, size=10)

years = np.arange(2010, 2020)
groups = ['A', 'G', 'B', 'K', 'B', 'B', 'C', 'A', 'C', 'C']

df = pd.DataFrame({'group': groups, 'year': years, 'value_1': value_1, 'value_2': value_2})

The query function is very simple to use: we pass the condition as a string that refers to columns by name, and it returns the rows where the condition holds.
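
Here is a sketch of a few typical queries. It rebuilds the same kind of DataFrame but with fixed values instead of random ones (an assumption made only so the results are reproducible):

```python
import numpy as np
import pandas as pd

# Fixed values (instead of random ones) so the result is reproducible.
df = pd.DataFrame({
    "group": ["A", "G", "B", "K", "B", "B", "C", "A", "C", "C"],
    "year": np.arange(2010, 2020),
    "value_1": [5, 2, 7, 1, 9, 3, 4, 8, 6, 0],
    "value_2": [3, 6, 7, 4, 2, 8, 1, 9, 5, 5],
})

# Rows where value_1 is smaller than value_2.
smaller = df.query("value_1 < value_2")

# Conditions can be combined with `and` / `or`.
recent_b = df.query("group == 'B' and year >= 2014")

# Local Python variables are referenced with an @ prefix.
cutoff = 2016
late = df.query("year > @cutoff")

print(smaller)
```

The first query is equivalent to the boolean mask df[df["value_1"] < df["value_2"]]; query is mostly a readability choice when conditions get long.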


28. How will you delete rows from a pandas DataFrame?
To delete rows from a DataFrame, we can use the drop() method and pass it the index labels of the rows to remove.

Code:

import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 'year': [2012, 2012, 2013, 2014, 2014], 'reports': [4, 24, 31, 2, 3]}

df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])

# drop() returns a new DataFrame without these rows; df itself is unchanged
df.drop(['Cochice', 'Pima'])
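
A few variations come up often in interviews. The sketch below uses a trimmed-down version of the same data and shows that drop() does not modify the original, plus one common way to delete rows by condition:

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Jason", "Molly", "Tina"], "reports": [4, 24, 31]},
    index=["Cochice", "Pima", "Santa Cruz"],
)

# drop returns a new DataFrame; the original keeps all its rows.
trimmed = df.drop(["Cochice", "Pima"])

# The index= keyword names the axis explicitly.
df_small = df.drop(index="Pima")

# To delete rows by condition, keep the rows that do NOT match it.
few_reports_removed = df[df["reports"] >= 10]

# errors="ignore" skips labels that are not present instead of raising KeyError.
unchanged = df.drop(["Yuma"], errors="ignore")

print(len(df), len(trimmed))
```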

29. How will you get the number of rows and columns of a DataFrame in pandas?
We can use the shape attribute (it is an attribute, not a method, so there are no parentheses) to find the number of rows and columns in a DataFrame. It returns a (rows, columns) tuple.

import pandas as pd

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'red', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data, columns=['name', 'age', 'favorite_color', 'grade'])
# get the row and column count of the df
print(df.shape)  # (4, 4)

30. Why do we use the insert function in pandas?
Whenever we add a column to a DataFrame with plain assignment, it is added as the last column by default. Pandas also lets us add a column at any position by using the insert function.

We need to specify the position where we want to insert it. Let’s suppose we want to insert the column at position 2 (positions are zero-based, so it becomes the third column).

# df must have 10 rows here, e.g. the DataFrame built in question 27
new_column = np.random.randn(10)
# insert the new column at position 2 (modifies df in place)
df.insert(2, 'new_column', new_column)
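
For a self-contained version you can run on its own, here is a sketch with a small made-up DataFrame. It also shows that insert refuses to add a column name that already exists:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "group": ["A", "B", "C"],
    "year": [2010, 2011, 2012],
    "value_1": [5, 2, 7],
})

new_column = np.array([0.1, 0.2, 0.3])

# insert modifies df in place and returns None.
df.insert(2, "new_column", new_column)
print(list(df.columns))

# Inserting a name that already exists raises ValueError by default.
try:
    df.insert(0, "year", [1, 2, 3])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```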

Keep Learning!!


