from pulp import *
Here you are importing the complete package
model = LpProblem("Maximize Pizza Profit", LpMaximize)
Here you define the model using the LpProblem function. LpMaximize tells the solver to maximize the objective, i.e. the profit. If you want the minimum value from the model instead, use LpMinimize, for example when the goal is to reduce wastage.
A = LpVariable('A', lowBound=0, upBound=None, cat='Integer')
Here we define each variable using the LpVariable function. lowBound is the lowest possible value of the variable. The number of pizzas cannot be negative, so we have given it the value 0. upBound is the maximum value of the variable, and None means there is no upper limit. cat is the category of the variable. In PuLP it can be 'Continuous', 'Integer', or 'Binary'.
model += 1*A + 0.5*B + 1*C <= 30
This is the constraint for the oven. A requires 1 day, B requires 0.5 day, and C requires 1 day of oven time. The limit is 30 because there is one oven, which works for 30 days.
model += 1*A + 2*B + 2*C <= 90
Similar to the above, the bakers need 1, 2, and 2 days for A, B, and C respectively. There are 3 bakers who each work 30 days, so the limit is 30*3 = 90.
model += 1*A + 1*B + 1*C <= 40
A packer takes 1 day each for pizzas A, B, and C. There are 2 packers who work 20 days each, so the limit is 2*20 = 40.
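The three constraints above can be sanity-checked without a solver. The sketch below is a plain brute-force search, not PuLP, and the profit per pizza is a made-up assumption because the objective function is not shown in the text. It only illustrates how the oven, baker, and packer limits restrict the possible plans.

```python
# Brute-force check of the three resource constraints (a cross-check, not PuLP).
# The profit per pizza is NOT given in the text; these values are hypothetical.
PROFIT = {"A": 30, "B": 40, "C": 50}


def is_feasible(a, b, c):
    """Return True when a production plan respects all three constraints."""
    oven_ok = 1 * a + 0.5 * b + 1 * c <= 30   # one oven working 30 days
    baker_ok = 1 * a + 2 * b + 2 * c <= 90    # 3 bakers x 30 days
    packer_ok = 1 * a + 1 * b + 1 * c <= 40   # 2 packers x 20 days
    return oven_ok and baker_ok and packer_ok


best_plan, best_profit = None, -1
for a in range(0, 31):            # the oven alone caps A at 30
    for b in range(0, 41):        # the packers alone cap B at 40
        for c in range(0, 31):    # the oven alone caps C at 30
            if is_feasible(a, b, c):
                profit = PROFIT["A"] * a + PROFIT["B"] * b + PROFIT["C"] * c
                if profit > best_profit:
                    best_plan, best_profit = (a, b, c), profit

print("best plan (A, B, C):", best_plan, "profit:", best_profit)
```

A search like this is only practical because the problem is tiny. For anything bigger, a solver such as PuLP is the right tool.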
The word “Supervised” means monitoring. A supervised learning algorithm is one in which you train a model on data whose output is already known, and the model then takes new inputs and predicts the outcome. Confusing?
Let’s try an example. You own a restaurant and you have collected various information about the customers like Name, Status, Job, Salary, Address, Home town, Food item they ordered, etc. Now you want to make a recommendation engine where a new customer’s data is used to give that customer a free dish. You took the data of all the customers and fed it into your model. Now this model knows that if a person is from Punjab (a state in India) and is 26 years old, then there is a high chance of him ordering Paratha (sorry if I am stereotyping :P)
So, you already have the historic data and, most importantly, you know the output for each row of data. Using this historic data you created a model which learns and makes recommendations in real time. This whole process is based on the fact that “the model creates a set of rules which enables it to understand the nature of the data, and it can then use this set of rules for further prediction”
Interestingly most of the work you will do in your Data Science job will revolve around Supervised Learning.
Supervised learning is where you have input variables (X) and an output variable (Y) and you use an algorithm to learn the mapping function from the input to the output.
Y = f(X)
The goal is to approximate the mapping function so well that when you have new input data (X) you can predict the output variable (Y) for that data.
It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. We know the correct answers, the algorithm iteratively makes predictions on the training data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of performance
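To make Y = f(X) concrete, here is a minimal sketch that learns a straight-line mapping from a handful of labelled examples using ordinary least squares. The data is made up for illustration, and a straight line is just one possible choice of f.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training data where the "correct answers" are known (made-up values).
train_x = [1, 2, 3, 4, 5]
train_y = [3, 5, 7, 9, 11]

slope, intercept = fit_line(train_x, train_y)


def predict(x):
    """Apply the learned mapping f to a new input."""
    return slope * x + intercept


print("prediction for x = 6:", predict(6))
```

The training step is `fit_line`, and `predict` is the learned function applied to input the model has not seen before.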
The most important Supervised Learning algorithms are:-
1. Support Vector Machines
2. Linear Regression
3. Logistic Regression
4. Naive Bayes
5. Linear Discriminant Analysis (LDA)
6. Decision Tree
7. K-Nearest Neighbor
8. Neural Network
9. Similarity Learning
You will learn about each of these algorithms one by one, but first let’s look into the process involved in building these models
Step 1 – Gather your data
Step 2 – Clean the data. It will occupy a lot of your time
Step 3 – Feature Engineering. You might need to create or derive new features from the already present data set. The input object is transformed into a feature vector, which contains a number of features that are descriptive of the object.
Step 4 – Determine which algorithm you want to implement on your data set
Step 5 – Run the model on the training data set. Some supervised learning algorithms require the user to determine certain control parameters. These parameters may be adjusted by optimizing performance on a subset (called a validation set) of the training set, or via cross-validation.
Step 6 – Evaluate the performance or accuracy of the model. If everything is fine, then run the model on the test dataset
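Steps 5 and 6 rely on keeping separate training, validation, and test data. The sketch below shows one simple way to split a data set. The function name and the 60/20/20 proportions are assumptions made for illustration, not something fixed by the process above.

```python
import random


def train_val_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle the rows and cut them into train, validation, and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test_rows = shuffled[:n_test]
    val_rows = shuffled[n_test:n_test + n_val]
    train_rows = shuffled[n_test + n_val:]
    return train_rows, val_rows, test_rows


rows = list(range(100))   # stand-in for 100 cleaned data rows
train_rows, val_rows, test_rows = train_val_test_split(rows)
print(len(train_rows), len(val_rows), len(test_rows))
```

The validation part is used to tune the control parameters, and the test part is touched only once at the end.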
Above we saw the list of Supervised Learning algorithms. Supervised Learning problems can further be divided into two categories:-
a. Classification – A classification problem is one where the output variable is a categorical variable. If you are predicting different diseases on the basis of symptoms, then that will fall under Classification
b. Regression – Regression is used when you need to predict continuous values like the number of customers coming to a restaurant, the number of visitors on a website, etc.
Some of the applications of Supervised Learning:-
1. Use a predictive algorithm to find out how many marks each student will get
2. Use Logistic Regression to find out which customer will encash his insurance policy
3. Predicting the prices of houses
4. Weather forecasting
5. Classification of emails (spam and non-spam)
6. In supervised learning for image processing, for example, an AI system might be provided with labeled pictures of vehicles in categories such as cars and trucks. After a sufficient amount of observation, the system should be able to distinguish between and categorize unlabeled images, at which time training can be said to be complete.
Supervised Learning is like learning from a teacher. The teacher will teach you the ways to answer questions and will evaluate your learning. You can expect the same types of questions to appear in the examination, i.e. your testing condition, and you answer according to your understanding. Your marks are your accuracy.
We will use Python to train our Supervised Learning algorithm in the next few Days.
Big Data, Machine Learning, Artificial Intelligence, etc. If you follow the news regularly, then you must have heard a lot about these words. Let’s try to understand things with examples
What is Big Data? In the 1990s the size of data used to be small and people used to store only relevant data points. With the World Wide Web (WWW) boom, data became omnipresent. There was a way to store a good amount of data in Excel files and other applications. But the major change happened with the advancement in mobile technologies. Smartphones came up with a lot of data. Every application and website is storing a plethora of data ranging from your personal to professional information. Almost all the clicks you make on the internet are being stored somewhere in the world. When you are working with a lot of data, then that data is termed as Big Data.
So, it’s not like Big Data is a new concept. It’s just that the size of data increased multiple times, and in order to store this data we needed new tools and technologies. This complete ecosystem is called Big Data.
Now, what is Machine Learning? Machine Learning is a way to train a machine to start learning from the user’s behavior and then provide useful information or take actions accordingly. You can see Machine Learning examples around you.
1. You click an advertisement on Google and the next day you get similar ads. This is because your interest was tagged in this brief span of time and now you are bombarded with the advertisements.
2. Ever heard of driverless cars? Can you even imagine the rate at which the back-end algorithms need to work in order to identify an object and take action accordingly? The margin for error is almost zero because we are talking about real life. This is where image recognition and several different algorithms come into the picture.
3. Machine Learning is learning from data; Artificial Intelligence, on the other hand, is a buzzword. There are so many problems which you can solve using machine learning. You will understand the capabilities of this domain in the coming Days
4. Ten years back, software engineers used to work on these predictive models, clustering and classifying data, etc. But as the amount of data started increasing, handling the data and getting insights from it became difficult. This gave rise to new job opportunities which go by the names of Data Scientist, Data Analyst, Decision Scientist, Big Data Analyst, etc. So, this thing is not new, it just got scaled up
5. Most of the hard work for machine learning is data transformation. From reading the hype about new machine learning techniques, you might think that machine learning is mostly about selecting and tuning algorithms. The reality is more prosaic: most of your time and effort goes into data cleansing and feature engineering — that is, transforming raw features into features that better represent the signal in your data.
6. AI is not going to become self-aware, rise up, and destroy humanity. A surprising number of people seem to be getting their ideas about artificial intelligence from science fiction movies. We should be inspired by science fiction, but not so credulous that we mistake it for reality. There are enough real and present dangers to worry about, from consciously evil human beings to unconsciously biased machine learning models. So you can stop worrying about Skynet and superintelligence.
7. ML is a computer science discipline that consists in making computers “learn” from data rather than programming instructions. For example, imagine you had to implement a gender (male vs female) recognition software. If you had to implement this in the traditional way, you would need to extract features that would help you decide. Then, you would write a lot of code to instruct the computer how to use these features. Unfortunately, this approach is tedious and not robust enough. On the other hand, the ML approach consists in collecting lots of images and labeling them. Then, running an ML algorithm that will learn the task by observing the data. By the way, this approach is called supervised learning.
8. ML is an evolving and exciting field. Many jobs exist and many more will. It is the modern form of literacy in our technological and data-driven society. Learn about it as much as you can.
9. You can very well make a career in Machine Learning and Data Science. You just have to practice playing with data and understanding the data. In my personal opinion, Machine Learning is here to stay, so it’s better if you take some time to understand it
Go through all the overview Days in this challenge, pick up a few and then gain expertise in a couple of these.
There are various statistical tests in Data Analysis. Following are the tests and their use:-
1. Correlation -> This test looks at the association between variables
2. Pearson Correlation -> It tests the strength of the linear association between two continuous variables
3. Spearman Correlation -> It tests the strength of the association between two ordinal (ranked) variables
4. Chi-Square -> It tests for an association between two categorical variables
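The difference between Pearson and Spearman correlation is easiest to see on a small example. The sketch below computes both by hand: Spearman is simply Pearson applied to the ranks. The data is made up, and the rank helper assumes there are no tied values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank 1 for the smallest value. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


hours = [1, 2, 3, 4, 5]          # made-up study hours
marks = [10, 20, 30, 40, 80]     # always increasing, but not on a straight line

print("Pearson :", round(pearson(hours, marks), 3))
print("Spearman:", round(spearman(hours, marks), 3))
```

Because the marks always rise with the hours, the rank-based Spearman value is a perfect 1, while Pearson is lower because the points do not sit on a straight line.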
Comparison of Means – The tests below look for a difference between the means of variables
1. Paired T-test -> Tests for a difference between two related variables, e.g. the same subjects measured twice
2. Independent T-test -> Tests for a difference between two independent variables, i.e. two unrelated groups
3. ANOVA -> It stands for Analysis of Variance. It is a statistical method used to test differences between two or more means.
Regression – It assesses whether a change in one variable predicts a change in another variable
1. Simple Regression – Tests how a change in the predictor variable predicts the level of change in the outcome variable
2. Multiple Regression – Tests how a change in the combination of two or more predictor variables predicts the level of change in the outcome variable
Non-Parametric – These tests are used when the data does not meet assumptions required for parametric tests
1. Wilcoxon rank-sum test -> Tests for difference between two independent variables, takes into account magnitude and direction of difference
2. Wilcoxon signed-rank test -> Tests for difference between two related variables, takes into account magnitude and direction of difference
3. Sign test -> Tests if two related variables are different – ignores magnitude of change, only takes into account direction
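The sign test is simple enough to compute by hand, which makes it a good way to see what "only the direction counts" means. The sketch below is a two-sided sign test under the usual assumption that, if nothing changed, a positive and a negative difference are equally likely. The before and after scores are made up.

```python
from math import comb


def sign_test(before, after):
    """Two-sided sign test on paired observations.

    Returns (number of positive differences, number of non-tied pairs, p-value).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]   # ties are dropped
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, n - positives)
    # Probability of a split at least this lopsided when each sign has p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return positives, n, p_value


before = [70, 65, 80, 75, 60, 72, 68, 77]   # made-up scores before training
after = [74, 69, 79, 80, 66, 75, 70, 81]    # made-up scores after training

positives, n, p_value = sign_test(before, after)
print(positives, "of", n, "pairs improved, p-value =", round(p_value, 4))
```

Note that the size of each improvement never enters the calculation, only its sign.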
We will have a separate blog post for each of these tests. Till then, Google.
This is a cheat sheet which aims to give all the important concepts in a very crisp manner. Just give it a read before starting a new project in R or interviewing for a Data Science, Data Analyst, or Business Analyst post. There are various advantages and disadvantages of using R over Python, but we will not dig deep into them. This is a cheat sheet, so if you need more help, there is this awesome website www.google.com
We will start directly with Data Types which are the building blocks of a programming language.
There are 6 object types supported in R:- 1. Vectors 2. Lists 3. Matrices 4. Arrays 5. Factors 6. Data Frames
There are 6 data types of these objects:-
1. Logical – TRUE, FALSE
2. Numeric – 56.4, 45.3
3. Integer – 1L, 2L, 3L, 4L
4. Complex – 6+2i
5. Character – "the", "data", "monk"
6. Raw – raw bytes, e.g. charToRaw("monk")
Let’s briefly look into each Object types:- 1. Vector
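To create a vector with more than one element you have to use c(), e.g. x <- c(1, 2, 3). All elements of a vector share one data type, so if you mix data types inside c(), R converts them to a common type, e.g. c(1, "a") becomes a character vector
The only thing worth mentioning here is that a negative index excludes the element at that position instead of selecting it, e.g. x[-1] returns every element except the first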
2. List Vector was the simplest object in R, next in the list is a List 😛
Let’s create a list which will include a list, a vector, and a matrix. If you don’t know much about matrix, just remember it’s a 2-dimensional object which is defined as x<-matrix(c(1,2,3,4,5,6),nrow=2) to create a matrix with 2 rows.
Give names to the elements of list. See the example below
Merge two lists
Converting a list to a vector i.e. unlisting a list
Basically all you need to know about lists is:-
1. How to create a list (x <- list())
2. What all can a list include? (Anything ranging from a vector to arrays, matrix, etc.)
3. Giving a name to each element of the list (use the function names())
4. Accessing elements of the list (use double square brackets, e.g. x[[1]])
Matrix A matrix is a 2-dimensional object, you need to specify the number of rows and columns, and dim names while declaring a matrix. Here dim names are the names given to the rows and columns 😛
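x <- matrix(c(1,2,3,4,56,23), ncol=2, nrow=3, dimnames=list(row_names, column_names))
Here row_names and column_names stand for character vectors holding the row and column names. I think you must have got the gist of a matrix. You can definitely create two matrices and apply arithmetic operations like addition, subtraction, etc. Matrix multiplication is also simple: use the %*% operator, because a plain * multiplies element by element. See the examples below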
Accessing elements in a matrix
Arrays Arrays are able to store data in more than 2 dimensions. A vector is one-dimensional, a matrix is 2-dimensional, and an array can have more than 2 dimensions. God knows where this programming language is going 😛
You already know how to add x-labels, y-labels, title, etc.
Go ahead and add these in the graph above
Box and Whisker Plot
A box and whisker plot, or boxplot for short, is generally used to summarize the distribution of a data sample. The x-axis is used to represent the data sample, where multiple boxplots can be drawn side by side on the x-axis if desired.
Box plot is one of the most common types of graphic. It gives a nice summary of one or more numeric variables. The line that divides the box into two parts is the median of the numbers. The ends of the box represent the 1st and 3rd quartiles.
import random
import matplotlib.pyplot as plt
random.seed(123)
a = random.sample(range(1,100),20)
b = random.sample(range(1,100),20)
c = random.sample(range(1,100),20)
d = random.sample(range(1,100),20)
list_Ex = [a,b,c,d]
plt.boxplot(list_Ex)
plt.show()
Graph 17 – A basic Box-Whisker graph
Now we will try to make the graph look better by adding
color to the plot. The box-plot shows median, 25th and 75th
percentile, and outliers. You should try to give different color to these
points to make the plot more appealing.
When you plot a boxplot, you can use the following 5 attributes of the plot:-
a. box – To modify the color, line width, etc. of the central box
b. whisker – To modify the color and line width of the line which connects the box to the cap
c. cap – The horizontal line at the end of each whisker
d. median – The line inside the box which marks the median
e. flier – The points beyond the caps, i.e. the outliers
The box spans from the 1st quartile (Q1) to the 3rd quartile (Q3), and its height is the IQR, i.e. the Inter Quartile Range (Q3 – Q1). The lower fence is at Q1 – 1.5*IQR and the upper fence is at Q3 + 1.5*IQR. Any point which falls below the lower fence or above the upper fence is called a flier or outlier
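The fence rule can be worked out by hand on a small sample. The sketch below uses the standard library to get the quartiles. The data is made up, and the "inclusive" quartile method is an assumption, since different tools compute quartiles in slightly different ways and may give slightly different fences.

```python
import statistics


def fences(values):
    """Return (lower fence, upper fence) using the 1.5 * IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


data = [12, 15, 17, 19, 20, 22, 23, 25, 28, 95]   # made-up sample with one extreme value

low, high = fences(data)
outliers = [v for v in data if v < low or v > high]

print("lower fence:", low, "upper fence:", high)
print("fliers:", outliers)
```

Only the value 95 falls outside the fences here, so it is the one point a boxplot of this sample would draw as a flier.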
Following is the code with some fancy colors to help you understand each term individually.
bp = plt.boxplot(list_Ex, patch_artist=True)
for box in bp['boxes']:
    box.set(color='orange', linewidth=2)
for whisker in bp['whiskers']:
    whisker.set(color='red', linewidth=2)
for cap in bp['caps']:
    cap.set(color='green', linewidth=2)
for median in bp['medians']:
    median.set(color='blue', linewidth=2)
for flier in bp['fliers']:
    flier.set(marker='o', color='black', alpha=0.5)
Graph 18 –Box Whisker Chart
Following is one more piece of code, which draws box plots of three samples drawn from Gaussian distributions
from numpy.random import seed
from numpy.random import randn
from matplotlib import pyplot
# fix the seed so the plot is repeatable (the value 1 is an arbitrary choice)
seed(1)
# random numbers drawn from a Gaussian distribution
x = [randn(1000), 5 * randn(1000), 10 * randn(1000)]
# create box and whisker plot
pyplot.boxplot(x)
# show the plot
pyplot.show()
Graph 19 – A Box-Whisker Plot
Scatter plot is an easy to make but interesting visualization which gives a clear picture of how the data is distributed.
Let’s take example of 10 innings played by Sachin, Dhoni, and Kohli and see how their scores are distributed. The code is fairly easy to understand
You can also add legend in the plot by using the following command
legend = ['sachin','kohli','dhoni']
plt.legend(legend)
The plot will now look like this
Graph 20 – A scatter plot
Following is one more scatter plot where you pass a weighted area, and the size of each
circle is based on that area
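import random
import numpy as np
import matplotlib.pyplot as plt
N = 40  # number of points, matching the 40 values sampled below
x = random.sample(range(1,100),N)
y = random.sample(range(1,100),N)
colors = np.random.rand(N)
area = (30*np.random.rand(N))**2
plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()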
Graph 21 – A scatter plot with area of bubble denoting the volume
Create a pie chart for the number of centuries scored by Sachin, Dhoni, Dravid,
import matplotlib.pyplot as plt
size = [100,25,70,50]
colors = ['pink','blue','red','orange']
explode = (0.1,0,0,0)
plt.pie(size, colors=colors, explode=explode)
plt.show()
explode is used to set the first slice apart from the rest of the pie chart. Everything
else in the code is self-explanatory. Below is the plot
Graph 22 – Pie chart showing performance of cricketers
Some cool Visualizations
Create a stacked chart to demonstrate the number of people voting for either Python or Java in 5 countries, namely, India, USA, England, S.A., Nepal
import numpy as np
import matplotlib.pyplot as plt
Python = (20, 35, 30, 35, 27)
Java = (25, 32, 34, 20, 25)
ind = np.arange(len(Python))  # the x locations of the 5 countries
width = 0.35  # the width of the bars: can also be a len(x) sequence
p1 = plt.bar(ind, Python, width)
p2 = plt.bar(ind, Java, width, bottom=Python)
plt.ylabel('Votes')
plt.title('Number of people using Python or Java')
plt.xticks(ind, ('India', 'USA', 'England', 'S.A.', 'Nepal'))
plt.yticks(np.arange(0, 81, 10))
plt.legend((p1, p2), ('Python', 'Java'))
plt.show()
xticks is used to give labels to the x-axis and yticks give labels to the y-axis.
Graph 23 – Stacked Bar graph
cool area graph
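import numpy as np
import matplotlib.pyplot as plt
# create data (illustrative values; the original data is not shown)
x = range(1, 6)
y = [1, 4, 6, 8, 4]
# Change the color and its transparency
plt.fill_between(x, y, color="red", alpha=0.4)
plt.show()
# Same, but add a stronger line on top (edge)
plt.fill_between(x, y, color="red", alpha=0.2)
plt.plot(x, y, color="red", alpha=0.6)
plt.show()
The parameter alpha controls the opacity of the color, from 0 (fully transparent) to 1 (fully opaque). In the first version the fill uses 0.4.
In the second version the fill uses 0.2 and the line drawn on top as the edge uses 0.6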
Graph 24 – An area graph
One of the most important things is to understand when to use which graph, and to keep a list of all the graphs you know.
There are four types of information which we can display using any plot:- 1. Distribution 2. Comparison 3. Relationship 4. Composition
1. Distribution shows how diversely the data is distributed in your data set. How many people are from which state of the country?
a. Histogram – If you have few data points
b. Line Histogram – When you have a lot of data points
c. Scatter plot – When you have to show the distribution of 2-3 variables
2. Comparison – When you have to compare something over 2 or more categories
a. Variable width chart – When you have to compare two variables per item
b. Tables with embedded charts – When there are many categories, basically a matrix of charts
c. Horizontal or Vertical Histogram – When there are few categories in a data set
d. If you want to compare something over time
i. Line Chart
ii. Bar Vertical Chart
iii. Many categories line chart
3. Relationship Charts – When you want to see the relationship between
two or more variables then you have to use relationship charts
a. Scatter Plot
b. Scatter plot bubble chart
4. Composition Charts – When you have to show a percentage or composition
a. Pie Chart – Very basic plot when there are 3-6 categories
b. Stacked 100% bar chart with sub component – When you have to show components of components
c. Stacked 100% bar chart – When you have to look into the contribution of each component
d. Stacked area chart – When relative and absolute difference matters
Data visualization is the discipline of trying to understand data by placing it in a visual context so that patterns, trends and correlations that might not otherwise be detected can be exposed. It is a basic but very important weapon in your Data Science career. Python is blessed with some good libraries for visualizations.
Open Jupyter notebook or any other IDE of your preference.
Library to use – There are a lot of good visualization libraries, but Matplotlib is the most preferred one to start with because of its simple implementation. So, we will mostly concentrate on Matplotlib.
Import the library and give it the standard alias plt: import matplotlib.pyplot as plt
Following are the two important functions which will come
handy in this book:-
To display a chart you should use – plt.show()
To save the chart as an image, use the code – plt.savefig(“Filename.png”)
Popular plotting libraries in Python are:-
1. Matplotlib – Best to start with. It provides easy implementation and gives a lot of freedom
2. Seaborn – It has a high level interface and great default styles
3. Plotly – To create interactive plots
4. Pandas Visualization – Easy interface, built on Matplotlib
A line chart or line graph is a type of chart which displays information as a series of data points called ‘markers’ connected by straight line segments.
So, a line plot is a very basic plot which is used to show observations collected after a regular interval. The x-axis represents the interval and the y-axis represents the values.
Let’s plot our first graph
import matplotlib.pyplot as plt
x = [1,2,3,4,5,6]
y = [10,12,20,21,30,35]
plt.plot(x,y)
Here is what you will get
Graph 1 – Basic Line Chart
Plot a cos graph using line plot
import matplotlib.pyplot as plt
from numpy import cos
x = [x*0.01 for x in range(100)]
y = cos(x)
plt.plot(x,y)
plt.show()
Here is what you get as a cos graph
Graph 2 – Cos graph using line plot
You know how to plot a line graph, but some important things are missing from the graph, i.e. the x-axis and y-axis labels and the plot title. Let’s create another line plot for the number of students in a class using the following data
c = [1,2,3,4,5,6] student = [40,52,50,61,70,78]
Following commands are used to put x-axis label, y-axis label, and chart title
The code is given below
c = [1,2,3,4,5,6]
student = [40,52,50,61,70,78]
plt.xlabel("Class")
plt.ylabel("Number of Students")
plt.title("Class vs Number of students")
plt.plot(c, student)
Graph 3 – Class vs Number of Students chart with proper labels and plot title
Do you want to change the color of the line? Try the following code instead to make the line green
plt.plot(c,student,color='g')
Graph 4 – Adding color to the same graph
Multi Line Chart
You can also add multiple plots in the same graph. Let’s try to put a couple of new lines in the graph i.e. number of teachers and average marks
Graph 5 – Adding multiple lines to a graph
To add a legend, you have to give label to each of the line which you want to plot and after that you specify a location to the legend
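The code is self-explanatory and is given below:-
c = [1,2,3,4,5,6]
student = [40,52,50,61,70,78]
avg_marks = [34,43,54,44,50,55]
num_of_teachers = [10,12,13,10,15,10]
# label each line so that it shows up in the legend (the label text is illustrative)
plt.plot(c, student, label="Number of students")
plt.plot(c, avg_marks, label="Average marks")
plt.plot(c, num_of_teachers, label="Number of teachers")
#plt.ylabel("Number of Students")
plt.title("Class vs Number of students")
plt.legend(loc="upper left")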
“A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally.”
After the line chart, the second basic but highly used chart is the bar chart
To create a bar chart – plt.bar(x,y)
We will plot few graphs first and then you can put labels, title, and legends later.
import matplotlib.pyplot as plt
a = ['Apple','Mango','Pineapple']
b = [40,60,50]
plt.bar(a,b)
Graph 6 – A simple bar chart
Use random values between 1 and 100 to create the same graph.
import matplotlib.pyplot as plt
from random import seed
from random import randint
seed(123)
x = ['Apple','Mango','Pineapple']
y = [randint(1,100),randint(1,100),randint(1,100)]
plt.bar(x,y)
Graph 7 – Bar chart with random values
Adding color, labels, and title to the random values bar chart
Stacked 100% bar chart with sub component When you have to show components of components like the graph below
Histograms are density estimates. A density estimate gives a good impression of the distribution of the data. The idea is to locally represent the data density by counting the number of observations in a sequence of consecutive intervals (bins).
To plot a histogram use this code – plt.hist(x, bins=n), where x is the data and n is the number of bins
A simple histogram plot
q = [1,2,34,5,44,66,66,90,33,45,2,1,2,3,4]
plt.hist(q,bins = 3,color='green')
Graph 9 – A simple histogram
Create a list of random values and plot it in 4 bins
import random
my_rand = random.sample(range(1,30),20)
print(my_rand)
print(type(my_rand))
plt.hist(my_rand,bins=4,color='orange')
Graph 10 – A histogram made with random variables
In Histogram also you can add more than one data points to make parallel bars.
import numpy as np
import matplotlib.pyplot as plt
name = ['Nitin','Saurabh','Rahul','Gaurav','Amit']
run = [200,70,130,120,100]
plt.barh(name,run,color='orange')
plt.xlabel("Runs Scored")
plt.ylabel("Cricketer")
plt.title("Runs scored by cricketers")
plt.show()
Graph 12 – A horizontal bar chart
Keep making irrelevant and unnecessary graphs. Keep practicing 🙂
You always have to read and write files when working for a company or Hackathon. So, it’s necessary to know how to read different types of files.
Let’s start the boring but important part
The most important command to open a file in Python is the open() function. It takes two parameters: the name of the file and the access mode.
Python has 4 modes to access a file:-
1. "r" – Read – Opens a file for reading (the default); fails if the file does not exist
2. "a" – Append – Appends to a file, creating it if it does not exist
3. "w" – Write – Writes to a file, creating it if needed and overwriting any existing content
4. "x" – Create – Creates the specified file; fails if the file already exists
Apart from these you can also specify the format in which you want to open the file:
1. "t" for text (default)
2. "b" for binary
Open a file
x = open("Analytics.txt", "rt")
This opens the file for reading in text format
Read the file
You can also read the file line by line by the following method or by using readline() method
Write something in a file
Delete a file
Use the "os" package and then run the remove() command
import os
os.remove("file name")
God forbid, if you ever have to delete a folder and want to look cool in front of your friends, you can use the following command. Note that os.rmdir() only removes an empty directory
os.rmdir("Name of directory")
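The file operations described above can be tried end to end in one go. The sketch below writes, appends, reads, and finally deletes a file. It works inside a temporary folder so that nothing on your machine is touched, and the file contents are made up for illustration.

```python
import os
import tempfile

# Work in a throwaway folder so the example leaves nothing behind.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "Analytics.txt")

# "w" creates the file (or overwrites it) and writes to it.
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

# "a" adds to the end of the existing file.
with open(path, "a") as f:
    f.write("third line\n")

# "rt" reads the file as text; read() returns the whole content as one string.
with open(path, "rt") as f:
    whole_text = f.read()

# Looping over the file object reads it line by line.
with open(path, "rt") as f:
    lines = [line.rstrip("\n") for line in f]

# readline() returns a single line, including its newline character.
with open(path, "rt") as f:
    first_line = f.readline()

print(lines)

# Delete the file first, then the (now empty) folder.
os.remove(path)
os.rmdir(folder)
```

The `with` statement closes the file automatically, which is why there is no explicit close() call.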
Reading CSV file Comma Separated Values or CSV file format is one of the most used file formats and you will definitely come across reading a csv file often. In order to read it, you should ideally import pandas library
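With pandas, reading a CSV file is typically a single call to pandas.read_csv() with the file name. The sketch below shows the same idea with Python's built-in csv module instead, so that it runs without installing anything. That is a swapped-in technique, not the pandas approach recommended above, and the column names and values are made up.

```python
import csv
import io

# Stand-in for the content of a CSV file (made-up data).
csv_text = "name,runs\nSachin,100\nDhoni,25\n"

# DictReader uses the first row as the header and yields one dict per data row.
# For a real file, pass an open file object instead of io.StringIO.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# Every value comes back as a string, so convert before doing arithmetic.
total_runs = sum(int(row["runs"]) for row in rows)

print(rows[0]["name"], total_runs)
```

The csv module leaves type conversion to you, whereas pandas guesses column types on its own.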
There are a lot of file formats, but we covered only those which are of utmost importance. In case you need more information, try this link from Data Camp or you can trust your best friend Stack Overflow 😛
If you need information about a specific file format, do comment below.
The reason why I put interview questions as the title of a lot of posts is because:– 1. It makes you click on the post 2. It makes you feel that these are very important questions and you can nail an interview with it 3. These are actual interview questions asked in companies like Myntra, Flipkart, BookMyShow, WNS, Sapient, etc. 4. You have to practice to become perfect. You can practice here or anywhere else. But make sure you know all the questions given below.
So let’s begin without any nonsense. Let’s start with the questions 😛
1. Which data type is mutable and ordered in Python? List
2. Can a dictionary contain another dictionary? Yes, a dictionary can contain another dictionary. Nesting like this is a common way to represent structured data.
3. When to use list, set or dictionaries in Python? A list keeps order and a set doesn’t. A dict keeps insertion order from Python 3.7 onwards, but you look items up by key, not by position. When you care about order and position, therefore, you must use list (if your choice of containers is limited to these three, of course ;-).
dict associates a value with each key, while list and set just contain values: very different use cases, obviously. set requires items to be hashable, list doesn’t: if you have non-hashable items, therefore, you cannot use set and must instead use list.
4. WAP (write a program) where you first create an empty list and then add the elements.
basic_list = []
basic_list.append('Alpha')
basic_list.append('Beta')
basic_list.append('Gamma')
5. What does this mean: *args, **kwargs? And why would we use it? We use *args when we aren’t sure how many arguments are going to be passed to a function, or if we want to pass a stored list or tuple of arguments to a function. **kwargs is used when we don’t know how many keyword arguments will be passed to a function, or it can be used to pass the values of a dictionary as keyword arguments. The identifiers args and kwargs are a convention. You could also use *bob and **billy, but that would not be wise.
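Both uses of the stars can be shown with one small function. The function name, parameter names, and data below are made up for illustration.

```python
def describe_order(customer, *dishes, **options):
    """Accept any number of dishes and any number of keyword options."""
    # dishes arrives as a tuple and options as a dict.
    extras = ", ".join(f"{key}={value}" for key, value in sorted(options.items()))
    return f"{customer}: {len(dishes)} dish(es) [{extras}]"


# Unpacking a stored list and a stored dictionary into the call.
stored_dishes = ["Paratha", "Lassi"]
stored_options = {"spicy": True, "takeaway": False}

summary = describe_order("Asha", *stored_dishes, **stored_options)
print(summary)
```

Inside the function the star collects arguments, and at the call site the same star spreads a stored list or dictionary back out into separate arguments.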
6. What are negative indexes and why are they used? Sequences in Python are indexed with both positive and negative numbers. Positive indexes start at 0 for the first element, 1 for the second, and so on. Negative indexes count from the end: -1 is the last element, -2 is the second last, and so on. They are used to reach elements at the end of a sequence without knowing its length.
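A short example, using a made-up list of scores, shows how negative indexes behave in both indexing and slicing.

```python
scores = [10, 20, 30, 40, 50]

last = scores[-1]             # last element
second_last = scores[-2]      # second element from the end
last_three = scores[-3:]      # slice holding the final three elements
without_last = scores[:-1]    # everything except the last element

print(last, second_last, last_three, without_last)
```

None of these lines needs to know how long the list is, which is the main reason negative indexes are handy.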
7. Randomly shuffle the content of a list
8. Take a random sample of 20 elements and put it in a list
9. Take a list and sort it
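Questions 7 to 9 can be answered with the random module and the built-in sorting tools. The sketch below is one possible answer. The fixed seed is an assumption added only so that repeated runs give the same result.

```python
import random

rng = random.Random(123)   # seeded generator so the results are repeatable

# 7. Randomly shuffle the content of a list (shuffle works in place).
deck = list(range(1, 11))
rng.shuffle(deck)

# 8. Take a random sample of 20 elements (without repeats) and put it in a list.
sample = rng.sample(range(1, 100), 20)

# 9. Sort a list: sorted() returns a new list, while .sort() changes the list itself.
ascending = sorted(sample)
sample.sort(reverse=True)

print(deck)
print(ascending)
```

Use sorted() when you want to keep the original list untouched, and .sort() when you don't need the old order any more.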
10. Explain split() and sub() function from the “re” package split() – uses a regex pattern to “split” a given string into a list sub() – finds all substrings where the regex pattern matches and then replace them with a different string
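A quick illustration of both functions follows. The patterns and input strings are made up for the example.

```python
import re

# split(): break a string wherever the pattern matches.
# Here the separator is a comma or a semicolon, followed by optional spaces.
parts = re.split(r"[,;]\s*", "Apple, Mango;Pineapple")

# sub(): replace every match of the pattern with another string.
# Here each digit is replaced by a hash character.
masked = re.sub(r"\d", "#", "Order 66 shipped in 3 days")

print(parts)
print(masked)
```

Compared with the plain str.split() and str.replace(), the regex versions can match a whole family of separators or substrings in one go.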
11. What are the supported data types in Python? The most important data types include the following: 1. Number 2. String 3. List 4.Tuple 5. Dictionary 6. Set
12. What is the function to reverse a list? list.reverse()
13. How to remove the last object from the list? list.pop() removes and returns the last object from the list. list.pop(i) removes and returns the object at index i.
14. What is a dictionary? A dictionary is one of the built-in data types in Python. It defines a mapping of unique keys to values (insertion order is preserved from Python 3.7 onwards). Dictionaries are indexed by keys, and the values can be any valid Python data type (even a user-defined class). Notably, dictionaries are mutable, which means they can be modified. A dictionary is created with curly braces and indexed using the square bracket notation.
15. Python is an object oriented language. What are the features of an object oriented programming language? OOP is the programming paradigm based on classes and instances of those classes called objects. The features of OOP are: Encapsulation, Data Abstraction, Inheritance, Polymorphism.
16. What is the difference between append() and extend() method? Both append() and extend() are methods of list. These methods are used to add elements at the end of the list. append(element) – adds the given element as a single item at the end of the list on which it is called. extend(another-list) – adds each element of another-list at the end of the list on which it is called.
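The difference shows up clearly when the thing being added is itself a list, as in this small example.

```python
appended = [1, 2]
appended.append([3, 4])    # the whole list is added as ONE element

extended = [1, 2]
extended.extend([3, 4])    # each element of the other list is added separately

print(appended)   # a nested list at the end
print(extended)   # a flat list
```

So append() always grows the list by exactly one item, while extend() grows it by as many items as the other list holds.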
17. Write a program to check if a string is a palindrome. A palindrome is a string which reads the same in both directions, like aba, nitin, nurses run, etc.
Below is the code, write it down yourself 😛
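Here is one possible way to write the check. Ignoring spaces, punctuation, and letter case is an assumption made so that a phrase like "nurses run" counts as a palindrome. The function name is made up.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    # Keep only letters and digits, in lower case, so spaces and case don't matter.
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


for word in ["aba", "nitin", "nurses run", "pizza"]:
    print(word, "->", is_palindrome(word))
```

The slice `[::-1]` produces a reversed copy, so the whole check is a single comparison.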
18. Take a random list and plot a histogram with 3 bins.
19. What is the difference between the range() and xrange() functions in Python? In Python 2, range() returns a list whereas xrange() returns an object that generates numbers on demand. In Python 3, xrange() no longer exists and range() itself generates numbers on demand.
20. Guess the output of the following code x = “Fox ate the pizza” print(x[:7])
You can find Python interview questions on many websites, and we will keep on updating this list.