Data visualization is the discipline of trying to understand data by placing it in a visual context so that patterns, trends, and correlations that might otherwise go undetected can be exposed.
Python has several good libraries for visualization.
Open a Jupyter notebook or any other IDE of your preference.
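Library to use – There are several plotting libraries for Python (see the list below). The examples here use Matplotlib.
Import the library and give it the standard alias plt:
import matplotlib.pyplot as plt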
The following two functions will come in handy throughout this book:
To display a chart, use plt.show()
To save the chart as an image, use plt.savefig("Filename.png")
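As a minimal sketch of the saving step, the snippet below draws a small line and writes it to a PNG file. The data, the output filename, and the dpi and bbox_inches settings are illustrative assumptions, and the Agg backend is selected only so the sketch runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

plt.figure()
plt.plot([1, 2, 3], [2, 4, 6])  # illustrative data only

# Save before showing: with an interactive backend the figure can be
# discarded once its window is closed, which would leave a blank image.
out_path = os.path.join(tempfile.gettempdir(), "first_chart.png")  # hypothetical filename
plt.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close()
```

If both functions are used in one script, calling plt.savefig() before plt.show() is the safer order.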
Popular plotting libraries in Python are:
1. Matplotlib – Best to start with. It provides easy implementation and gives a lot of freedom.
2. Seaborn – It has a high-level interface and great default styles.
3. Plotly – To create interactive plots.
4. Pandas Visualization – Easy interface, built on Matplotlib.
Line Chart
A line chart or line graph is a type of chart which displays information as a series of data points called ‘markers’ connected by straight line segments.
So, a line plot is a very basic plot which is used to show observations collected after a regular interval. The x-axis represents the interval and the y-axis represents the values.
Let's plot our first graph:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5, 6]
y = [10, 12, 20, 21, 30, 35]
plt.plot(x, y)
plt.show()
Here is what you will get
Graph 1 – Basic Line Chart
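By default plt.plot() draws only the connecting line. To make the individual data points (the markers from the definition above) visible, a marker style can be passed. The sketch below reuses the same data; the choice of "o" (a circle) is just one of the available marker styles.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5, 6]
y = [10, 12, 20, 21, 30, 35]

plt.figure()
# marker="o" draws a circle at every (x, y) pair; the line still connects them
(line,) = plt.plot(x, y, marker="o")
```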
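Plot a cos graph using a line plot:
import numpy as np
import matplotlib.pyplot as plt
x = [i * 0.01 for i in range(100)]  # 0.00, 0.01, ..., 0.99
y = np.cos(x)  # NumPy's cos works element-wise on the whole list
plt.plot(x, y)
plt.show()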
Here is what you get as a cos graph
Graph 2 – Cos graph using line plot
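The x values above only cover 0 to 0.99, so the plot shows a small part of the curve. As a sketch of how a full period could be drawn, the snippet below samples 0 to 2π with np.linspace and plots sin and cos together. The range and the number of sample points (200) are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

# 200 evenly spaced points covering one full period
x = np.linspace(0, 2 * np.pi, 200)

plt.figure()
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), label="cos(x)")
plt.legend()
```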
You now know how to plot a line graph, but some important things are still missing from it: the x-axis and y-axis labels and the plot title. Let's create another line plot for the following data:
c = [1,2,3,4,5,6]
student = [40,52,50,61,70,78]
The following commands set the x-axis label, y-axis label, and chart title:
plt.xlabel("Label")
plt.ylabel("Label")
plt.title("Title")
The code is given below:
c = [1, 2, 3, 4, 5, 6]
student = [40, 52, 50, 61, 70, 78]
plt.xlabel("Class")
plt.ylabel("Number of Students")
plt.title("Class vs Number of students")
plt.plot(c, student)
plt.show()
Graph 3 – Class vs Number of Students chart with proper labels and plot title
Do you want to change the color of the line?
Try the following code instead to make the line green:
plt.plot(c, student, color='green')
Graph 4 – Adding color to the same graph
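Color is only one of the styling keywords that plt.plot() accepts. The sketch below combines it with linestyle, linewidth, and marker on the same class data. The particular values (dashed line, width 2, circle markers) are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

c = [1, 2, 3, 4, 5, 6]
student = [40, 52, 50, 61, 70, 78]

plt.figure()
# A green dashed line, slightly thicker than the default, with circle markers
(line,) = plt.plot(c, student, color="green", linestyle="--", linewidth=2, marker="o")
plt.xlabel("Class")
plt.ylabel("Number of Students")
```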
You can also draw multiple lines in the same graph. Let's add a couple of new lines to the graph: the number of teachers and the average marks.
Graph 5 – Adding multiple lines to a graph
To add a legend, you have to give each line a label and then call plt.legend(). The code is self-explanatory and is given below:
c = [1, 2, 3, 4, 5, 6]
student = [40, 52, 50, 61, 70, 78]
avg_marks = [34, 43, 54, 44, 50, 55]
num_of_teachers = [10, 12, 13, 10, 15, 10]
plt.xlabel("Class")
#plt.ylabel("Number of Students")
plt.title("Class vs Number of students")
plt.plot(c, student, color='orange', label='Student')
plt.plot(c, avg_marks, color='red', label='Marks')
plt.plot(c, num_of_teachers, color='green', label='Teachers')
plt.legend(loc="upper left")
plt.show()
Bar Chart
“A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally.”
After the line chart, the bar chart is the second basic chart type.
To create a bar chart – plt.bar(x,y)
We will plot a simple bar chart with three categories:
import matplotlib.pyplot as plt
a = ['Apple', 'Mango', 'Pineapple']
b = [40, 60, 50]
plt.bar(a, b)
plt.show()
Graph 6 – A simple bar chart
Use random values between 1 and 100 to create the same graph.
import matplotlib.pyplot as plt
from random import seed, randint
seed(123)  # fixed seed so the same values are generated on every run
x = ['Apple', 'Mango', 'Pineapple']
y = [randint(1, 100), randint(1, 100), randint(1, 100)]  # randint includes both endpoints
plt.bar(x, y)
plt.show()
Graph 7 – Bar chart with random values
Adding color, labels, and title to the random values bar chart
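A minimal sketch of this step follows, reusing the seeded random values from the previous chart. The bar colors, the axis label text, and the title are assumptions chosen for illustration.

```python
from random import seed, randint

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

seed(123)  # fixed seed so the bars are the same on every run
x = ["Apple", "Mango", "Pineapple"]
y = [randint(1, 100) for _ in x]

plt.figure()
# A list of colors assigns one color per bar (illustrative choices)
bars = plt.bar(x, y, color=["red", "orange", "gold"])
plt.xlabel("Fruit")
plt.ylabel("Random value")
plt.title("Random values per fruit")
```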
100% stacked bar chart
Use this chart when you have to show how each bar is divided into components, with every component expressed as a percentage of the bar's total, like the graph below.
Example of 100% bar chart
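import numpy as np
import matplotlib.pyplot as plt
x = ["a", "b", "c", "d"]
y1 = np.array([3, 8, 6, 4])
y2 = np.array([10, 2, 4, 3])
y3 = np.array([5, 6, 2, 5])
snum = y1 + y2 + y3
# normalization: convert each series to a percentage of the bar total
y1 = y1 / snum * 100.
y2 = y2 / snum * 100.
y3 = y3 / snum * 100.
plt.figure(figsize=(4, 3))
# stack bars: each series starts where the previous ones end
plt.bar(x, y1, label='y1')
plt.bar(x, y2, bottom=y1, label='y2')
plt.bar(x, y3, bottom=y1 + y2, label='y3')
plt.legend()
plt.show()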
Graph 8 – A 100% stacked bar chart
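With more than three series, writing out each bottom by hand gets repetitive. The sketch below shows one way the same chart could be built in a loop, keeping a running bottom that grows as each series is stacked. Holding the series in a dictionary and the name bottom are choices made for this illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

categories = ["a", "b", "c", "d"]
series = {
    "y1": np.array([3, 8, 6, 4]),
    "y2": np.array([10, 2, 4, 3]),
    "y3": np.array([5, 6, 2, 5]),
}

totals = sum(series.values())        # per-category total across all series
bottom = np.zeros(len(categories))   # where the next segment starts

plt.figure(figsize=(4, 3))
for name, values in series.items():
    share = values / totals * 100.0  # percentage of the bar total
    plt.bar(categories, share, bottom=bottom, label=name)
    bottom += share                  # the next series stacks on top of this one
plt.ylabel("Share (%)")
plt.legend()
```

After the loop, bottom holds the top of every bar, which should be 100 for each category.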
Histogram
Histograms are density estimates. A density estimate gives a good impression of the distribution of the data. The idea is to locally represent the data density by counting the number of observations in a sequence of consecutive intervals (bins).
To plot a histogram, use plt.hist(), as in the examples below.
A simple histogram plot:
q = [1, 2, 34, 5, 44, 66, 66, 90, 33, 45, 2, 1, 2, 3, 4]
plt.hist(q, bins=3, color='green')
plt.show()
Graph 9 – A simple histogram
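To see the counting that the histogram performs, it can help to inspect the values plt.hist() returns: the count per bin and the bin edges. The sketch below uses the same list q and prints one line per bin. The variable names counts and edges are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

q = [1, 2, 34, 5, 44, 66, 66, 90, 33, 45, 2, 1, 2, 3, 4]

plt.figure()
# plt.hist returns the per-bin counts, the bin edges, and the drawn bars
counts, edges, _ = plt.hist(q, bins=3, color="green")

# Three equal-width bins between min(q) = 1 and max(q) = 90
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.2f} to {right:6.2f}: {int(n)} values")
# counts -> 8, 4, 3
```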
Create a list of random numbers and plot it in 4 bins:
import random
my_rand = random.sample(range(1, 30), 20)  # 20 distinct integers from 1 to 29
print(my_rand)
print(type(my_rand))
plt.hist(my_rand, bins=4, color='orange')
plt.show()
Graph 10 – A histogram made with random variables
A histogram can also show more than one data set at a time, drawn as side-by-side bars.
import random
my_rand = random.sample(range(1, 30), 20)
my_rand2 = random.sample(range(1, 25), 20)
print(my_rand)
print(type(my_rand))
plt.hist([my_rand, my_rand2], bins=4, color=['green', 'red'])
legend = ['Rand1', 'Rand2']
plt.legend(legend)
# In a histogram the x-axis holds the values and the y-axis the count per bin
plt.xlabel("Random Number")
plt.ylabel("Frequency")
plt.title("Random Variable distribution")
plt.show()
Graph 11 – Parallel histogram
Horizontal Bar Chart
import matplotlib.pyplot as plt
name = ['Nitin', 'Saurabh', 'Rahul', 'Gaurav', 'Amit']
run = [200, 70, 130, 120, 100]
# plt.barh draws a horizontal bar chart: categories on the y-axis, values on the x-axis
plt.barh(name, run, color='orange')
plt.xlabel("Runs Scored")
plt.ylabel("Cricketer")
plt.title("Runs scored by cricketers")
plt.show()
Graph 12 – A horizontal bar chart
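Horizontal bars are often easier to compare when they are ordered by value. As a sketch, the snippet below sorts the cricketers by runs before plotting. Sorting is an addition for illustration, not part of the chart above, and the names sorted_names and sorted_runs are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

name = ["Nitin", "Saurabh", "Rahul", "Gaurav", "Amit"]
run = [200, 70, 130, 120, 100]

# barh places the first item at the bottom, so an ascending sort
# puts the highest scorer at the top of the chart
pairs = sorted(zip(run, name))
sorted_runs = [r for r, _ in pairs]
sorted_names = [n for _, n in pairs]

plt.figure()
plt.barh(sorted_names, sorted_runs, color="orange")
plt.xlabel("Runs Scored")
plt.ylabel("Cricketer")
plt.title("Runs scored by cricketers")
```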
Keep making irrelevant and unnecessary graphs.
Keep practicing 🙂
XtraMous