What is Matplotlib in Python?
What is Matplotlib in Python? This tutorial answers that question. Handling data is a skill. Today, massive amounts of data are generated and consumed, and much of it goes unused. Handling this data effectively is therefore the main goal of Data Science. Various programming languages can be used to work with data sets that need operations such as calculating statistics, analyzing sales and marketing figures, and plotting graphs.
The following content will enable us to get a detailed view on how data can be plotted using Python Matplotlib:
- Description on Matplotlib Python
- Python Matplotlib vs MATLAB
- Syntax with a Basic Example
- Common Terminologies
- Plot Manipulation Description
- Ways of Plotting (Types)
Description on Matplotlib Python
Matplotlib is a plotting library for Python that can be used in Python scripts and in interactive sessions. Plotting graphs is a part of data visualization, and Matplotlib provides this capability.
Matplotlib works with many general-purpose GUI toolkits, such as wxPython, Tkinter, and Qt, and provides an object-oriented API for embedding plots into applications. John D. Hunter was the person who originally wrote Matplotlib, and its lead developer was Michael Droettboom. SciPy is a free and open-source Python library used for technical and scientific computing. Matplotlib is widely used alongside SciPy, as most scientific calculations require plotting of graphs and diagrams.
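The object-oriented API mentioned above can be sketched as follows. This is a minimal illustration rather than part of the original examples: the data values and the title text are made up, and the Agg backend is selected only so that the snippet also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter or a GUI session
import matplotlib.pyplot as plt

# Create the Figure and a single Axes explicitly instead of relying on pyplot's implicit state
fig, ax = plt.subplots()

# Methods are called on the Axes object; plot() returns a list of Line2D objects
line, = ax.plot([1, 2, 3], [2, 4, 8])
ax.set_title("Object-oriented API")
```

Keeping explicit fig and ax objects is what makes it practical to place a plot inside a larger application, since the code always knows which figure it is drawing on.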
Python (Matplotlib) vs. MATLAB
Python Programming | MATLAB |
It is an open-source programming language, free to use. | MATLAB is a commercial platform. Hence, it is not free. |
Matplotlib is more flexible and capable for plotting. | Plotting is comparatively not as flexible and capable as Python plotting. |
Python provides a large number of libraries to work with. | It is tricky to add libraries and work with them in MATLAB. |
Python is an easy-to-read and powerful programming language. | MATLAB is not as powerful as Python. |
Matplotlib plotting is faster in Python. | Plotting of data in MATLAB requires time and effort. |
An integrated development environment (IDE) has to be installed separately. | An IDE is provided within the MATLAB environment. |
Code can be used in multiple systems. It is portable. | Code portability is restricted. |
Namespace is supported in Python. | Core of MATLAB does not support namespace. |
Syntax of Matplotlib Python with a Basic Example
Matplotlib is usually imported as matplotlib.pyplot under the alias plt. Pyplot is used for plot and figure manipulation.
matplotlib.pyplot enables Python Matplotlib to operate much like MATLAB. Let's see how to import Matplotlib in Python.
Python Matplotlib Example:
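import matplotlib.pyplot as plt
# Each call draws one line; a single list is taken as the y values
plt.plot([1,1])
plt.plot([2,2])
plt.plot([3,3])
plt.show()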
This code plots three horizontal straight lines, at y = 1, y = 2, and y = 3. We make this possible by using the plotting library, Matplotlib.
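When plot() receives a single list, Matplotlib treats the values as y coordinates and uses the list indices (0, 1, and so on) as x coordinates, which is why plt.plot([1,1]) draws a horizontal line. The small sketch below checks this through the Line2D object that plot() returns. The variable names are illustrative, and the Agg backend line is an assumption made so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

line, = plt.plot([1, 1])            # only y values are given
x_values = list(line.get_xdata())   # x defaults to the list indices
y_values = list(line.get_ydata())
print(x_values, y_values)
plt.close("all")
```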
Common Terminologies
- Plot: It is an illustration that can be represented using a graph.
import matplotlib.pyplot as plt
plt.plot([1,1])
plt.show()
When we pass [1,1] to plot(), the output is a horizontal line at y = 1.
- Figure: It is the overall window or canvas that holds one or more plots.
Example for figure():
import matplotlib.pyplot as plt
# Each figure() call creates or selects a separate figure
plt.figure(1)
plt.plot([1,1])
plt.figure(2)
plt.plot([1,2])
plt.show()
figure(1) creates the first figure, which shows plot([1,1]), and figure(2) creates the second figure, which shows plot([1,2]).
- Label: It is used to add labels or names to respective x and y axes.
- Title: It is used to display the title of the graph.
Example for label() and title():
import matplotlib.pyplot as plt
plt.plot([1,2,1,2])
plt.title('GRID REPRESENTATION')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
In the above graph, the horizontal axis is labeled as ‘X-axis’ and the vertical axis is labeled as ‘Y-axis’, and the title is displayed as ‘GRID REPRESENTATION’.
- Grid: It is a set of horizontal and vertical reference lines drawn across the plot area, which makes values easier to read off the graph.
Example for grid():
import matplotlib.pyplot as plt
plt.plot([1,2,1,2])
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid()
plt.show()
A grid-based representation is displayed in the above output, and it helps locate specific regions in the graph.
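- Subplot: The subplot() function can be called to draw multiple plots in the same figure.
Example for subplot():
import matplotlib.pyplot as plt
# subplot(rows, columns, index): a grid of 2 rows and 1 column, first cell
plt.subplot(2,1,1)
plt.plot([1,4])
# Second cell of the same 2 x 1 grid
plt.subplot(2,1,2)
plt.plot([2,2])
plt.show()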
The above representation explains how subplots are obtained and how two subplots are plotted in the same figure.
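As an alternative sketch, plt.subplots() creates the figure and all of its subplots in one call and returns them as objects. The layout mirrors the example above (two rows, one column). The sharex=True option and the subplot titles are extras added here purely for illustration, and the Agg backend line is an assumption made so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

# Two rows, one column; both subplots share the same x-axis
fig, axes = plt.subplots(2, 1, sharex=True)

axes[0].plot([1, 4])
axes[0].set_title("Top subplot")

axes[1].plot([2, 2])
axes[1].set_title("Bottom subplot")

fig.tight_layout()  # adjust spacing so titles and tick labels do not collide
```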
Plot Manipulation Description
- Plot creation: Creating a plot is the key step, where we set up the figure and the axes on which the data is drawn. How this is done depends on the module that is used in Python.
- Plot routines: Visualization techniques for viewing data, from the simplest to the most advanced forms, are part of plot routines.
- Plot customization: This includes adding plot titles, legends, axis labels, layouts, etc.
- Manipulation of plots also includes saving plots, displaying figures, and clearing the contents of axes and figures.
- Images, colors, and text are some of the features that can be included within a plot.
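The steps listed above can be sketched in a single short script. This is only one possible sequence, under a few assumptions: the data is the Enfield series used later in this tutorial, the title and label text are made up, the figure is saved into an in-memory buffer (a file name such as "bikes.png" would work the same way), and the Agg backend is selected so that the snippet runs without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

# Plot creation: initialize the figure and the axes
fig, ax = plt.subplots()

# Plot routine: draw the data
ax.plot([1, 2, 3, 4, 5], [50, 40, 70, 80, 20], label="Enfield")

# Plot customization: title, axis labels, legend
ax.set_title("Distance per day")
ax.set_xlabel("Days")
ax.set_ylabel("Distance in kms")
ax.legend()

# Saving: write the figure as PNG data into a buffer
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()

# Clearing: remove everything drawn on the axes, then release the figure
ax.clear()
lines_after_clear = len(ax.lines)
title_after_clear = ax.get_title()
plt.close(fig)
```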
Matplotlib Python Plotting Ways (Types)
There are various plotting techniques or ways that can be carried out on the data provided, and some of these plotting types are as follows:
Line Plot
A line plot connects data points with straight line segments, which makes it easy to see how a value changes along the x-axis. It is one of the simplest and most commonly used plotting methods, and a common starting point for plotting.
Let us now look at a real-world scenario:
Consider a survey of the distance covered by the following bikes over a span of five days. The data collected can be plotted using different plotting methods.
We will make use of Jupyter Notebook to run the code that represents the following data in plots.
DISTANCE COVERED BY BIKES IN KMS
DAYS | ENFIELD | HONDA | YAMAHA | KTM |
DAY 1 | 50 | 80 | 70 | 80 |
DAY 2 | 40 | 20 | 20 | 20 |
DAY 3 | 70 | 20 | 60 | 20 |
DAY 4 | 80 | 50 | 40 | 50 |
DAY 5 | 20 | 60 | 60 | 60 |
Example for a line plot:
import matplotlib.pyplot as plt
# Days on the x-axis; one list of distances (in kms) per bike
x = [1,2,3,4,5]
y = [50,40,70,80,20]
y2 = [80,20,20,50,60]
y3 = [70,20,60,40,60]
y4 = [80,20,20,50,60]
plt.plot(x,y,'g',label='Enfield', linewidth=5)
plt.plot(x,y2,'c',label='Honda',linewidth=5)
plt.plot(x,y3,'k',label='Yamaha',linewidth=5)
plt.plot(x,y4,'y',label='KTM',linewidth=5)
plt.title('bike details in line plot')
plt.ylabel('Distance in kms')
plt.xlabel('Days')
plt.legend()
plt.show()
The lines in the above graph have unique colors, and each of them denotes the details of a different bike. The line representing Honda is hidden behind the line representing KTM, since both bikes covered the same distance on each day.
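When two series are identical, as Honda and KTM are here, one way to keep both visible is to draw the second line thinner and dashed on top of the first. The sketch below is a suggestion rather than part of the original example: the line widths, the dashed style, and the marker are illustrative choices, and the Agg backend line is an assumption made so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5]
honda = [80, 20, 20, 50, 60]
ktm = [80, 20, 20, 50, 60]   # same values as Honda

fig, ax = plt.subplots()
# Thick solid line underneath
ax.plot(days, honda, color="c", linewidth=5, label="Honda")
# Thinner dashed line with markers on top, so both series can be seen
ax.plot(days, ktm, color="y", linewidth=2, linestyle="--", marker="o", label="KTM")
ax.set_xlabel("Days")
ax.set_ylabel("Distance in kms")
ax.legend()
```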
Bar Chart Plot
Categorical data can be represented as rectangular blocks whose heights or lengths are proportional to the values. Such a representation is called a bar chart. Bar charts can be drawn both vertically and horizontally.
Example for a bar plot:
import matplotlib.pyplot as plt
# Four bars per day, each 0.2 wide, placed side by side around the day number
plt.bar([0.7,1.7,2.7,3.7,4.7],[50,40,70,80,20],
    label="Enfield",width=.2)
plt.bar([0.9,1.9,2.9,3.9,4.9],[80,20,20,50,60],
    label="Honda", color='r',width=.2)
plt.bar([1.1,2.1,3.1,4.1,5.1],[70,20,60,40,60],
    label="Yamaha", color='y',width=.2)
plt.bar([1.3,2.3,3.3,4.3,5.3],[80,20,20,50,60],
    label="KTM", color='g',width=.2)
plt.legend()
plt.xlabel('Days')
plt.ylabel('Distance (kms)')
plt.title('Bikes details in BAR PLOTTING')
plt.show()
The above plotting shows the bar representation of the given scenario where the bikes are symbolized using different colored blocks, and each block shows the distance covered by the respective bikes on every particular day for a period of five days.
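For the horizontal form of a bar chart, Matplotlib provides barh(), where the bar length replaces the bar height. The sketch below plots only the Enfield data from the table. The day labels, the bar thickness, and the color are illustrative assumptions, and the Agg backend line is there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

days = ["Day 1", "Day 2", "Day 3", "Day 4", "Day 5"]
enfield = [50, 40, 70, 80, 20]

fig, ax = plt.subplots()
# barh(y, width): categories go on the y-axis, values set the bar length
bars = ax.barh(days, enfield, height=0.5, color="b", label="Enfield")
ax.set_xlabel("Distance (kms)")
ax.set_ylabel("Days")
ax.legend()
```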
Area Plot
This type of plotting is basically used for quantitative data. A line chart forms the basis of an area plot, where the region between the axis and the line is represented by colors.
Example for an area plot:
import matplotlib.pyplot as plt
days = [1,2,3,4,5]
Enfield = [50,40,70,80,20]
Honda = [80,20,20,50,60]
Yamaha = [70,20,60,40,60]
KTM = [80,20,20,50,60]
# stackplot() stacks the series on top of each other; labels are used by the legend
plt.stackplot(days, Enfield, Honda, Yamaha, KTM,
    labels=['Enfield','Honda','Yamaha','KTM'],
    colors=['k','c','y','m'])
plt.xlabel('Days')
plt.ylabel('Distance in kms')
plt.title('Bikes details in area plot')
plt.legend()
plt.show()
The above graph shows how an area plot can be drawn for the present scenario. Each shaded band in the graph corresponds to a particular bike, and the thickness of the band on a given day shows the distance covered by that bike on that day.
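The idea that an area plot is a line chart with the region between the line and the axis filled in can be sketched for a single series with fill_between(). This is an illustration under assumptions: only the Enfield data is used, the color and transparency are arbitrary, and the Agg backend line is there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5]
enfield = [50, 40, 70, 80, 20]

fig, ax = plt.subplots()
# The line chart that forms the basis of the area plot
ax.plot(days, enfield, color="k", label="Enfield")
# Fill the region between the line and y = 0 (the default lower boundary)
ax.fill_between(days, enfield, color="k", alpha=0.3)
ax.set_xlabel("Days")
ax.set_ylabel("Distance in kms")
ax.legend()
```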
Pie Plot
In a pie plot, statistical data is represented in a circular graph. The circle is divided into portions called slices, and the size of each slice is proportional to the value it represents. This sort of plot is mainly used in mass media and business.
Example for a pie plot:
import matplotlib.pyplot as plt
Enfield = [50,40,70,80,20]
Honda = [80,20,20,50,60]
Yamaha = [70,20,60,40,60]
KTM = [80,20,20,50,60]
# Each slice is the total distance covered by one bike over the five days
slices = [sum(Enfield), sum(Honda), sum(Yamaha), sum(KTM)]
bikes = ['Enfield','Honda','Yamaha','KTM']
cols = ['c','g','y','b']
plt.pie(slices,
    labels=bikes,
    colors=cols,
    startangle=90,
    shadow=True,
    explode=(0,0.1,0,0),
    autopct='%1.1f%%')
plt.title('Bike details in Pie Plot')
plt.show()
In the above represented pie plot, the bikes scenario is illustrated, and each slice represents a particular bike and the percentage of distance traveled by it.
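The percentage printed on each slice is that slice's share of the sum of all slices. As a worked sketch of the arithmetic, assume that each slice stands for a bike's total distance over the five days in the table. The dictionary and variable names are illustrative.

```python
# Distances per bike, taken from the table of the bikes scenario
distances = {
    "Enfield": [50, 40, 70, 80, 20],
    "Honda": [80, 20, 20, 50, 60],
    "Yamaha": [70, 20, 60, 40, 60],
    "KTM": [80, 20, 20, 50, 60],
}

# Total distance per bike, and the total over all bikes
totals = {bike: sum(values) for bike, values in distances.items()}
grand_total = sum(totals.values())

# Share of each bike in percent, rounded to one decimal like autopct='%1.1f%%'
shares = {bike: round(100 * total / grand_total, 1) for bike, total in totals.items()}

for bike, share in shares.items():
    print(f"{bike}: {totals[bike]} km, {share}%")
```

With these totals (260, 230, 250, and 230 km out of 970 km), Enfield accounts for about 26.8% of the overall distance.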
Scatter Plot
A scatter plot draws each data point as a dot positioned by its x and y values. We can use different colors for different bikes if necessary for better plotting and identification of dots.
Example for a scatter plot:
import matplotlib.pyplot as plt
days = [1, 2, 3, 4, 5]
Y1 = [50, 40, 70, 80, 20]
Y2 = [80, 20, 20, 50, 60]
Y3 = [70, 20, 60, 40, 60]
Y4 = [80, 20, 20, 50, 60]
# One scatter() call per bike, each with its own color and legend label
plt.scatter(days,Y1,label='Enfield',color='r')
plt.scatter(days,Y2,label='Honda',color='b')
plt.scatter(days,Y3,label='Yamaha',color='y')
plt.scatter(days,Y4,label='KTM',color='k')
plt.xlabel('Days')
plt.ylabel('Distance in kms')
plt.title('Bike details in Scatter Plot')
plt.legend()
plt.show()
In the above represented scatter plot, various dots are scattered in the graph. Each colored dot represents a particular bike and the distance covered by it.
3D Plot
In 3-dimensional plotting, data is plotted along the x, y, and z axes. 3D plotting is an advanced plotting technique that gives us a better view of data that varies along all three axes of the graph.
Example for a 3D plot:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = [1,2,3,4,5]
y = [50,40,70,80,20]
y2 = [80,20,20,50,60]
y3 = [70,20,60,40,60]
y4 = [80,20,20,50,60]
# The lines are drawn on the 3D axes; with no z values given, they lie in the z = 0 plane
ax.plot(x,y,color='g',label='Enfield',linewidth=5)
ax.plot(x,y2,color='c',label='Honda',linewidth=5)
ax.plot(x,y3,color='k',label='Yamaha',linewidth=5)
ax.plot(x,y4,color='y',label='KTM',linewidth=5)
ax.set_title('bike details in 3D plot')
ax.set_ylabel('Distance in kms')
ax.set_xlabel('Days')
ax.legend()
plt.show()
In the above 3D graph, a line graph is drawn on 3-dimensional axes. We make use of the mplot3d toolkit, which ships with Matplotlib, to plot 3D graphs, using the following syntax.
Syntax for plotting 3D graphs:
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
Importing Axes3D makes the 3D projection available (recent Matplotlib versions register it automatically), and the projection='3d' keyword creates the 3-dimensional axes. Any data plotted on these axes is then displayed in a 3-dimensional view.
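To use the third axis for something, each bike can be given its own position along one axis. The sketch below is an assumption about how the bikes scenario might be laid out in 3D, not the original example: days go on the x-axis, a bike index on the y-axis, and distance on the z-axis. The Agg backend line is there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (only needed on older Matplotlib versions)

days = [1, 2, 3, 4, 5]
bikes = {
    "Enfield": [50, 40, 70, 80, 20],
    "Honda": [80, 20, 20, 50, 60],
    "Yamaha": [70, 20, 60, 40, 60],
    "KTM": [80, 20, 20, 50, 60],
}

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")

# One line per bike: x = day, y = constant bike index, z = distance
for depth, (name, distance) in enumerate(bikes.items()):
    ax.plot(days, [depth] * len(days), distance, label=name)

ax.set_xlabel("Days")
ax.set_ylabel("Bike index")
ax.set_zlabel("Distance in kms")
ax.legend()
```

With this layout, the Honda and KTM lines no longer coincide, because they sit at different positions along the y-axis.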
Histogram Plot
A histogram groups numerical data into ranges, called bins, and draws a rectangular block for each bin whose height is the number of values that fall into that range. A probability distribution can be estimated using a histogram plot. The data is represented in a continuous manner, based on the data set provided to plot the graph.
Example for a histogram plot:
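import matplotlib.pyplot as plt
# Distances (in kms) covered by all four bikes over the five days
distances = [50,80,70,80,40,20,20,20,70,20,60,20,80,50,40,50,20,60,60,60]
# Bin edges in steps of 10 kms
bins = [0,10,20,30,40,50,60,70,80,90,100]
plt.hist(distances, bins, histtype='stepfilled')
plt.xlabel('Distance in kms')
plt.ylabel('kilometer count')
plt.title('bike details Histogram')
plt.show()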
The above histogram uses the stepfilled pattern. The other histtype values that can be used are bar, barstacked, and step. A histogram does not include spaces between the blocks; it is a continuous structure. The Y-axis shows the distance count, that is, the number of times the same distance was covered by the bikes within the span of five days, and the X-axis shows the distance in kms.
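The counts behind a histogram can also be inspected directly, since hist() returns them together with the bin edges. The sketch below assumes evenly spaced bins of 10 kms from 0 to 100. The variable names are illustrative, and the Agg backend line is there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); omit in Jupyter
import matplotlib.pyplot as plt

# Distances (in kms) covered by all four bikes over the five days
distances = [50, 80, 70, 80, 40, 20, 20, 20, 70, 20,
             60, 20, 80, 50, 40, 50, 20, 60, 60, 60]
bins = list(range(0, 101, 10))   # edges 0, 10, ..., 100

fig, ax = plt.subplots()
# hist() returns the count per bin, the bin edges, and the drawn patches
counts, edges, patches = ax.hist(distances, bins=bins, histtype="stepfilled")

counts = [int(c) for c in counts]
print(counts)   # [0, 0, 6, 0, 2, 3, 4, 2, 3, 0]
```

For example, the value 6 in the third position means that a distance of 20 to 30 kms was recorded six times in the data.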
Conclusion
With this, we come to the end of this module in the Python tutorial. We learned how to implement various types of plotting techniques. Hopefully, this tutorial served as a good demonstration of what is possible with Python Matplotlib. Dealing with large amounts of data and representing them in graphs for better understanding are the major uses of Matplotlib in Python. Now, if you are interested in knowing why Python is the most preferred language for Data Science, you can go through this blog on Python for Data Science.