In this Matplotlib series, we are going to cover “Data Visualization in Python using Matplotlib”. We will discuss what visualization is and why it is necessary for a data scientist. Throughout the series we will explore charts with the help of a case study.
What is data visualization and why is it needed?
Data visualization is essentially telling a story with your data and understanding it well enough to use it in solving a machine learning problem. It is needed most of the time because it presents complex data in a simple form. It is especially useful when someone wants to get acquainted with a dataset before exploring it further.
Introduction to matplotlib
Matplotlib is one of the best-known libraries for data visualization in Python. It is a Python-based plotting library with full support for 2D and limited support for 3D graphics. It was originally developed as an EEG/ECoG visualization tool for a GTK+ application. Matplotlib can plot data held in standard Python data structures such as lists and dictionaries, and it can also plot pandas DataFrames easily. If you are not familiar with basic data manipulation in pandas, I suggest you go through a pandas tutorial first.
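As a small illustration of plotting from plain Python data structures, here is a minimal sketch. The month names and numbers are made up for the example, and the Agg backend is selected only so that the snippet also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, chosen here as an assumption
import matplotlib.pyplot as plt

# Made-up example data: a list of labels, a list of values, and a dictionary
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [12, 18, 9, 21]
visitors = {"Jan": 120, "Feb": 150, "Mar": 90, "Apr": 200}

# One figure with two side-by-side axes
fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))

# Line chart straight from two Python lists
left.plot(months, sales, marker="o")
left.set_title("Sales (list)")

# Bar chart from a dictionary: keys become labels, values become bar heights
right.bar(list(visitors.keys()), list(visitors.values()))
right.set_title("Visitors (dict)")
```

In an interactive session you would normally drop the matplotlib.use line and finish with plt.show() to see the figure.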
Architecture of matplotlib
Diagram of its architecture
Matplotlib has three layers:
- Backend Layer
- Artist Layer
- Scripting Layer
Backend Layer: The backend layer is the bottom-most layer of the stack and has three main objects:
- FigureCanvas: the surface onto which the figure is drawn
- Renderer: does the actual drawing on the FigureCanvas, including labels, ticks, etc.
- Event: handles user input such as keyboard strokes and mouse clicks
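To make the three backend objects more concrete, here is a minimal sketch that touches each of them. It assumes the Agg (raster, non-interactive) backend; the on_click callback and the simulated mouse event are hypothetical and only there to show how event handling is wired up, since a real click would come from a GUI backend.

```python
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg, RendererAgg
from matplotlib.backend_bases import MouseEvent

# FigureCanvas: the surface the figure is drawn onto
fig = Figure(figsize=(4, 3), dpi=100)
canvas = FigureCanvasAgg(fig)
width, height = canvas.get_width_height()  # canvas size in pixels

# Renderer: the object that does the drawing on that canvas
canvas.draw()
renderer = canvas.get_renderer()

# Event: register a callback for mouse clicks (hypothetical handler)
clicks = []

def on_click(event):
    # Record the pixel position of every click we are told about
    clicks.append((event.x, event.y))

cid = canvas.mpl_connect("button_press_event", on_click)

# Agg has no window, so we simulate a click by hand for illustration
fake_click = MouseEvent("button_press_event", canvas, 50, 60, button=1)
canvas.callbacks.process("button_press_event", fake_click)

canvas.mpl_disconnect(cid)
```

In everyday plotting you rarely touch these objects directly; the two layers above them create and manage them for you.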
Artist Layer: It is the middle layer in the stack. Everything one can see on a figure, such as axes, labels, etc., is an Artist drawn by this layer. Most plotting at this level is done through an Axes instance from Matplotlib.
How to use the Artist layer (here data is assumed to be a pandas DataFrame, whose plot method returns a Matplotlib Axes):
ax = data.plot(kind='area')
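The line above goes through pandas. As a rough sketch of what working with the Artist layer directly can look like, the snippet below builds a Figure, attaches an Agg canvas, and draws a histogram on an Axes without using pyplot at all. The seed, the axis label, and the in-memory PNG buffer are choices made for this example only.

```python
import io

import numpy as np
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

# Reproducible sample from a standard normal distribution
rng = np.random.default_rng(0)
x = rng.standard_normal(10000)

# Create the top-level Figure artist and attach a canvas to it
fig = Figure()
FigureCanvasAgg(fig)

# Add an Axes artist and draw a 100-bin histogram on it
ax = fig.add_subplot(111)
counts, bin_edges, patches = ax.hist(x, 100)

# Titles and labels are artists too, set through the Axes
ax.set_title(r'Normal distribution with $\mu=0, \sigma=1$')
ax.set_xlabel('value')

# Render to PNG in memory (pass a filename instead to write a file)
buffer = io.BytesIO()
fig.savefig(buffer, format='png')
```

Compared with the pyplot version of the same histogram, this is more verbose, but every object (fig, ax, patches) is held explicitly and can be customized individually.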
Scripting Layer: It is the topmost layer, exposed through matplotlib.pyplot, and is generally used for quick exploratory data analysis. The Artist layer, by contrast, is more often used by developers who need finer control.
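import matplotlib.pyplot as plt
import numpy as np

# Draw 10,000 samples from a standard normal distribution
x = np.random.randn(10000)

# Plot a histogram with 100 bins and add a title
plt.hist(x, 100)
plt.title(r'Normal distribution with $\mu=0, \sigma=1$')
plt.show()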
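The scripting layer keeps track of a "current" figure and axes behind the scenes, which is why no figure or axes object had to be created by hand. The following sketch shows how to get hold of those underlying Artist-layer objects; the Agg backend, the sample size, and the title text are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.random.randn(1000)

# pyplot calls act on an implicitly created figure and axes
plt.hist(x, 50)
plt.title("Scripting layer")

# Retrieve the objects pyplot created for us
fig = plt.gcf()  # "get current figure"
ax = plt.gca()   # "get current axes"

# From here on, the Artist-layer API is available on the same plot
ax.set_xlabel("value")
print(ax.get_title())
```

This makes it easy to start with quick pyplot calls and switch to the Axes methods once a plot needs more detailed customization.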
So, this is the architecture of Matplotlib. In the upcoming posts we will see how to do data visualization using Matplotlib.