in Data Science by (17.6k points)

I have this kind of data:

ID     A1  A2     A3     A4     A5     A6    A7  A8     A9  A10
1   -0.18   8  -0.30  -0.26   0.53  -0.16  0.20   2  -0.20    2
2   -0.58   8  -0.89  -1.66  -0.91  -0.35  0.78   3   3.03    6
3   -0.62  -7  -0.67  -0.38   0.26   0.28  0.94   4   1.49    8
4   -0.22  -3   1.64  -1.38   0.54   0.57  1.64   5  -0.34    9
5    0.00   5   1.32  -1.16   0.78   0.68  0.72   5  -0.51    0

What's the best method for visualizing this data? I'm using matplotlib to visualize it, and I read it from a CSV file using pandas.

1 Answer

by (41.4k points)

Matplotlib

Matplotlib is a multiplatform data visualization library built on NumPy arrays and designed to work with the broader SciPy stack.

Matplotlib is particularly good for creating basic graphs such as line charts, bar charts, histograms and many more.

One of Matplotlib’s most important features is its ability to play well with many operating systems and graphics backends. 

Matplotlib supports dozens of backends and output types, which means you can count on it to work regardless of which operating system you are using or which output format you want.

This cross-platform, everything-to-everyone approach has been one of the great strengths of Matplotlib.
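As a rough illustration of the basic charts mentioned above, here is a minimal sketch that draws one line per column for the five sample rows from the question. The output file name and the choice of a line chart are assumptions made for the example, not something the question specifies.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# The five sample rows from the question (columns A1..A10)
ids = np.array([1, 2, 3, 4, 5])
values = np.array([
    [-0.18,  8, -0.30, -0.26,  0.53, -0.16, 0.20, 2, -0.20, 2],
    [-0.58,  8, -0.89, -1.66, -0.91, -0.35, 0.78, 3,  3.03, 6],
    [-0.62, -7, -0.67, -0.38,  0.26,  0.28, 0.94, 4,  1.49, 8],
    [-0.22, -3,  1.64, -1.38,  0.54,  0.57, 1.64, 5, -0.34, 9],
    [ 0.00,  5,  1.32, -1.16,  0.78,  0.68, 0.72, 5, -0.51, 0],
])
labels = [f"A{i}" for i in range(1, 11)]

fig, ax = plt.subplots(figsize=(8, 4))
# values.T iterates over columns, so each line shows one variable across the IDs
for column, label in zip(values.T, labels):
    ax.plot(ids, column, marker="o", label=label)
ax.set_xlabel("ID")
ax.set_ylabel("value")
ax.legend(ncol=5, fontsize="small")
fig.savefig("columns_by_id.png")  # hypothetical output file name
```

For an interactive window, drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving.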

Pandas Visualization

Pandas is an open-source, high-performance, easy-to-use library that provides data structures such as DataFrames, along with data analysis tools, including the plotting methods relevant to this question.

Pandas visualization makes it easy to create plots directly from a pandas DataFrame or Series.

It also has a higher-level API than Matplotlib, so the same result usually takes less code.
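To show what the higher-level API looks like in practice, here is a small sketch that draws a box plot of every column with a single `DataFrame.plot` call. It uses seeded random numbers as stand-in data, and the file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import numpy as np
import pandas as pd

# Stand-in data: 100 random rows with the same ten column names as the question
rng = np.random.default_rng(0)
columns = [f"A{i}" for i in range(1, 11)]
df = pd.DataFrame(rng.standard_normal((100, 10)), columns=columns)

# One call draws a box plot per column on a shared axis and returns the Matplotlib Axes
ax = df.plot(kind="box", figsize=(8, 4), title="Distribution of each column")
ax.set_ylabel("value")
ax.figure.savefig("pandas_boxplot.png")  # hypothetical output file name
```

Because `df.plot` returns a regular Matplotlib Axes, any further tweaking (labels, limits, saving) uses the usual Matplotlib calls.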

Seaborn

Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for creating attractive graphs.

Seaborn has a lot to offer. You can create graphs in one line that would take tens of lines in Matplotlib.

Its default styles look good, and it also has a nice interface for working with pandas DataFrames.
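As one possible example of that one-line style, the sketch below draws a correlation heatmap of all ten columns with `seaborn.heatmap`. The random stand-in data, the color map and the file name are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data: 100 random rows with the same ten column names as the question
rng = np.random.default_rng(0)
columns = [f"A{i}" for i in range(1, 11)]
df = pd.DataFrame(rng.standard_normal((100, 10)), columns=columns)

# Pairwise Pearson correlations between the columns, drawn as an annotated heatmap
corr = df.corr()
ax = sns.heatmap(corr, vmin=-1, vmax=1, cmap="coolwarm", annot=True, fmt=".1f")
ax.figure.savefig("seaborn_corr_heatmap.png")  # hypothetical output file name
```

A heatmap like this gives a compact overview of which of the ten variables move together, which is often a useful first look before plotting individual pairs.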

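The code below shows how to draw a scatter matrix of all ten columns with pandas.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# scatter_matrix lives in pandas.plotting; the old pandas.tools.plotting module no longer exists
from pandas.plotting import scatter_matrix

# Random stand-in data with the same ten column names as in the question
data = pd.DataFrame(np.random.randn(100, 10), columns=['A1', 'A2', 'A3','A4','A5','A6','A7','A8','A9','A10'])

# Plotting using pandas (diagonal='kde' needs SciPy installed)
scatter_matrix(data, alpha=0.4, figsize=(5, 5), diagonal='kde')
plt.show()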
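The snippet above works on random numbers. Since the question reads its data from a CSV file, here is a hedged sketch of the same scatter matrix applied to the five sample rows. It assumes a comma-separated file with a header row; the text is embedded in a string so the example is self-contained, and the output file name is made up.

```python
import io
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import pandas as pd
from pandas.plotting import scatter_matrix  # import location in current pandas versions

# Stand-in for the CSV file; assumes comma-separated values with a header row
csv_text = """ID,A1,A2,A3,A4,A5,A6,A7,A8,A9,A10
1,-0.18,8,-0.30,-0.26,0.53,-0.16,0.20,2,-0.20,2
2,-0.58,8,-0.89,-1.66,-0.91,-0.35,0.78,3,3.03,6
3,-0.62,-7,-0.67,-0.38,0.26,0.28,0.94,4,1.49,8
4,-0.22,-3,1.64,-1.38,0.54,0.57,1.64,5,-0.34,9
5,0.00,5,1.32,-1.16,0.78,0.68,0.72,5,-0.51,0
"""

# With a real file, pass its path to read_csv instead of the StringIO object
data = pd.read_csv(io.StringIO(csv_text), index_col="ID")

# diagonal="hist" avoids the SciPy dependency that diagonal="kde" has
axes = scatter_matrix(data, alpha=0.8, figsize=(10, 10), diagonal="hist")
axes[0, 0].figure.savefig("scatter_matrix.png")  # hypothetical output file name
```

If the file is whitespace-separated, as the layout in the question suggests it might be, passing `sep=r"\s+"` to `pd.read_csv` should handle it.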

You can also make the plot change over time, for example by drawing a different "dimension" of the DataFrame at each step. The example below redraws a curve on every iteration; adjust it to your use case.

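import matplotlib.pyplot as plt
import numpy as np

figure = plt.figure()
ax = figure.add_subplot(111)
x = np.arange(-3, 3, 0.01)

for n in range(15):
    # np.sinc(t) computes sin(pi*t) / (pi*t) and is well defined at t = 0,
    # so n = 0 no longer divides by zero
    y = np.sinc(x * n)
    ax.cla()  # clear the previous curve; plt.hold() was removed in Matplotlib 3.0
    ax.grid(True)
    ax.plot(x, y)
    plt.draw()
    plt.pause(0.7)  # show the current frame for 0.7 seconds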
