Introduction
Data visualization is one of the main steps toward understanding a dataset. General resources on data visualization (beyond Python) include:
- A visualization guide from data.europa.eu: The official portal for European data
- Data stories that can help provide new ideas for your own work: Maarten Lambrechts’s website
- How to choose your chart, by Andrew V. Abela
- A blog post by Felipe Curty investigating the Python data visualization landscape
A major distinction between visualization solutions is whether they support interactive inspection; those that do not are said to be static.
Interactive tools for data visualization are emerging in Python, with plotly, altair, Bokeh, etc. An extensive study by Aaron Geller lists the pros and cons of each.
Python
The list of Python packages for data visualization is long (and growing). We provide some examples in the pandas section of the website, and also in the SciPy course.
Generic tools
matplotlib: Visualization with Python
Source: https://matplotlib.org/.
This is the de facto standard library for plotting in Python. The documentation is well written, and matplotlib should be the default choice for creating figures for static documents (e.g., .pdf or .doc files).
Usual loading command:
import matplotlib.pyplot as plt
Example:
import numpy as np
import matplotlib.pyplot as plt
t = np.linspace(0, 2 * np.pi, 1024)
ft1 = np.sin(2 * np.pi * t)
ft2 = np.cos(2 * np.pi * t)
fig, ax = plt.subplots()
ax.plot(t, ft1, label='sin')
ax.plot(t, ft2, label='cos')
ax.legend(loc='lower right');
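Since matplotlib is recommended above for static documents, here is a minimal sketch of how a figure like this one could be written to a .pdf file with `fig.savefig`. The file name `sin_cos.pdf` and the temporary output directory are illustrative assumptions, and the `Agg` backend is selected only so that the snippet runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Same curves as in the example above
t = np.linspace(0, 2 * np.pi, 1024)
fig, ax = plt.subplots()
ax.plot(t, np.sin(2 * np.pi * t), label="sin")
ax.plot(t, np.cos(2 * np.pi * t), label="cos")
ax.legend(loc="lower right")

# Hypothetical output location: a file in a fresh temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "sin_cos.pdf")

# The output format is inferred from the file extension
fig.savefig(out_path, bbox_inches="tight")
plt.close(fig)
```

The resulting vector file can then be included in a report; replacing the extension with `.png` or `.svg` would produce those formats instead.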
seaborn: statistical data visualization
Source: https://seaborn.pydata.org/.
seaborn is built on top of matplotlib and is specifically tailored for statistical data visualization (matplotlib is a more flexible and general tool). Its default settings are usually nicer than those of matplotlib, especially for standard plots (histograms, KDEs, swarm plots, etc.).
Usual loading command:
import seaborn as sns
Example:
import seaborn as sns
import pandas as pd
df = pd.DataFrame(dict(sin=ft1, cos=ft2))
sns.set_style("whitegrid")
ax = sns.lineplot(data=df)
sns.move_legend(ax, "lower right")
sns.despine()
plotly: a graphing library for Python
Source: https://plotly.com/python/.
The strength of plotly is that it is interactive and that it is also available for R and Julia, in addition to Python (it relies on JavaScript under the hood).
Usual loading command:
import plotly
Alternatively, you can use plotly.express to create predefined figures:
import plotly.express as px
Example:
import plotly.express as px
fig = px.line(df)
fig.show()
In plotly, the figure is interactive. If you click on an entry of the legend on the right, you can show or hide the corresponding curve.
You can also create a slider to change a parameter, for instance to display the functions
\begin{align*}
f_w &: t \to \sin(2 \cdot \pi \cdot w \cdot t)\\
g_w &: t \to \cos(2 \cdot \pi \cdot w \cdot t)
\end{align*}
for $w \in [-5, 5]$:
# inspiration from:
# https://community.plotly.com/t/multiple-traces-with-a-single-slider-in-plotly/16356
import plotly.graph_objects as go
import numpy as np
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode()
num_steps = 101
slider_range = np.linspace(-5, 5, num=num_steps)
trace_list1 = []
trace_list2 = []
# One (hidden) sin trace and one (hidden) cos trace per value of w
for i, w in enumerate(slider_range):
    trace_list1.append(go.Scatter(y=np.sin(2*np.pi*t*w), visible=False, line={'color': 'red'}, name="sin(w * 2 *pi)"))
    trace_list2.append(go.Scatter(y=np.cos(2*np.pi*t*w), visible=False, line={'color': 'blue'}, name="cos(w * 2 *pi)"))
fig = go.Figure(data=trace_list1+trace_list2)
# Initialize display: show the pair of traces matching the initial slider position
active = 51
fig.data[active].visible = True
fig.data[active + num_steps].visible = True
steps = []
for i, w in enumerate(slider_range):
    # Hide all traces
    step = dict(
        method='restyle',
        args=['visible', [False] * len(fig.data)],
        label=f"{w:.2f}"
    )
    # Enable the two traces we want to see
    step['args'][1][i] = True
    step['args'][1][i+num_steps] = True
    # Add step to steps list
    steps.append(step)
sliders = [dict(
    active=active,
    currentvalue={"prefix": "w = "},
    steps=steps,
)]
fig.layout.sliders = sliders
iplot(fig, show_link=False)
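The slider example works by precomputing one sin trace and one cos trace per value of w, then letting each slider step apply a visibility mask over all traces. The sketch below isolates that bookkeeping with numpy and plain Python, without plotly; the name `masks` is hypothetical and introduced only for illustration.

```python
import numpy as np

num_steps = 101
slider_range = np.linspace(-5, 5, num=num_steps)

# Traces are assumed to be ordered as: all sin traces first, then all cos traces
n_traces = 2 * num_steps

masks = []
for i in range(num_steps):
    mask = [False] * n_traces   # hide every trace
    mask[i] = True              # show the sin trace for slider_range[i]
    mask[i + num_steps] = True  # show the matching cos trace
    masks.append(mask)
```

Each entry of `masks` plays the role of the list passed as `args[1]` in a slider step: exactly two traces are visible at any slider position, and the step index maps directly to a value of w in `slider_range`.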