Graphics and Visualization in Python¶
Python has a bewildering number of graphics libraries, catering to different needs. If you want a better understanding of the Python visualization landscape, see the following series of blog posts:
However for simple statistical plots, the matplotlib
and seaborn
libraries suffice most of the time:
As you have seen, pandas
also has useful plotting functionality.
In [1]:
%matplotlib inline
import numpy as np
import numpy.random as rng
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
import matplotlib as mpl
import seaborn as sns
In [2]:
import warnings
warnings.filterwarnings("ignore")
Matplotlib¶
Matplotlib has a “functional” interface similar to Matlab via the
pyplot
module for simple interactive use, as well as an
object-oriented interface that is useful for more complex graphic
creations.
Types of plots¶
In [ ]:
In [3]:
plt.hist(np.random.randn(1000), bins=np.linspace(-4,4,11))
pass

In [4]:
xs = [np.random.normal(mu, 0.5, (100)) for mu in range(5)]
In [5]:
for x in xs:
plt.hist(x, bins=15, alpha=0.4)

In [6]:
plt.boxplot(np.random.random((6,10)))
pass

In [7]:
plt.scatter(*np.random.uniform(0.1, 0.9, (2,100)),
s=np.random.randint(10, 200, 100),
c=np.random.random(100))
pass

In [8]:
x = y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
Z = X**2 + Y**2
plt.contourf(X, Y, Z, cmap=plt.cm.RdPu)
plt.axis('square')
pass

In [9]:
plt.stem(np.random.random(8))
plt.margins(0.05)
pass

In [10]:
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x)
In [11]:
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])
pass

In [12]:
xs = np.c_[np.zeros(10), np.random.choice([-1,1], (10, 100)).cumsum(axis=1)]
plt.plot(xs.T)
plt.title('10 random walks', fontsize=14)
pass

Displaying arrays¶
In [13]:
x = np.random.random((80, 80, 3))
In [14]:
plt.imshow(x)
pass

In [15]:
plt.imshow(x, interpolation='bicubic')
pass

In [16]:
plt.imshow(x.mean(axis=-1), cmap='bone')
pass

In [17]:
plt.imshow(x.mean(axis=-1), cmap='Reds')
plt.xticks(range(0, x.shape[1], 4))
plt.yticks(range(0, x.shape[0], 4))
plt.grid(color='white')
ax = plt.gca()
ax.set_xticklabels([])
ax.set_yticklabels([])
ax.xaxis.set_ticks_position('none')
ax.yaxis.set_ticks_position('none')
pass

Colors¶
In [18]:
plt.scatter(*np.random.uniform(0.1, 0.9, (2,100)),
s=np.random.randint(10, 200, 100),
c=np.random.random(100))
pass

In [19]:
plt.scatter(*np.random.uniform(0.1, 0.9, (2,100)),
s=np.random.randint(10, 200, 100),
c=np.random.random(100), cmap='summer')
pass

In [20]:
plt.scatter(*np.random.uniform(0.1, 0.9, (2,100)),
s=np.random.randint(10, 200, 100),
c=np.random.random(100), cmap='hsv')
pass

Getting a list of colors from a colormap¶
Giving an argument of 0.0 < x < 1.0 to a colormap
gives the
appropriate interpolated color.
In [21]:
# find the bottom, middle and top colors of the winter colormap
colors = plt.cm.winter(np.linspace(0, 1, 100))
colors[:3]
Out[21]:
array([[0. , 0. , 1. , 1. ],
[0. , 0.00784314, 0.99607843, 1. ],
[0. , 0.01960784, 0.99019608, 1. ]])
In [22]:
plt.scatter(*np.random.uniform(0.1, 0.9, (2,100)),
s=np.random.randint(10, 200, 100),
c=colors)
pass

Styles¶
In [23]:
plt.style.available
Out[23]:
['seaborn-dark',
'seaborn-darkgrid',
'seaborn-ticks',
'fivethirtyeight',
'seaborn-whitegrid',
'classic',
'_classic_test',
'fast',
'seaborn-talk',
'seaborn-dark-palette',
'seaborn-bright',
'seaborn-pastel',
'grayscale',
'seaborn-notebook',
'ggplot',
'seaborn-colorblind',
'seaborn-muted',
'seaborn',
'Solarize_Light2',
'seaborn-paper',
'bmh',
'tableau-colorblind10',
'seaborn-white',
'dark_background',
'seaborn-poster',
'seaborn-deep']
In [24]:
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x)
In [25]:
with plt.style.context('classic'):
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])

In [26]:
with plt.style.context('fivethirtyeight'):
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])

In [27]:
with plt.style.context('ggplot'):
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])

In [28]:
with plt.style.context('seaborn-darkgrid'):
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])

In [29]:
with plt.xkcd():
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])

Creating your own style¶
Many, many options can be configured.
In [30]:
plt.rcParams
Out[30]:
RcParams({'_internal.classic_mode': False,
'agg.path.chunksize': 0,
'animation.avconv_args': [],
'animation.avconv_path': 'avconv',
'animation.bitrate': -1,
'animation.codec': 'h264',
'animation.convert_args': [],
'animation.convert_path': 'convert',
'animation.embed_limit': 20.0,
'animation.ffmpeg_args': [],
'animation.ffmpeg_path': 'ffmpeg',
'animation.frame_format': 'png',
'animation.html': 'none',
'animation.html_args': [],
'animation.writer': 'ffmpeg',
'axes.autolimit_mode': 'data',
'axes.axisbelow': 'line',
'axes.edgecolor': 'black',
'axes.facecolor': 'white',
'axes.formatter.limits': [-7, 7],
'axes.formatter.min_exponent': 0,
'axes.formatter.offset_threshold': 4,
'axes.formatter.use_locale': False,
'axes.formatter.use_mathtext': False,
'axes.formatter.useoffset': True,
'axes.grid': False,
'axes.grid.axis': 'both',
'axes.grid.which': 'major',
'axes.labelcolor': 'black',
'axes.labelpad': 4.0,
'axes.labelsize': 'medium',
'axes.labelweight': 'normal',
'axes.linewidth': 0.8,
'axes.prop_cycle': cycler('color', ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']),
'axes.spines.bottom': True,
'axes.spines.left': True,
'axes.spines.right': True,
'axes.spines.top': True,
'axes.titlepad': 6.0,
'axes.titlesize': 'large',
'axes.titleweight': 'normal',
'axes.unicode_minus': True,
'axes.xmargin': 0.05,
'axes.ymargin': 0.05,
'axes3d.grid': True,
'backend': 'module://ipykernel.pylab.backend_inline',
'backend.qt4': None,
'backend.qt5': None,
'backend_fallback': True,
'boxplot.bootstrap': None,
'boxplot.boxprops.color': 'black',
'boxplot.boxprops.linestyle': '-',
'boxplot.boxprops.linewidth': 1.0,
'boxplot.capprops.color': 'black',
'boxplot.capprops.linestyle': '-',
'boxplot.capprops.linewidth': 1.0,
'boxplot.flierprops.color': 'black',
'boxplot.flierprops.linestyle': 'none',
'boxplot.flierprops.linewidth': 1.0,
'boxplot.flierprops.marker': 'o',
'boxplot.flierprops.markeredgecolor': 'black',
'boxplot.flierprops.markerfacecolor': 'none',
'boxplot.flierprops.markersize': 6.0,
'boxplot.meanline': False,
'boxplot.meanprops.color': 'C2',
'boxplot.meanprops.linestyle': '--',
'boxplot.meanprops.linewidth': 1.0,
'boxplot.meanprops.marker': '^',
'boxplot.meanprops.markeredgecolor': 'C2',
'boxplot.meanprops.markerfacecolor': 'C2',
'boxplot.meanprops.markersize': 6.0,
'boxplot.medianprops.color': 'C1',
'boxplot.medianprops.linestyle': '-',
'boxplot.medianprops.linewidth': 1.0,
'boxplot.notch': False,
'boxplot.patchartist': False,
'boxplot.showbox': True,
'boxplot.showcaps': True,
'boxplot.showfliers': True,
'boxplot.showmeans': False,
'boxplot.vertical': True,
'boxplot.whiskerprops.color': 'black',
'boxplot.whiskerprops.linestyle': '-',
'boxplot.whiskerprops.linewidth': 1.0,
'boxplot.whiskers': 1.5,
'contour.corner_mask': True,
'contour.negative_linestyle': 'dashed',
'datapath': '/usr/local/lib/python3.6/site-packages/matplotlib/mpl-data',
'date.autoformatter.day': '%Y-%m-%d',
'date.autoformatter.hour': '%m-%d %H',
'date.autoformatter.microsecond': '%M:%S.%f',
'date.autoformatter.minute': '%d %H:%M',
'date.autoformatter.month': '%Y-%m',
'date.autoformatter.second': '%H:%M:%S',
'date.autoformatter.year': '%Y',
'docstring.hardcopy': False,
'errorbar.capsize': 0.0,
'examples.directory': '',
'figure.autolayout': False,
'figure.constrained_layout.h_pad': 0.04167,
'figure.constrained_layout.hspace': 0.02,
'figure.constrained_layout.use': False,
'figure.constrained_layout.w_pad': 0.04167,
'figure.constrained_layout.wspace': 0.02,
'figure.dpi': 72.0,
'figure.edgecolor': (1, 1, 1, 0),
'figure.facecolor': (1, 1, 1, 0),
'figure.figsize': [6.0, 4.0],
'figure.frameon': True,
'figure.max_open_warning': 20,
'figure.subplot.bottom': 0.125,
'figure.subplot.hspace': 0.2,
'figure.subplot.left': 0.125,
'figure.subplot.right': 0.9,
'figure.subplot.top': 0.88,
'figure.subplot.wspace': 0.2,
'figure.titlesize': 'large',
'figure.titleweight': 'normal',
'font.cursive': ['Apple Chancery',
'Textile',
'Zapf Chancery',
'Sand',
'Script MT',
'Felipa',
'cursive'],
'font.family': ['sans-serif'],
'font.fantasy': ['Comic Sans MS',
'Chicago',
'Charcoal',
'Impact',
'Western',
'Humor Sans',
'xkcd',
'fantasy'],
'font.monospace': ['DejaVu Sans Mono',
'Bitstream Vera Sans Mono',
'Computer Modern Typewriter',
'Andale Mono',
'Nimbus Mono L',
'Courier New',
'Courier',
'Fixed',
'Terminal',
'monospace'],
'font.sans-serif': ['DejaVu Sans',
'Bitstream Vera Sans',
'Computer Modern Sans Serif',
'Lucida Grande',
'Verdana',
'Geneva',
'Lucid',
'Arial',
'Helvetica',
'Avant Garde',
'sans-serif'],
'font.serif': ['DejaVu Serif',
'Bitstream Vera Serif',
'Computer Modern Roman',
'New Century Schoolbook',
'Century Schoolbook L',
'Utopia',
'ITC Bookman',
'Bookman',
'Nimbus Roman No9 L',
'Times New Roman',
'Times',
'Palatino',
'Charter',
'serif'],
'font.size': 10.0,
'font.stretch': 'normal',
'font.style': 'normal',
'font.variant': 'normal',
'font.weight': 'normal',
'grid.alpha': 1.0,
'grid.color': '#b0b0b0',
'grid.linestyle': '-',
'grid.linewidth': 0.8,
'hatch.color': 'black',
'hatch.linewidth': 1.0,
'hist.bins': 10,
'image.aspect': 'equal',
'image.cmap': 'viridis',
'image.composite_image': True,
'image.interpolation': 'nearest',
'image.lut': 256,
'image.origin': 'upper',
'image.resample': True,
'interactive': True,
'keymap.all_axes': ['a'],
'keymap.back': ['left', 'c', 'backspace'],
'keymap.copy': ['ctrl+c', 'cmd+c'],
'keymap.forward': ['right', 'v'],
'keymap.fullscreen': ['f', 'ctrl+f'],
'keymap.grid': ['g'],
'keymap.grid_minor': ['G'],
'keymap.help': ['f1'],
'keymap.home': ['h', 'r', 'home'],
'keymap.pan': ['p'],
'keymap.quit': ['ctrl+w', 'cmd+w', 'q'],
'keymap.quit_all': ['W', 'cmd+W', 'Q'],
'keymap.save': ['s', 'ctrl+s'],
'keymap.xscale': ['k', 'L'],
'keymap.yscale': ['l'],
'keymap.zoom': ['o'],
'legend.borderaxespad': 0.5,
'legend.borderpad': 0.4,
'legend.columnspacing': 2.0,
'legend.edgecolor': '0.8',
'legend.facecolor': 'inherit',
'legend.fancybox': True,
'legend.fontsize': 'medium',
'legend.framealpha': 0.8,
'legend.frameon': True,
'legend.handleheight': 0.7,
'legend.handlelength': 2.0,
'legend.handletextpad': 0.8,
'legend.labelspacing': 0.5,
'legend.loc': 'best',
'legend.markerscale': 1.0,
'legend.numpoints': 1,
'legend.scatterpoints': 1,
'legend.shadow': False,
'legend.title_fontsize': None,
'lines.antialiased': True,
'lines.color': 'C0',
'lines.dash_capstyle': 'butt',
'lines.dash_joinstyle': 'round',
'lines.dashdot_pattern': [6.4, 1.6, 1.0, 1.6],
'lines.dashed_pattern': [3.7, 1.6],
'lines.dotted_pattern': [1.0, 1.65],
'lines.linestyle': '-',
'lines.linewidth': 1.5,
'lines.marker': 'None',
'lines.markeredgecolor': 'auto',
'lines.markeredgewidth': 1.0,
'lines.markerfacecolor': 'auto',
'lines.markersize': 6.0,
'lines.scale_dashes': True,
'lines.solid_capstyle': 'projecting',
'lines.solid_joinstyle': 'round',
'markers.fillstyle': 'full',
'mathtext.bf': 'sans:bold',
'mathtext.cal': 'cursive',
'mathtext.default': 'it',
'mathtext.fallback_to_cm': True,
'mathtext.fontset': 'dejavusans',
'mathtext.it': 'sans:italic',
'mathtext.rm': 'sans',
'mathtext.sf': 'sans',
'mathtext.tt': 'monospace',
'patch.antialiased': True,
'patch.edgecolor': 'black',
'patch.facecolor': 'C0',
'patch.force_edgecolor': False,
'patch.linewidth': 1.0,
'path.effects': [],
'path.simplify': True,
'path.simplify_threshold': 0.1111111111111111,
'path.sketch': None,
'path.snap': True,
'pdf.compression': 6,
'pdf.fonttype': 3,
'pdf.inheritcolor': False,
'pdf.use14corefonts': False,
'pgf.preamble': [],
'pgf.rcfonts': True,
'pgf.texsystem': 'xelatex',
'polaraxes.grid': True,
'ps.distiller.res': 6000,
'ps.fonttype': 3,
'ps.papersize': 'letter',
'ps.useafm': False,
'ps.usedistiller': False,
'savefig.bbox': None,
'savefig.directory': '~',
'savefig.dpi': 'figure',
'savefig.edgecolor': 'white',
'savefig.facecolor': 'white',
'savefig.format': 'png',
'savefig.frameon': True,
'savefig.jpeg_quality': 95,
'savefig.orientation': 'portrait',
'savefig.pad_inches': 0.1,
'savefig.transparent': False,
'scatter.marker': 'o',
'svg.fonttype': 'path',
'svg.hashsalt': None,
'svg.image_inline': True,
'text.antialiased': True,
'text.color': 'black',
'text.hinting': 'auto',
'text.hinting_factor': 8,
'text.latex.preamble': [],
'text.latex.preview': False,
'text.latex.unicode': True,
'text.usetex': False,
'timezone': 'UTC',
'tk.window_focus': False,
'toolbar': 'toolbar2',
'verbose.fileo': 'sys.stdout',
'verbose.level': 'silent',
'webagg.address': '127.0.0.1',
'webagg.open_in_browser': True,
'webagg.port': 8988,
'webagg.port_retries': 50,
'xtick.alignment': 'center',
'xtick.bottom': True,
'xtick.color': 'black',
'xtick.direction': 'out',
'xtick.labelbottom': True,
'xtick.labelsize': 'medium',
'xtick.labeltop': False,
'xtick.major.bottom': True,
'xtick.major.pad': 3.5,
'xtick.major.size': 3.5,
'xtick.major.top': True,
'xtick.major.width': 0.8,
'xtick.minor.bottom': True,
'xtick.minor.pad': 3.4,
'xtick.minor.size': 2.0,
'xtick.minor.top': True,
'xtick.minor.visible': False,
'xtick.minor.width': 0.6,
'xtick.top': False,
'ytick.alignment': 'center_baseline',
'ytick.color': 'black',
'ytick.direction': 'out',
'ytick.labelleft': True,
'ytick.labelright': False,
'ytick.labelsize': 'medium',
'ytick.left': True,
'ytick.major.left': True,
'ytick.major.pad': 3.5,
'ytick.major.right': True,
'ytick.major.size': 3.5,
'ytick.major.width': 0.8,
'ytick.minor.left': True,
'ytick.minor.pad': 3.4,
'ytick.minor.right': True,
'ytick.minor.size': 2.0,
'ytick.minor.visible': False,
'ytick.minor.width': 0.6,
'ytick.right': False})
In [31]:
%%file foo.mplstyle
axes.grid: True
axes.titlesize : 24
axes.labelsize : 20
lines.linewidth : 3
lines.markersize : 10
xtick.labelsize : 16
ytick.labelsize : 16
Overwriting foo.mplstyle
In [32]:
with plt.style.context('foo.mplstyle'):
plt.plot(x, y)
plt.axis([0, 2*np.pi, -1.05, 1.05,])

Customizing plots¶
In [33]:
plt.rcParams.update({'font.size': 22})
fig = plt.figure(figsize=(8,6))
ax = plt.subplot(1,1,1)
plt.plot(x, y, color='red', linewidth=2, linestyle='dashed', label='sine curve')
plt.plot(x, np.cos(x), 'b-', label='cosine curve')
plt.legend(loc='best', fontsize=14)
plt.axis([0, 2*np.pi, -1.05, 1.05,])
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.xticks([0,0.5*np.pi,np.pi,1.5*np.pi,2*np.pi],
[0, r'$\frac{\pi}{2}$', r'$\pi$', r'$\frac{3\pi}{2}$', r'$2\pi$'])
plt.title('Sine and Cosine Plots')
plt.text(0.45, 0.9, 'Empty space', transform=ax.transAxes, ha='left', va='top')
pass

In [34]:
x = np.random.randn(100)
plt.hist(x, bins=25, histtype='step', normed=True)
mu, sigma = stats.norm.fit(x)
xp = np.linspace(*plt.xlim(), 100)
plt.plot(xp, stats.norm(mu, sigma).pdf(xp))
plt.xlabel('x')
plt.ylabel('Density')
plt.title('MLE fit for normal distribution', fontsize=14)
pass

Layouts¶
In [35]:
fig, axes = plt.subplots(2,2,figsize=(8,8))
axes[0,0].plot(x,y, 'r')
axes[0,1].plot(x,y, 'g')
axes[1,0].plot(x,y, 'b')
axes[1,1].plot(x,y, 'k')
for ax in axes.ravel():
ax.margins(0.05)
pass

In [36]:
ax1 = plt.subplot2grid((3,3), (0,0), colspan=3)
ax2 = plt.subplot2grid((3,3), (1,0), colspan=2)
ax3 = plt.subplot2grid((3,3), (1,2), rowspan=2)
ax4 = plt.subplot2grid((3,3), (2,0), colspan=2)
axes = [ax1, ax2, ax3, ax4]
colors = ['r', 'g', 'b', 'k']
for ax, c in zip(axes, colors):
ax.plot(x, y, c)
ax.margins(0.05)
plt.tight_layout()

Seaborn¶
In [37]:
sns.set_context("notebook", font_scale=1.5, rc={"lines.linewidth": 2.5})
Density plots¶
In [38]:
xs = rng.normal(0,1,100)
fig, axes = plt.subplots(1, 2, figsize=(8,4))
sns.distplot(xs, hist=False, rug=True, ax=axes[0]);
sns.distplot(xs, hist=True, ax=axes[1])
pass

Kernel density estimate¶
In [39]:
sns.kdeplot(np.r_[rng.normal(0,1,50), rng.normal(4,0.8,100)])
pass

In [40]:
iris = sns.load_dataset('iris')
In [41]:
iris.head()
Out[41]:
sepal_length | sepal_width | petal_length | petal_width | species | |
---|---|---|---|---|---|
0 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
1 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
2 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
3 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
4 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
Joint distribution plot¶
In [42]:
sns.jointplot(x='petal_length', y='petal_width', data=iris, kind='kdeplot')
pass

Box and violin plots¶
In [43]:
fig, axes = plt.subplots(1, 2, figsize=(8,4))
sns.boxplot(x='species', y='petal_length', data=iris, ax=axes[0])
sns.violinplot(x='species', y='petal_length', data=iris, ax=axes[1])
pass

Composite plots¶
In [44]:
url = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/titanic.csv'
titanic = pd.read_csv(url)
In [45]:
titanic.head()
Out[45]:
survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.2500 | S | Third | man | True | NaN | Southampton | no | False |
1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.2833 | C | First | woman | False | C | Cherbourg | yes | False |
2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.9250 | S | Third | woman | False | NaN | Southampton | yes | True |
3 | 1 | 1 | female | 35.0 | 1 | 0 | 53.1000 | S | First | woman | False | C | Southampton | yes | False |
4 | 0 | 3 | male | 35.0 | 0 | 0 | 8.0500 | S | Third | man | True | NaN | Southampton | no | True |
In [46]:
sns.lmplot(x='fare', y='survived', col='alone', row='sex', data=titanic, logistic=True)
pass

In [47]:
g = sns.PairGrid(titanic,
y_vars=['fare', 'age'],
x_vars=['sex', 'class', 'embark_town' ],
aspect=1, size=5.5)
g.map(sns.stripplot, jitter=True, palette="bright")
pass

Seaborn styles¶
In [48]:
sns.set_style('ticks')
In [49]:
fig, axes = plt.subplots(1, 2, figsize=(8,4))
sns.distplot(xs, hist=False, rug=True, ax=axes[0]);
sns.distplot(xs, hist=True, ax=axes[1])
pass

In [50]:
sns.set_style('dark')
In [51]:
xs = rng.normal(0,1,100)
fig, axes = plt.subplots(1, 2, figsize=(8,4))
sns.distplot(xs, hist=False, rug=True, ax=axes[0]);
sns.distplot(xs, hist=True, ax=axes[1])
pass

In [52]:
sns.set_style('darkgrid')
In [53]:
fig, axes = plt.subplots(1, 2, figsize=(8,4))
sns.distplot(xs, hist=False, rug=True, ax=axes[0]);
sns.distplot(xs, hist=True, ax=axes[1])
pass

In [ ]: