by Jawad Haider
Chapter 4 - Visualization with Matplotlib
Example: Exploring Marathon Finishing Times
Here we'll use Seaborn to help visualize and understand finishing results from a marathon. I've scraped the data from sources on the Web, aggregated it, removed any identifying information, and put it on GitHub where it can be downloaded (if you are interested in using Python for web scraping, I would recommend Web Scraping with Python by Ryan Mitchell). We will start by downloading the data from the Web and loading it into Pandas:
import matplotlib.pyplot as plt
plt.style.use('classic')
%matplotlib inline
import numpy as np
import pandas as pd
'01general Matplotlib tips.ipynb'
02simple_lineplots.ipynb
'03simple scatter plots.ipynb'
'04visualizing errors.ipynb'
'05density and contour plots.ipynb'
'06Histograms Binnings and Density.ipynb'
'07customized plot legends.ipynb'
'08customizing colorbar.ipynb'
'09multiple subplots.ipynb'
'10text and annotation Example.ipynb'
'11customizing ticks.ipynb'
'12customizing matplotlib configuration and stylesheets.ipynb'
'13threedimensional plotting.ipynb'
'14_geographic data with basemap.ipynb'
'15visualiztion with seaborn.ipynb'
cos_sinplots.png
'example California cities.ipynb'
'Example Exploring Marathon Finishing times.ipynb'
'Example Handwritten Digits.ipynb'
'example surface temperature data.ipynb'
'Example Visualizing a Mobius Strip.ipynb'
gistemp250.nc.gz
marathon-data.csv
|   | age | gender | split | final |
|---|---|---|---|---|
| 0 | 33 | M | 01:05:38 | 02:08:51 |
| 1 | 32 | M | 01:06:26 | 02:09:28 |
| 2 | 31 | M | 01:06:49 | 02:10:42 |
| 3 | 38 | M | 01:06:16 | 02:13:45 |
| 4 | 31 | M | 01:06:32 | 02:13:59 |
age int64
gender object
split object
final object
dtype: object
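The listing above shows that a plain read leaves the split and final columns as strings rather than time values. The following is a minimal sketch of that loading step, not necessarily the exact cell used in the notebook. To stay self-contained it reads the five rows from the table above out of an in-memory buffer; the names `sample_csv` and `sample` are hypothetical, and with the real file you would pass 'marathon-data.csv' to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Five rows copied from the table above. The buffer stands in for the real
# file; replace it with 'marathon-data.csv' when that file is available.
sample_csv = io.StringIO(
    "age,gender,split,final\n"
    "33,M,01:05:38,02:08:51\n"
    "32,M,01:06:26,02:09:28\n"
    "31,M,01:06:49,02:10:42\n"
    "38,M,01:06:16,02:13:45\n"
    "31,M,01:06:32,02:13:59\n"
)

# With no converters, read_csv leaves the HH:MM:SS columns as text.
sample = pd.read_csv(sample_csv)

print(sample.head())
print(sample.dtypes)
```

Because the time columns arrive as text, any arithmetic on them (differences, averages, pace) needs an explicit conversion first.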
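# Let's convert the split and final columns to time deltas.
# Note: pd.datetools has been removed from pandas; calling
# pd.datetools.timedelta raises
# "AttributeError: module 'pandas' has no attribute 'datetools'",
# so the standard-library datetime.timedelta is used instead.
from datetime import timedelta

def convert_time(s):
    h, m, s = map(int, s.split(':'))
    return timedelta(hours=h, minutes=m, seconds=s)

data = pd.read_csv('marathon-data.csv',
                   converters={'split': convert_time, 'final': convert_time})
data.head()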
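Since `pd.datetools` is not available in current pandas releases, another option is to load the columns as text and convert them afterwards. The sketch below assumes the same HH:MM:SS strings shown in the table and uses `pd.to_timedelta` on each column. This is a swapped-in alternative to the converter function, not the notebook's own method, and the names `sample_csv`, `sample`, and `final_sec` are hypothetical.

```python
import io

import pandas as pd

# Five rows copied from the table above; stands in for 'marathon-data.csv'.
sample_csv = io.StringIO(
    "age,gender,split,final\n"
    "33,M,01:05:38,02:08:51\n"
    "32,M,01:06:26,02:09:28\n"
    "31,M,01:06:49,02:10:42\n"
    "38,M,01:06:16,02:13:45\n"
    "31,M,01:06:32,02:13:59\n"
)
sample = pd.read_csv(sample_csv)

# Parse the HH:MM:SS strings into timedelta values, one column at a time.
for col in ("split", "final"):
    sample[col] = pd.to_timedelta(sample[col])

# Timedelta columns expose total_seconds() through the .dt accessor,
# which gives plain numbers that are convenient for plotting.
sample["final_sec"] = sample["final"].dt.total_seconds()

print(sample.dtypes)
print(sample.head())
```

Converting after loading keeps the parsing vectorized, whereas a `converters` function is called once per cell. For a file of this size either approach should be fine.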