Data is Beautiful

A practical book on data visualisation that shows you how to create static and interactive visualisations that are engaging and beautiful.

Coronavirus Time Series Line and Bar Chart

Support this work

You can access this notebook and more by getting the e-book, Data is Beautiful.

Preamble

In [1]:
import numpy as np                   # for multi-dimensional containers 
import pandas as pd                  # for DataFrames
import plotly
import plotly.graph_objects as go    # for data visualisation
from plotly.subplots import make_subplots

Introduction

In this section, we're going to use daily confirmed cases data for COVID-19 in the UK made available at coronavirus.data.gov.uk to create a time series plot. Our goal will be to visualise the number of new cases and cumulative cases over time.

Terms of use taken from the data source

No special restrictions or limitations on using the item’s content have been provided.

Visualising the Table

The first step is to read the CSV data into a pandas.DataFrame and display the first five rows.

In [2]:
data = pd.read_csv('https://shahinrostami.com/datasets/coronavirus-cases_latest.csv')
data.head()
Out[2]:
areaType areaName areaCode date newCasesByPublishDate cumCasesByPublishDate
0 nation England E92000001 2020-07-22 519.0 255038
1 nation Northern Ireland N92000002 2020-07-22 9.0 5868
2 nation Scotland S92000003 2020-07-22 10.0 18484
3 nation Wales W92000004 2020-07-22 22.0 16987
4 nation England E92000001 2020-07-21 399.0 254519
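
The date column arrives as plain text unless pandas is told otherwise. As a minimal sketch, the rows shown in the output above can be inlined and re-read with parse_dates so that the column becomes a proper datetime type. The names csv_text and sample are illustrative and are not part of the original notebook.

```python
import io
import pandas as pd

# The five rows shown in Out[2], inlined so the sketch runs without network access.
csv_text = """areaType,areaName,areaCode,date,newCasesByPublishDate,cumCasesByPublishDate
nation,England,E92000001,2020-07-22,519.0,255038
nation,Northern Ireland,N92000002,2020-07-22,9.0,5868
nation,Scotland,S92000003,2020-07-22,10.0,18484
nation,Wales,W92000004,2020-07-22,22.0,16987
nation,England,E92000001,2020-07-21,399.0,254519
"""

# parse_dates asks pandas to convert the 'date' column to datetime64 on load.
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])

# Inspect the column types to confirm the conversion.
print(sample.dtypes)
```

The same parse_dates argument should work when reading from the URL used above, assuming the hosted file keeps the same column names.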

Let's filter this data to only include rows where the areaName column is England.

In [4]:
data = data[data['areaName']=='England']
data.head()
Out[4]:
areaType areaName areaCode date newCasesByPublishDate cumCasesByPublishDate
0 nation England E92000001 2020-07-22 519.0 255038
4 nation England E92000001 2020-07-21 399.0 254519
8 nation England E92000001 2020-07-20 535.0 254120
12 nation England E92000001 2020-07-19 672.0 253585
16 nation England E92000001 2020-07-18 796.0 252913
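
The filtered rows keep their original index labels (0, 4, 8, ...) and run from newest to oldest. One possible tidy-up, sketched here on a small inline copy of the rows shown earlier, is to take an explicit copy of the filtered frame, sort it chronologically, and reset the index. The names sample and england are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# A few rows from the table above, inlined so the sketch is self-contained.
sample = pd.DataFrame({
    'areaName': ['England', 'Northern Ireland', 'Scotland', 'Wales', 'England'],
    'date': ['2020-07-22', '2020-07-22', '2020-07-22', '2020-07-22', '2020-07-21'],
    'newCasesByPublishDate': [519.0, 9.0, 10.0, 22.0, 399.0],
    'cumCasesByPublishDate': [255038, 5868, 18484, 16987, 254519],
})

# Boolean mask keeps only the England rows; .copy() avoids modifying a view of sample.
england = sample.loc[sample['areaName'] == 'England'].copy()

# ISO-formatted date strings sort chronologically, so a plain sort is enough here.
england = england.sort_values('date').reset_index(drop=True)

# Sanity check: the day-to-day change in the cumulative total.
print(england['cumCasesByPublishDate'].diff())
```

For the two England rows here, the difference in cumulative cases (255038 - 254519 = 519) matches the new cases reported on 2020-07-22, which suggests the two columns are consistent with each other.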

This data looks ready to plot. We have our dates in a column named date, the new daily cases in a column named newCasesByPublishDate, and the daily cumulative cases in a column named cumCasesByPublishDate. The two series differ in scale by several orders of magnitude, so for this plot we'll enable a secondary y-axis and present our cumulative cases as a line and our new cases as bars.

In [6]:
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(go.Scatter(x=data['date'], y=data['cumCasesByPublishDate'],
                         mode='lines+markers',
                         name='Total Cases',
                         line_color='crimson'),
                         secondary_y=True)

fig.add_trace(go.Bar(x=data['date'], y=data['newCasesByPublishDate'],
                     name='New Cases',
                     marker_color='darkslategray'),
                     secondary_y=False)
fig.show()

It's an interactive plot, so you can hover over it to get more information.

Conclusion

In this section, we went on a rather quick journey. We loaded CSV data directly from a web resource, filtered it to a single nation, and then plotted lines and bars on the same plot using a secondary y-axis.
