Pokemon Types with Plotapi Chord

Preamble

In [1]:
from plotapi import Chord
import json

Chord.set_license("your username", "your license key")

Introduction

In this notebook we're going to use Plotapi Chord to visualise the co-occurrences between Pokémon types. We'll use Python, but Plotapi can be used from any programming language.

In a chord diagram (or radial network), entities are arranged radially as segments with their relationships visualised by ribbons that connect them. The size of the segments illustrates the numerical proportions, whilst the size of the ribbons illustrates the significance of the relationships. Chord diagrams are useful when trying to convey relationships between different entities, and they can be beautiful and eye-catching.

Dataset

We're going to use Pokémon (Gen 1-8) data, a fork of which is available in this repository. Let's load the data.

In [2]:
with open("pokemon_types.json", "r") as f:
    data = json.load(f)
    
names = ["Bug", "Dark", "Dragon", "Electric", "Fairy", "Fighting", "Fire", "Flying", "Ghost", "Grass", "Ground", "Ice", "Normal", "Poison", "Psychic", "Rock", "Steel", "Water"]

Visualisation

Let's use Plotapi Chord for this visualisation. You can see more examples in the Gallery.

We're going to adjust some layout and template parameters, and flip the intro animation on too.

Because we're using a data-table, we can also click on any part of the diagram to "lock" the selection.

In [5]:
Chord(
    data["matrix"],
    names,
    colors="monsters",
    details_thumbs=data["details_thumbs"],
    margin=30,
    noun="Pokemon!",
    thumbs_width=50,
    curved_labels=True,
    thumbs_margin=1,
    popup_width=600,
    arc_numbers=True,
    data_table_column_width=100,
    data_table=data["data_table"],
    animated_intro=True
).show()
Plotapi - Chord Diagram

Pokémon Types and Type Combinations

Preamble

In [1]:
import numpy as np                   # for multi-dimensional containers 
import pandas as pd                  # for DataFrames
import itertools
from chord import Chord

Introduction

In previous sections, we visualised co-occurrences of Pokémon type. Whilst it was interesting to look at, the dataset only contained Pokémon from the first six generations. In this section, we're going to use the Pokemon with stats Generation 8 dataset to visualise the co-occurrence of Pokémon types from generations one to eight.

The Dataset

The dataset documentation states that we can expect 13 variables for each of the 1017 Pokémon of the first eight generations.

Let's download the mirrored dataset and have a look for ourselves.

In [2]:
data_url = 'https://datacrayon.com/datasets/pokemon_gen_1_to_8.csv'
data = pd.read_csv(data_url)
data.head()


data['name'] = data['name'].str.slice(0,10)

It looks good so far, but let's confirm the 13 variables against 1017 samples from the documentation.

In [3]:
data.shape
Out[3]:
(1028, 51)

That's more than the documentation suggests: the mirrored dataset has 1028 samples and 51 variables, so we'll work with what we actually have.

Data Wrangling

We need to do a bit of data wrangling before we can visualise our data. We can see from the column names that the Pokémon types are split between the columns type_1 and type_2.

In [4]:
pd.DataFrame(data.columns.values.tolist())
Out[4]:
0
0 Unnamed: 0
1 pokedex_number
2 name
3 german_name
4 japanese_name
5 generation
6 status
7 species
8 type_number
9 type_1
10 type_2
11 height_m
12 weight_kg
13 abilities_number
14 ability_1
15 ability_2
16 ability_hidden
17 total_points
18 hp
19 attack
20 defense
21 sp_attack
22 sp_defense
23 speed
24 catch_rate
25 base_friendship
26 base_experience
27 growth_rate
28 egg_type_number
29 egg_type_1
30 egg_type_2
31 percentage_male
32 egg_cycles
33 against_normal
34 against_fire
35 against_water
36 against_electric
37 against_grass
38 against_ice
39 against_fight
40 against_poison
41 against_ground
42 against_flying
43 against_psychic
44 against_bug
45 against_rock
46 against_ghost
47 against_dragon
48 against_dark
49 against_steel
50 against_fairy

So let's select just these two columns and work with a list containing only them as we move forward.

In [5]:
types = pd.DataFrame(data[['type_1', 'type_2']].values)
types
Out[5]:
0 1
0 Grass Poison
1 Grass Poison
2 Grass Poison
3 Grass Poison
4 Fire NaN
... ... ...
1023 Fairy NaN
1024 Fighting Steel
1025 Fighting NaN
1026 Poison Dragon
1027 Poison Dragon

1028 rows × 2 columns

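Without further investigation, we can see that we have at least a few NaN values in the table above. These belong to Pokémon that have only a single type. Dropping them would leave only the co-occurring types, but we'll keep these rows (the dropna call below is left commented out) because single-type Pokémon are counted on the diagonal of the matrix later on.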

In [6]:
#types = types.dropna()

We can also see an instance where the type Fighting at index $1014$ is followed by \n. We'll strip all these out before continuing.

In [7]:
types = types.replace('\n','', regex=True)
types
Out[7]:
0 1
0 Grass Poison
1 Grass Poison
2 Grass Poison
3 Grass Poison
4 Fire NaN
... ... ...
1023 Fairy NaN
1024 Fighting Steel
1025 Fighting NaN
1026 Poison Dragon
1027 Poison Dragon

1028 rows × 2 columns

Our chord diagram will need two inputs: the co-occurrence matrix, and a list of names to label the segments.

First we'll populate our list of type names by looking for the unique ones.

In [8]:
names = np.unique(pd.concat([types[col] for col in types]).dropna()).tolist()
pd.DataFrame(names)
Out[8]:
0
0 Bug
1 Dark
2 Dragon
3 Electric
4 Fairy
5 Fighting
6 Fire
7 Flying
8 Ghost
9 Grass
10 Ground
11 Ice
12 Normal
13 Poison
14 Psychic
15 Rock
16 Steel
17 Water

Now we can create our empty co-occurrence matrix using these type names for the row and column indices.

We can populate a co-occurrence matrix with the following approach. We'll start by creating a list with every type pairing in its original and reversed form.

In [9]:
types
Out[9]:
0 1
0 Grass Poison
1 Grass Poison
2 Grass Poison
3 Grass Poison
4 Fire NaN
... ... ...
1023 Fairy NaN
1024 Fighting Steel
1025 Fighting NaN
1026 Poison Dragon
1027 Poison Dragon

1028 rows × 2 columns

Which we can now use to create the matrix.

In [10]:
##for t in types:
#    matrix.at[t[0], t[1]] += 1
    
#matrix = matrix.values.tolist()
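The cell above is left commented out, so here is a minimal sketch of the idea it describes: count every complete pairing in both its original and reversed form, then read the counts into a square matrix. It uses only the standard library and a small hypothetical sample of pairs (not the real dataset), and it ignores single-type entries, which the notebook handles separately later on.

```python
from collections import Counter

# Hypothetical sample of (type_1, type_2) pairs; None marks a single-type entry.
pairs = [("Grass", "Poison"), ("Grass", "Poison"), ("Fire", None),
         ("Fire", "Flying"), ("Poison", "Grass")]

# Keep only complete pairs, then add each pair in its original and reversed form.
complete = [(a, b) for a, b in pairs if a is not None and b is not None]
both_ways = complete + [(b, a) for a, b in complete]

# Sorted unique type names label the rows and columns.
names = sorted({t for pair in complete for t in pair})
counts = Counter(both_ways)

# Rows and columns follow the order of `names`; missing pairings count as 0.
matrix = [[counts[(row, col)] for col in names] for row in names]

print(names)
print(matrix)
```

Because every pairing is counted in both directions, the resulting matrix is symmetric, which is what a chord diagram of co-occurrences expects.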

We can list the DataFrame for better presentation.

Chord Diagram

Time to visualise the co-occurrence of types using a chord diagram. We are going to use a list of custom colours that represent the types.

In [11]:
colors = ["#A6B91A", "#705746", "#6F35FC", "#F7D02C", "#D685AD",
          "#C22E28", "#EE8130", "#A98FF3", "#735797", "#7AC74C",
          "#E2BF65", "#96D9D6", "#A8A77A", "#A33EA1", "#F95587",
          "#B6A136", "#B7B7CE", "#6390F0"]
In [12]:
names
Out[12]:
['Bug',
 'Dark',
 'Dragon',
 'Electric',
 'Fairy',
 'Fighting',
 'Fire',
 'Flying',
 'Ghost',
 'Grass',
 'Ground',
 'Ice',
 'Normal',
 'Poison',
 'Psychic',
 'Rock',
 'Steel',
 'Water']

Finally, we can put it all together.

Chord Diagram with Names

It would be nice to show a list of Pokémon names and images when hovering over co-occurring Pokémon types. To do this, we can make use of the optional details parameter.

Let's clean up our dataset by removing all instances of \n.

In [13]:
data = data.replace('\n','', regex=True)

Let's also add a column to our dataset to store URLs that point to the images.

In [14]:
data['URL'] = ""

for index, row in data.iterrows():
    dex = f"{row['pokedex_number']:03d}"
    url = f"https://datacrayon.com/images/data-is-beautiful/pokemon_thumbs/{dex}.png"
    data.at[index,'URL'] = url
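As a small aside, the URL construction can be pulled into a helper so the zero-padding is easy to check in isolation. This is only a sketch: the helper name thumb_url is hypothetical, and the base path is the one used in the cell above.

```python
# Base path taken from the cell above; the helper name is hypothetical.
BASE_URL = "https://datacrayon.com/images/data-is-beautiful/pokemon_thumbs"


def thumb_url(pokedex_number: int) -> str:
    """Return the thumbnail URL for a Pokédex number, zero-padded to three digits."""
    return f"{BASE_URL}/{pokedex_number:03d}.png"


print(thumb_url(1))
print(thumb_url(25))
```

With pandas, a helper like this could be applied to the whole column with data['pokedex_number'].map(thumb_url) instead of looping over the rows.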

Next, we'll create empty multi-dimensional arrays with the same shape as our matrix for our details and thumbnail images.

Now we can populate the details array with lists of Pokémon names in the correct positions.

In [15]:
matrix = pd.DataFrame(0, index=names, columns=names)

details = pd.DataFrame([[[] for i in range(len(names))] for i in range(len(names))], index=names, columns=names)

details_thumbs = pd.DataFrame([[[] for i in range(len(names))] for i in range(len(names))], index=names, columns=names)

details



names_nested = names.copy()

for count_x, item_x in enumerate(names):
    
    for count_y, item_y in enumerate(names_nested):
        details_urls = data[
            (data['type_1'].isin([item_x, item_y])) &
            (data['type_2'].isin([item_y, item_x]))]['URL'].to_list()
        
        details_names = data[
            (data['type_1'].isin([item_x, item_y])) &
            (data['type_2'].isin([item_y, item_x]))]['name'].to_list()
        
        urls_names = np.column_stack((details_urls, details_names))
        if(urls_names.size > 0):
            details[item_y][item_x] = details_names
            details_thumbs[item_y][item_x] = details_urls
            
            matrix[item_y][item_x] = len(details_names)


        #else:
            #details[item_x][item_y] = []
            #details_thumbs[item_x][item_y] = []
   # names_nested.remove(item_x)
            
            
for count_x, item_x in enumerate(names):
    
    details_urls = data[
    (data['type_1'].isin([item_x])) &
    (data['type_2'].isna())]['URL'].to_list()

    details_names = data[
        (data['type_1'].isin([item_x])) &
        (data['type_2'].isna())]['name'].to_list()

    urls_names = np.column_stack((details_urls, details_names))
    if(urls_names.size > 0):
        details[item_x][item_x] = details_names
        details_thumbs[item_x][item_x] = details_urls

        matrix[item_x][item_x] = len(details_urls)

    #else:
     #   details[item_x][item_x] = []
      #  details_thumbs[item_x][item_x] = []
    
details=pd.DataFrame(details).values.tolist()
details_thumbs=pd.DataFrame(details_thumbs).values.tolist()
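To make the shape of the details structure easier to see, here is a minimal standard-library sketch of the same idea on a tiny hypothetical sample: each cell holds the list of names for that pairing, single-type Pokémon go on the diagonal, and the counts matrix is just the length of each cell. It is an illustration of the layout, not a replacement for the pandas cell above.

```python
# Hypothetical records: (name, type_1, type_2); None marks a single-type Pokémon.
records = [
    ("Bulbasaur", "Grass", "Poison"),
    ("Charmander", "Fire", None),
    ("Oddish", "Grass", "Poison"),
]
names = ["Fire", "Grass", "Poison"]
index = {name: i for i, name in enumerate(names)}

# One independent empty list per cell (avoid [[[]] * n] * n, which shares lists).
details = [[[] for _ in names] for _ in names]

for name, type_1, type_2 in records:
    i = index[type_1]
    j = index[type_2] if type_2 is not None else i  # single types go on the diagonal
    details[i][j].append(name)
    if i != j:
        details[j][i].append(name)  # mirror the entry so the structure is symmetric

# The co-occurrence counts are the sizes of the detail lists.
matrix = [[len(cell) for cell in row] for row in details]

print(details[index["Grass"]][index["Poison"]])
print(matrix)
```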
In [16]:
matrix.sum()
Out[16]:
Bug          90
Dark         70
Dragon       72
Electric     72
Fairy        63
Fighting     70
Fire         82
Flying      117
Ghost        66
Grass       117
Ground       80
Ice          55
Normal      126
Poison       77
Psychic     114
Rock         75
Steel        71
Water       153
dtype: int64

Finally, we can put it all together but this time with the details matrix passed in.

In [21]:
Chord(
    matrix.values.tolist(),
    names,
    colors=colors,
    details=details,
    details_thumbs=details_thumbs,
    margin=80,
    noun="Pokémon",
    thumbs_width=50,
    thumbs_margin=1,
    popup_width=600,
    thumbs_font_size=10,
    credit=True,
    arc_numbers=True,
    reverse_gradients=False,
    
).show()
Chord Diagram

Conclusion

In this section, we demonstrated how to conduct some data wrangling on a downloaded dataset to prepare it for a chord diagram. Our chord diagram is interactive, so you can use your mouse or touchscreen to investigate the co-occurrences!

In [18]:
matrix.values.tolist()
Out[18]:
[[19, 0, 0, 4, 2, 4, 4, 14, 1, 6, 2, 2, 0, 13, 2, 5, 7, 5],
 [0, 14, 4, 2, 3, 3, 4, 5, 3, 3, 3, 2, 5, 5, 3, 2, 2, 7],
 [0, 4, 12, 3, 1, 2, 3, 8, 5, 5, 9, 3, 1, 4, 5, 2, 2, 3],
 [4, 2, 3, 33, 2, 0, 1, 6, 1, 1, 1, 2, 2, 3, 1, 3, 4, 3],
 [2, 3, 1, 2, 19, 0, 0, 2, 1, 5, 0, 1, 5, 1, 9, 3, 5, 4],
 [4, 3, 2, 0, 0, 28, 7, 1, 1, 3, 0, 1, 4, 2, 6, 1, 4, 3],
 [4, 4, 3, 1, 0, 7, 34, 7, 5, 0, 4, 1, 2, 2, 3, 3, 1, 1],
 [14, 5, 8, 6, 2, 1, 7, 4, 3, 7, 4, 2, 27, 3, 7, 6, 3, 8],
 [1, 3, 5, 1, 1, 1, 5, 3, 14, 12, 6, 1, 0, 4, 3, 0, 4, 2],
 [6, 3, 5, 1, 5, 3, 0, 7, 12, 43, 1, 3, 2, 15, 3, 2, 3, 3],
 [2, 3, 9, 1, 0, 0, 4, 4, 6, 1, 17, 3, 1, 2, 2, 9, 6, 10],
 [2, 2, 3, 2, 1, 1, 1, 2, 1, 3, 3, 19, 0, 0, 4, 2, 2, 7],
 [0, 5, 1, 2, 5, 4, 2, 27, 0, 2, 1, 0, 71, 0, 5, 0, 0, 1],
 [13, 5, 4, 3, 1, 2, 2, 3, 4, 15, 2, 0, 0, 16, 0, 1, 0, 6],
 [2, 3, 5, 1, 9, 6, 3, 7, 3, 3, 2, 4, 5, 0, 44, 2, 9, 6],
 [5, 2, 2, 3, 3, 1, 3, 6, 0, 2, 9, 2, 0, 1, 2, 16, 7, 11],
 [7, 2, 2, 4, 5, 4, 1, 3, 4, 3, 6, 2, 0, 0, 9, 7, 11, 1],
 [5, 7, 3, 3, 4, 3, 1, 8, 2, 3, 10, 7, 1, 6, 6, 11, 1, 72]]
In [19]:
matrix['Bug']
Out[19]:
Bug         19
Dark         0
Dragon       0
Electric     4
Fairy        2
Fighting     4
Fire         4
Flying      14
Ghost        1
Grass        6
Ground       2
Ice          2
Normal       0
Poison      13
Psychic      2
Rock         5
Steel        7
Water        5
Name: Bug, dtype: int64

Path Construction with 2D Arrays

Highlight

Whilst you could say that it's possible to draw a zigzag using multiple rect elements at different positions and rotations, it is certainly an infeasible and inefficient exercise. This is where the SVG path element comes in. A path describes an outline of some shape that can be filled and/or stroked.

Preamble

Let's get access to the D3.js library so that we can begin. In this case, we'll be including the library using the HTML <script> tag.

<script src="https://d3js.org/d3.v7.js"></script>

Introduction

Up until this section, we've been making use of several basic shapes as they are defined in the W3C specification. These include the rect, circle, and ellipse [1]. In this section, however, we'll look at how we can create custom and complex shapes using the SVG path element [2].

Let's use the creation of a zigzag as an example. Whilst you could say that it's possible to draw a zigzag using multiple rect elements at different positions and rotations (in the same way that you could draw any image with an unbounded number of rect elements behaving as pixels), it is certainly an infeasible and inefficient exercise.

This is where the SVG path element comes in. A path describes an outline of some shape that can be filled and/or stroked. Alternatively, a path could be left without a fill or stroke, so that it can be used to position text or define an animation path.

Paths are drawn as if a pen has been placed on paper at a current point, whereby following instructions move that pen in either lines or curves.

<svg>
  <path d="M20,60L60,20L100,60L140,20L180,60L220,20"
        stroke="black" fill="none"></path>
</svg>

We can see the output of the above SVG markup is a zigzag pattern. This shape has been specified using the d (data) property. In this case, we've used a series of moveto (M) and lineto (L) commands to construct our shape using coordinates.
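To make the structure of the d string concrete, here is a minimal sketch that assembles it from a list of coordinates: a moveto (M) for the first point and a lineto (L) for each point after it. It is written in Python purely for illustration, and the function name points_to_path is hypothetical.

```python
def points_to_path(points):
    """Build an SVG path string: moveto (M) for the first point, lineto (L) for the rest."""
    commands = []
    for i, (x, y) in enumerate(points):
        command = "M" if i == 0 else "L"
        commands.append(f"{command}{x},{y}")
    return "".join(commands)


# The same coordinates as the zigzag in the markup above.
zigzag = [(20, 60), (60, 20), (100, 60), (140, 20), (180, 60), (220, 20)]
print(points_to_path(zigzag))
```

For these coordinates the result is the same string used in the d attribute above.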

We can see that the fill property has been explicitly set to "none". Otherwise, the default behaviour would be to fill the shape.

A filled shape is not the desired output for our zigzag, which is why we set fill to "none".

Let's see how we can create the same complex shape using D3.js.

A Container for the Output

This is where you will see the output of the code cells that follow it, provided they are referencing the corresponding id.

<div id="container"></div>

Creating an Empty SVG

We'll create a new detached <svg> element and use the returned selection throughout the rest of this section.

const svg = d3.create("svg");

Creating a Complex Shape with Paths

To create the same zigzag as above we need to complete three steps.

  1. Populate a data structure with our coordinates.
  2. Construct a line generator with d3.line() [3].
  3. Generate a line by passing our populated data structure to the line generator.

Let's get started.

Populating a 2D Array

We'll use a simple 2D array for our data structure, passing in the coordinates that specify our zigzag shape. We'll store this in the data variable.

var data = [
    [20, 60],
    [60, 20],
    [100, 60],
    [140, 20],
    [180, 60],
    [220, 20]
];

Constructing the Line Generator

Next, we'll need to construct our line generator using d3.line(). We'll store this in the lineFun variable.

var lineFun = d3.line();

Generate the Path

Then we'll generate the zigzag line by passing our path data into our line generator, i.e. lineFun(data). This will be used to set the d (or data) property of our path element.

To create a <path> element with D3.js we can invoke the selection.append(name) method on our svg selection and pass in the name of the element. We'll also specify the stroke and fill properties.

var line = svg.append("path")
    .attr("d", lineFun(data))
    .attr("stroke", "black")
    .attr("fill", "none");

Appending to the Container

Finally, let's append everything to our container.

d3
    .select("#container")
    .append(() => svg.node());

We can see the output by checking on our container with the corresponding id, which in this case is where id=container.

Conclusion

If we inspect the HTML, we will see the <svg> and <path> elements have been added to the <div> where the id=container. We can also see that the <path> element's d attribute contains the path data that specifies our zigzag. We also have the stroke set to black, and the fill set to none.

<div id="container">
    <svg>
        <path d="M20,60L60,20L100,60L140,20L180,60L220,20"
        stroke="black" fill="none"></path>
    </svg>
</div>

  1. W3C. Basic Shapes, https://www.w3.org/TR/SVG/shapes.html. 

  2. W3C. Paths, https://www.w3.org/TR/SVG/paths.html. 

  3. M. Bostock. d3-shape: Lines, https://github.com/d3/d3-shape#lines. 

MacBook Butterfly Keyboard Problems

Unshaky

Over the last three years, I've occasionally been using a MacBook Pro 13-inch 2018. Unfortunately, it has been the worst laptop experience of my existence!

It was an upgrade to a MacBook Pro 13-inch 2015, and aside from the Apple Touch Bar, it would be fair to consider it superior to its predecessor. However, I started noticing something unusual once I started using it for writing - I kept making mistakes.

Like many, I'm a touch typist and quite comfortable using most keyboards. If I switch to a new keyboard, it sometimes takes some time to get used to the shape of the keys, the spacing between them, and how they feel. So at first, I assumed this unusual feeling came from typing on a brand new keyboard. Over time, I noticed a pattern in the mistakes I was making. Sometimes I would double type a letter, e.g. e, or I would miss the letter entirely.

It starteed gtting on my neervs.

My typing slowed down as I was often typing in anticipation of these mistakes, and it became an overall distraction.

I found similar reports shared online, and I was relieved at the suggestion that it might be a hardware issue with the "butterfly keyboard", rather than a deterioration of my ability!

Through this search, I also discovered an attempt to address the issue with software. It's called Unshaky, and it attempts to dismiss erroneous key press registrations, i.e. those that occur no later than x milliseconds after the previous one. It's not perfect and it doesn't solve all the issues, but in the following screenshot you can see some of what it caught over just a few months:
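To illustrate the idea (and only the idea, this is not Unshaky's actual implementation), here is a minimal sketch of that kind of filter. It assumes key presses arrive as (timestamp in milliseconds, key) pairs in chronological order, and it drops a press when the same key was last accepted no more than threshold_ms earlier. The function name and the sample events are hypothetical.

```python
def filter_key_events(events, threshold_ms):
    """Drop presses that repeat the same key within threshold_ms of the last accepted one.

    events: list of (timestamp_ms, key) tuples in chronological order.
    """
    last_accepted = {}  # key -> timestamp of its most recent accepted press
    accepted = []
    for timestamp, key in events:
        previous = last_accepted.get(key)
        if previous is not None and timestamp - previous <= threshold_ms:
            continue  # too soon after the previous press: treat as a double registration
        last_accepted[key] = timestamp
        accepted.append((timestamp, key))
    return accepted


# Hypothetical sample: the second "e" and the second "r" arrive suspiciously fast.
events = [(0, "e"), (12, "e"), (200, "e"), (205, "r"), (210, "r")]
print(filter_key_events(events, threshold_ms=40))
```

The trade-off is the one mentioned above: a threshold that is too high starts swallowing genuine fast double letters, and no threshold can bring back a key press that was never registered.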

It has been a terrible experience in general, and I look forward to leaving this machine behind me. What follows is a list of keyboard complaints taken from the Unshaky README [1].


  1. Unshaky https://github.com/aahung/Unshaky. 

Chord Pro Features For Chord Diagrams

Preamble

In [1]:
from chord import Chord

Introduction

Note

This document is best viewed online where the interactivity of the demonstrations can be experienced.

In a chord diagram (or radial network), entities are arranged radially as segments with their relationships visualised by arcs that connect them. The size of the segments illustrates the numerical proportions, whilst the size of the arc illustrates the significance of the relationships [1]. Chord diagrams are useful when trying to convey relationships between different entities, and they can be beautiful and eye-catching.

Get Chord Pro

Click here to get access to the full-featured chord visualization API, producing beautiful interactive visualizations, e.g. those featured on the front page of Reddit.

License

To switch to the PRO version of the chord package, you need to assign a valid username (the email you entered at purchase) and license key. This can be purchased here.

In [ ]:
Chord.user = "your username"
Chord.key = "your license key"

We'll use the following data for the co-occurrence matrix and names parameters until we cover divided diagrams.

In [2]:
matrix = [
    [0, 5, 6, 4, 7, 4],
    [5, 0, 5, 4, 6, 5],
    [6, 5, 0, 4, 5, 5],
    [4, 4, 4, 0, 5, 5],
    [7, 6, 5, 5, 0, 4],
    [4, 5, 5, 5, 4, 0],
]

names = ["Action", "Adventure", "Comedy", "Drama", "Fantasy", "Thriller"]
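Before plotting, it can be worth sanity-checking the inputs. The sketch below uses only the standard library on the same matrix and names, and checks that the matrix is square, matches the number of names, and is symmetric (as a co-occurrence matrix normally is). The variable names for the checks are hypothetical.

```python
matrix = [
    [0, 5, 6, 4, 7, 4],
    [5, 0, 5, 4, 6, 5],
    [6, 5, 0, 4, 5, 5],
    [4, 4, 4, 0, 5, 5],
    [7, 6, 5, 5, 0, 4],
    [4, 5, 5, 5, 4, 0],
]
names = ["Action", "Adventure", "Comedy", "Drama", "Fantasy", "Thriller"]

size = len(names)

# One row per name, and every row as long as the list of names.
is_square = len(matrix) == size and all(len(row) == size for row in matrix)

# Co-occurrence is undirected, so entry (i, j) should equal entry (j, i).
is_symmetric = all(
    matrix[i][j] == matrix[j][i] for i in range(size) for j in range(size)
)

# Row totals give a rough idea of how large each segment will be.
row_totals = dict(zip(names, (sum(row) for row in matrix)))

print(is_square, is_symmetric)
print(row_totals)
```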

Defaults

Without passing in any arguments for the customisation parameters, the output will use the default values.

In [3]:
Chord(matrix, names).show()
Chord Diagram

Output Methods

Chord Pro supports the following outputs.

HTML to Jupyter Lab Cell (Interactive)

Outputs the interactive diagram to a Jupyter Lab cell.

In [4]:
Chord(matrix, names, title="Jupyter Lab Cell").show()
Chord Diagram

HTML to file (Interactive)

Saves an interactive HTML file locally.

In [5]:
Chord(matrix, names).to_html('out.html')

[Plotapi.com feature] PNG to Jupyter Lab Cell (Image)

Outputs a PNG to a Jupyter Lab Cell.

In [6]:
Chord(matrix, names).show_png()

[Plotapi.com feature] PNG to file (image)

Saves a PNG file locally.

In [7]:
Chord(matrix, names).to_png('out.png')

[Plotapi.com feature] PDF to file

Saves a PDF file locally.

In [8]:
Chord(matrix, names).to_pdf('out.pdf')

Disable SSL Verification

verify_ssl=True

Some users behind school/corporate networks may experience issues contacting the Chord API end-points. One workaround is to use verify_ssl=False. You will receive an InsecureRequestWarning with a link to further information.

In [9]:
Chord(matrix, names, verify_ssl=False, title="SSL Verification Disabled").show()
/Users/shahin/miniconda3/envs/analytics/lib/python3.9/site-packages/urllib3/connectionpool.py:1013: InsecureRequestWarning: Unverified HTTPS request is being made to host 'api.shahin.dev'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  warnings.warn(
Chord Diagram

Chord Colours

colors="d3.schemeSet1"

The default setting for the chord colours is d3.schemeSet1.

This can be changed to any of the sequential, diverging, or categorical colour schemes in d3-scale-chromatic, such as d3.schemeAccent, d3.schemeBlues[n], or d3.schemePaired :

In [10]:
Chord(matrix, names, colors="d3.schemeAccent").show()
Chord Diagram

The colors parameter also accepts a Python list of HEX colour codes.

In [11]:
grayscale = ["#222222", "#333333", "#4c4c4c", "#666666", "#848484", "#9a9a9a"]
Chord(matrix, names, colors=grayscale).show()
Chord Diagram
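If you'd rather not type a palette by hand, a list of HEX codes can be generated. The following is a minimal sketch with a hypothetical helper, grayscale_palette, that returns evenly spaced greys between two levels on the 0-255 scale. The resulting list is a plain Python list of strings, which is the form the colors parameter is described as accepting above.

```python
def grayscale_palette(n, darkest=34, lightest=154):
    """Return n evenly spaced grey HEX codes from darkest to lightest (levels 0-255)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        levels = [darkest]
    else:
        step = (lightest - darkest) / (n - 1)
        levels = [round(darkest + i * step) for i in range(n)]
    # A grey repeats the same two-digit HEX level for red, green, and blue.
    return [f"#{level:02x}{level:02x}{level:02x}" for level in levels]


palette = grayscale_palette(6)
print(palette)
```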

Opacity

opacity=0.8

This sets the opacity for the arcs when they are not selected (mouseover/touch).

In [12]:
Chord(matrix, names, opacity=0.2).show()
Chord Diagram

Reverse Gradients

reverse_gradients=False

This sets the direction of the arc gradients.

In [13]:
Chord(matrix, names, reverse_gradients=True).show()
Chord Diagram

Arc Numbers

arc_numbers=False

This sets the visibility of quantity labels on segments.

In [14]:
Chord(matrix, names, arc_numbers=True).show()
Chord Diagram

Diagram Title

title=""

This sets the text and visibility of the diagram title.

In [15]:
Chord(matrix, names, title="Movie Genre Co-occurrence").show()
Chord Diagram

Padding

padding=0.01

This sets the padding between segments as a fraction of the circle.

In [16]:
Chord(matrix, names, padding=0.5).show()
Chord Diagram