Get the Books
Enjoying these notebooks and want to support the work? Check out the practical books on Data Science, Visualisation, and Evolutionary Algorithms.
Get the books
YAML for Configuration Files
Preamble
# used to create block diagrams
%reload_ext xdiag_magic
%xdiag_output_format svg
import os
Introduction
In this section, we're going to have a look at YAML, which is a recursive acronym for "YAML Ain't Markup Language". It is a data interchange file format that is often found with a .yaml or .yml file extension.
What this section is most interested in is using YAML for configuration files, so that the parameters our programs use can be kept separate from the code itself. This means that configuration files can be shared between scripts/programs, and they can be modified without needing to modify source code.
%%blockdiag
{
orientation = portrait
"config.yaml" <-> "notebook.ipynb"
"config.yaml" [color = '#ffffcc']
}
Of course, there are many alternatives such as JavaScript Object Notation (JSON) or Tom's Obvious, Minimal Language (TOML), and they all have their advantages and disadvantages. We won't do a full comparison of YAML vs. alternatives, but some advantages of YAML are:
- It is human-readable, making YAML files easy to read and to write by hand.
- Many popular programming languages have support for managing YAML files.
- YAML (as of version 1.2) is designed to be a superset of JSON, meaning that a JSON document can usually be read as YAML without any conversion.
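As a quick illustration of the last point, the sketch below hands a JSON string to PyYAML's safe_load function and compares the result with Python's own json module. The JSON text is made up for this illustration, and PyYAML itself is an assumption here, since its installation is only covered later in this section.

```python
import json

import yaml  # PyYAML; installation is covered later in this section

# A JSON document (illustrative values that mirror this section's example).
json_text = '{"learning_rate": 0.1, "categories": ["hotdog", "not a hotdog"]}'

from_json = json.loads(json_text)      # parsed as JSON
from_yaml = yaml.safe_load(json_text)  # the same text, parsed as YAML

# Both parsers should produce the same Python dictionary.
print(from_json == from_yaml)
```

This holds for typical JSON such as the string above; unusual edge cases may behave differently depending on the YAML version a parser implements.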
The example configuration used in this section, shown in the cell below that writes config.yaml, consists of key-value mappings, where the key appears before the colon and the value appears after. The first key is learning_rate with a value of 0.1, the second is random_seed with a value of 789108, and the third is maintainer with a value of Shahin Rostami. Finally, I have included a key that maps to a sequence: categories has a list as its value, containing the items "hotdog" and "not a hotdog".
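To make the mappings and the sequence concrete, here is a minimal sketch that parses the same four keys from a string and prints the Python type each value becomes. It assumes PyYAML is already installed (covered later in this section), and the variable names are for illustration only.

```python
import yaml

# The example configuration described above, held in a string.
example = """\
learning_rate: 0.1
random_seed: 789108
maintainer: Shahin Rostami
categories:
- hotdog
- not a hotdog
"""

parsed = yaml.safe_load(example)

# Show which Python type each YAML value was converted to.
for key, value in parsed.items():
    print(f"{key}: {type(value).__name__}")
```

The scalar values become a float, an int, and a str, while the sequence becomes a Python list.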
You can read more about YAML at the YAML Specification web page.
So that we can work with this file later on in this section, you can paste the YAML from the cell below into a file named config.yaml in the same directory as this notebook. Alternatively, you can just run the cell, which will create the file for you.
config_string = '''learning_rate: 0.1
random_seed: 789108
maintainer: Shahin Rostami
categories:
- hotdog
- not a hotdog'''
with open('config.yaml', 'w') as f:
    f.write(config_string)
Getting Python Ready for YAML
Before we begin working with YAML files in Python, we need to make sure we have PyYAML installed. There are alternatives to PyYAML available, such as ruamel.yaml (a derivative of PyYAML), but their interfaces may differ, so the code in this section may need adapting if you use one of them.
Some options to install PyYAML are with:
Anaconda
conda install -c conda-forge pyyaml
pip
pip install PyYAML
Once you have the package installed you should be ready to import the PyYAML package within Python.
import yaml
Loading YAML with Python
It is surprisingly easy to load a YAML file into a Python dictionary with PyYAML.
with open('config.yaml') as f:
    # yaml.safe_load(f) is a stricter alternative that also works for this file
    config = yaml.load(f, Loader=yaml.FullLoader)
We can confirm that it worked by displaying the contents of the config variable.
config
It's as easy as that. We can now access the various elements of the data structure like a normal Python dictionary.
config['learning_rate']
config['categories']
config['categories'][0]
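Because the loaded configuration is an ordinary dictionary, the usual dictionary idioms apply. The sketch below parses a shortened configuration from a string so that it runs on its own, and uses get to fall back to a default when a key is absent. The batch_size key and its default of 32 are hypothetical and not part of config.yaml.

```python
import yaml

# A shortened stand-in for config.yaml, parsed from a string.
demo_config = yaml.safe_load("""\
learning_rate: 0.1
categories:
- hotdog
- not a hotdog
""")

learning_rate = demo_config["learning_rate"]    # raises KeyError if the key is missing
batch_size = demo_config.get("batch_size", 32)  # hypothetical key, falls back to 32
first_category = demo_config["categories"][0]

print(learning_rate, batch_size, first_category)
```

Using get with a default is one way to let older configuration files keep working after a program starts expecting a new setting.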
Updating YAML with Python
Let's say we want to update our learning_rate to 0.2 and add an extra category to our categories list. We can do this using normal Python dictionary and list operations.
config['learning_rate'] = 0.2
config['categories'].append('kind of hotdog')
We can then write this back to the config.yaml file to save our changes.
with open('config.yaml', 'w') as f:
    # dump returns None when given a stream, so the result is not assigned
    yaml.dump(config, stream=f,
              default_flow_style=False, sort_keys=False)
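If you would like to inspect the YAML text before it is written anywhere, yaml.dump returns a string when no stream is given. The sketch below uses a stand-in dictionary rather than the notebook's config variable, and also shows the effect of sort_keys, an argument that, to my knowledge, is available from PyYAML 5.1 onwards.

```python
import yaml

# A stand-in dictionary with keys deliberately out of alphabetical order.
settings = {"learning_rate": 0.2, "categories": ["hotdog", "kind of hotdog"]}

# With no stream argument, dump returns the YAML document as a string.
in_order = yaml.dump(settings, default_flow_style=False, sort_keys=False)

# Leaving sort_keys at its default sorts the keys alphabetically.
alphabetical = yaml.dump(settings, default_flow_style=False)

print(in_order)
print(alphabetical)
```

Passing sort_keys=False keeps the keys in the order they were defined, which tends to keep a hand-written configuration file recognisable after a round trip.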
All done! We can confirm that the file was updated by loading it again and displaying the dictionary.
with open('config.yaml') as f:
    config = yaml.load(f, Loader=yaml.FullLoader)
config
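The whole load, update, save, and reload cycle can also be tried out without touching your real config.yaml. The sketch below works on a throwaway copy in a temporary directory; the keys and values mirror this section, while the temporary location and the variable names are assumptions made purely to keep the example self-contained.

```python
import tempfile
from pathlib import Path

import yaml

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a throwaway configuration file.
    path = Path(tmp_dir) / "config.yaml"
    path.write_text("learning_rate: 0.1\ncategories:\n- hotdog\n- not a hotdog\n")

    # Load, modify, and save.
    with open(path) as f:
        cfg = yaml.safe_load(f)
    cfg["learning_rate"] = 0.2
    cfg["categories"].append("kind of hotdog")
    with open(path, "w") as f:
        yaml.dump(cfg, stream=f, default_flow_style=False, sort_keys=False)

    # Reload to confirm that the changes reached the file.
    with open(path) as f:
        reloaded = yaml.safe_load(f)

print(reloaded)
```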
Conclusion
In this section, we briefly introduced YAML before using the PyYAML package to load, manipulate, and save a collection of configuration settings that we stored in a file named config.yaml. Keeping your configuration settings separate from your source code comes with multiple benefits, e.g. allowing these settings to be changed without modifying source code, making them easier to automate and to search for across your project, and sharing configurations between multiple pieces of work.
Support this work
You can support this work by getting the e-books. This notebook will always be available for free in its online format.