Data Analysis with Rust Notebooks

A practical book on Data Analysis with Rust Notebooks that teaches you the concepts and how they’re implemented in practice.

Better Plotting with Plotly

Preamble

:dep plotly = {version = "0.4.0"}
:dep nanoid = {version = "0.3.0"}
extern crate plotly;
extern crate nanoid;

use plotly::{Plot, Scatter};
use plotly::common::{Mode};
use nanoid::nanoid;
use std::fs;

let plotly_file = "temp_plot.html";

Introduction

In the last section, we covered how to get plotting with Plotly using Plotly for Rust paired with our very own workaround. If you continued experimenting with this approach before starting this section, you may have encountered some limitations:

  • File size. The notebook file from the previous section, plotting-with-plotly.ipynb, weighed in at around 3.4 MB. This is an unusually large file for what was only a few paragraphs and a single interactive plot.
  • Multiple plots. If you tried to output a second Plotly plot in the same notebook, only the first one would be rendered.
  • File size, again. If you did solve the issue regarding multiple plots, your file size would grow linearly for every plot output. A second plot would take you from 3.4 MB to 6.8 MB.

We're going to improve our workaround so that we can produce many of our nice interactive plots without bloating our notebooks or any HTML files we may save them to.

Example Plotly Plot

Let's use the code from the previous section to generate our plot. We will then save this to a file as HTML, and load it back into a string for further processing.

let trace1 = Scatter::new(vec![1, 2, 3, 4], vec![10, 15, 13, 17])
    .name("trace1")
    .mode(Mode::Markers);

let trace2 = Scatter::new(vec![2, 3, 4, 5], vec![16, 5, 11, 9])
    .name("trace2")
    .mode(Mode::Lines);

let trace3 = Scatter::new(vec![1, 2, 3, 4], vec![12, 9, 15, 12]).name("trace3");

let mut plot = Plot::new();

plot.add_trace(trace1);
plot.add_trace(trace2);
plot.add_trace(trace3);

plot.to_html(plotly_file);

let plotly_contents = fs::read_to_string(plotly_file).unwrap();

Reducing the File Size

If you open the HTML output that was saved to temp_plot.html, you may notice that the entire contents of plotly.js have also been embedded. This will be true for all output created by Plotly for Rust's .to_html() function. This also means that if we have two of these plots in our notebook using the workaround, we will have two copies of plotly.js also embedded. Because we're using the Plotly Jupyter Lab extension, @jupyterlab/plotly-extension, we don't need to embed plotly.js at all.

So let's extract the part of this HTML file that we actually need. We can do this by slicing out a substring, starting at a marker that we know opens the part we need, <div id="plotly-html-element" class="plotly-graph-div",

let start_bytes = plotly_contents
    .find("<div id=\"plotly-html-element\" class=\"plotly-graph-div\"")
    .unwrap_or(0);

and ending at a marker that we know immediately follows the last part we need, </div></body></html>.

let end_bytes = plotly_contents
    .find("\n</div>\n</body>\n</html>")
    .unwrap_or(plotly_contents.len());

Let's print out our substring to see what we've ended up with.

&plotly_contents[start_bytes..end_bytes]

This now looks to be dramatically smaller in file size.
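The slicing step can also be sketched as a small function. Note that the cells above fall back to 0 and plotly_contents.len() when a marker is missing, which would quietly give us the whole file again, plotly.js included. The version below returns None instead. The name extract_plot_fragment is hypothetical, and the two markers are assumed to match what your version of Plotly for Rust writes.

```rust
/// Marker that opens the plot's <div> in the HTML written by `.to_html()`.
/// Assumption: this matches the output of the Plotly for Rust version in use.
const START_MARKER: &str = "<div id=\"plotly-html-element\" class=\"plotly-graph-div\"";

/// Marker that immediately follows the last part we need.
const END_MARKER: &str = "\n</div>\n</body>\n</html>";

/// Returns the slice between the two markers, or `None` if either is missing.
/// `extract_plot_fragment` is a hypothetical helper, not part of Plotly for Rust.
fn extract_plot_fragment(html: &str) -> Option<&str> {
    let start = html.find(START_MARKER)?;
    // Search for the end marker only after the start, so the range is always valid.
    let end = start + html[start..].find(END_MARKER)?;
    Some(&html[start..end])
}
```

In a notebook cell this could be called as extract_plot_fragment(&plotly_contents), with the None case handled however suits you.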

Allowing Multiple Plots

However, you may have noticed a clue as to why we can only properly output a single Plotly plot per notebook. Our substring starts with <div id="plotly-html-element", meaning that every plot will have the same ID. In the Python version of Plotly, each plot has a randomly generated ID, so let's do the same using nanoid.

nanoid!()
"FcH20qHUuEAuGEY6mO0iy"

If we replace every occurrence of the original ID, plotly-html-element, with a new one generated by nanoid, then we should be able to output multiple plots.

&plotly_contents[start_bytes..end_bytes]
    .replace("plotly-html-element", Box::leak(nanoid!().into_boxed_str()))
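The ID replacement can be sketched as a function that takes the new ID as a parameter. This is only an illustration: with_plot_id is a hypothetical name, and the ID is passed in rather than generated so that the sketch does not depend on nanoid.

```rust
/// Placeholder ID that Plotly for Rust writes into every plot it exports.
const DEFAULT_ID: &str = "plotly-html-element";

/// Returns a copy of `fragment` with every occurrence of the default ID
/// replaced by `new_id`. `with_plot_id` is a hypothetical helper.
fn with_plot_id(fragment: &str, new_id: &str) -> String {
    fragment.replace(DEFAULT_ID, new_id)
}
```

Because the function only borrows the ID, a call such as with_plot_id(fragment, &nanoid!()) should work without Box::leak, since the temporary String lives until the end of the statement.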

Archived: Loading Plotly with RequireJS

Note

This subsection was written for a previous revision of this book. Back then, Jupyter Lab was still at version 1, Plotly for Rust did not have Jupyter Notebook support, and the RequireJS extension for Jupyter Lab was still supported. Unless you are curious about this now historical workaround, you should skip to the next section, Loading Plotly on Demand.

Now that we've stopped embedding the entire contents of plotly.js in our notebooks, we'll need some way to load in plotly.js to view our visualisations. There are many different solutions to this problem, such as the @jupyterlab/plotly-extension Jupyter Lab extension that was previously used in this book. However, a solution that is more suitable for our use cases is to use RequireJS, a JavaScript file and module loader, and the @jupyterlab_requirejs Jupyter Lab extension to view our visualisation within our notebooks.

To achieve this, we'll need to wrap our Plotly JavaScript like the following:

require(["plotly"], function(Plotly) {
    // Plotly code here
});           

We know our Plotly scripts will always begin with:

window.PLOTLYENV= 

and end in:

};\n\n\n    </script>

So let's take advantage of this and use .replace() to inject our wrapper.

&plotly_contents[start_bytes..end_bytes]
    .replace("plotly-html-element", Box::leak(nanoid!().into_boxed_str()))
    .replace("window.PLOTLYENV=",
             "require([\"plotly\"], function(Plotly) { window.PLOTLYENV=")
    .replace("};\n\n\n    </script>","};\n\n\n});    </script>")

Loading Plotly on Demand

Now that we've stopped embedding the entire contents of plotly.js in our notebooks, we'll need some way to load in plotly.js to view our visualisations. There are many different solutions to this problem, such as the @jupyterlab/plotly-extension or @jupyterlab_requirejs Jupyter Lab extensions that were previously used in this book. However, these are no longer supported, so we'll instead load plotly.js from a CDN ourselves, and only when it hasn't already been loaded.

To achieve this, we'll need to wrap our Plotly-generated JavaScript in a function:

function show_plot() {
    // Plotly code here
}          

We'll then need to decide whether to load the plotly.js JavaScript library. We'll do this by loading it if the Plotly object doesn't exist, followed by a call to our function above that will show our plot:

if (typeof Plotly === "undefined") {
    var script = document.createElement("script");
    script.type = "text/javascript";
    script.src = "https://cdn.plot.ly/plotly-1.58.1.min.js";
    script.onload = function () {
        show_plot();
    };
    document.body.appendChild(script);
} else {
    show_plot();
}
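Since we'll be injecting this JavaScript from Rust, it can help to build it with a function rather than a hand-escaped string literal. The following is a minimal sketch: loader_script is a hypothetical name, and both arguments are assumed to be trusted strings that need no escaping.

```rust
/// Builds the JavaScript that loads plotly.js from `cdn_url` only when the
/// global `Plotly` object is missing, and then calls `function_name`.
/// `loader_script` is a hypothetical helper; its arguments are assumed trusted.
fn loader_script(function_name: &str, cdn_url: &str) -> String {
    [
        "if (typeof Plotly === \"undefined\") {".to_string(),
        "    var script = document.createElement(\"script\");".to_string(),
        "    script.type = \"text/javascript\";".to_string(),
        format!("    script.src = \"{}\";", cdn_url),
        // Show the plot once the library has finished loading.
        format!("    script.onload = function () {{ {}(); }};", function_name),
        "    document.body.appendChild(script);".to_string(),
        "} else {".to_string(),
        // The library is already present, so show the plot straight away.
        format!("    {}();", function_name),
        "}".to_string(),
    ]
    .join("\n")
}
```

Calling loader_script("show_plot", "https://cdn.plot.ly/plotly-1.58.1.min.js") yields the same logic as the JavaScript above, with the function name and URL kept in one place.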

We know our Plotly scripts will always begin with:

window.PLOTLYENV= 

and end in:

};\n\n\n    </script>

So let's take advantage of this and use .replace() to inject our wrapper.

&plotly_contents[start_bytes..end_bytes]
    .replace("plotly-html-element", Box::leak(nanoid!().into_boxed_str()))
    .replace("window.PLOTLYENV=",
             "function show_plot() { window.PLOTLYENV=")
    .replace("};\n\n\n    </script>",
             "};\n\n\n}; if (typeof Plotly === \"undefined\"){
        var script = document.createElement(\"script\");
        script.type = \"text/javascript\";
        script.src = \"https://cdn.plot.ly/plotly-1.58.1.min.js\";
        script.onload = function () { show_plot() };
        document.body.appendChild(script);} else { show_plot() }</script>")
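One thing to keep in mind is that .replace() does nothing when its pattern isn't found, so a change in the generated HTML would leave us with an unwrapped script and no error. A hedged sketch of the same injection that checks for both markers first is shown here. The name wrap_in_show_plot is hypothetical, and the loader JavaScript is passed in as a string so the function stays self-contained.

```rust
/// Text that opens the generated Plotly script.
/// Assumption: this matches the output of the Plotly for Rust version in use.
const SCRIPT_START: &str = "window.PLOTLYENV=";

/// Text that closes the generated Plotly script.
const SCRIPT_END: &str = "};\n\n\n    </script>";

/// Wraps the generated script in `function show_plot() { ... }` and appends
/// `loader` before the closing tag. Returns `None` if either marker is missing.
/// `wrap_in_show_plot` is a hypothetical helper.
fn wrap_in_show_plot(fragment: &str, loader: &str) -> Option<String> {
    if !fragment.contains(SCRIPT_START) || !fragment.contains(SCRIPT_END) {
        return None;
    }
    let wrapped = fragment
        .replace(SCRIPT_START, "function show_plot() { window.PLOTLYENV=")
        .replace(SCRIPT_END, &format!("}};\n\n\n}}; {}</script>", loader));
    Some(wrapped)
}
```

A None result here is a signal that the markers need updating for the library version in use.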

Putting Everything Together

Let's put everything together and demonstrate our ability to output multiple plots.

The following will be the first plot.

let plot_id = Box::leak(nanoid!().into_boxed_str());
let plot_handle = format!("{}{}", "show_plot_", plot_id).replace("-","_");

println!("EVCXR_BEGIN_CONTENT text/html\n{}\nEVCXR_END_CONTENT",
    format!("<div>{}</div>",
        &plotly_contents[start_bytes..end_bytes]
            .replace("plotly-html-element", plot_id)
            .replace("height:100%; width:100%;", "")
            .replace(",height:0,width:0", "")
            .replace("height:0,width:0", "")
            .replace("window.PLOTLYENV=",
                     "function show_plot() { window.PLOTLYENV=")
            .replace("};\n\n\n    </script>",
                     "};\n\n\n}; if (typeof Plotly === \"undefined\"){
                var script = document.createElement(\"script\");
                script.type = \"text/javascript\";
                script.src = \"https://cdn.plot.ly/plotly-1.58.1.min.js\";
                script.onload = function () { show_plot() };
                document.body.appendChild(script);} else { show_plot() }</script>")
            .replace("show_plot", &plot_handle)));
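A note on plot_handle in the cell above: every plot needs its own function name, otherwise a later plot's show_plot would overwrite an earlier one. The .replace("-","_") is there because nanoid's default alphabet includes -, which is not allowed in a JavaScript identifier. The same derivation as a small sketch, under the hypothetical name plot_handle_for:

```rust
/// Builds a per-plot JavaScript function name from a plot ID.
/// IDs from nanoid's default alphabet can contain `-`, which is not allowed
/// in a JavaScript identifier, so it is mapped to `_`.
/// `plot_handle_for` is a hypothetical helper.
fn plot_handle_for(plot_id: &str) -> String {
    format!("show_plot_{}", plot_id.replace('-', "_"))
}
```

The show_plot_ prefix also guarantees that the name never starts with a digit, which an ID on its own could.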

The following will be the second plot.

let plot_id = Box::leak(nanoid!().into_boxed_str());
let plot_handle = format!("{}{}", "show_plot_", plot_id).replace("-","_");

println!("EVCXR_BEGIN_CONTENT text/html\n{}\nEVCXR_END_CONTENT",
    format!("<div>{}</div>",
        &plotly_contents[start_bytes..end_bytes]
            .replace("plotly-html-element", plot_id)
            .replace("height:100%; width:100%;", "")
            .replace(",height:0,width:0", "")
            .replace("height:0,width:0", "")
            .replace("window.PLOTLYENV=",
                     "function show_plot() { window.PLOTLYENV=")
            .replace("};\n\n\n    </script>",
                     "};\n\n\n}; if (typeof Plotly === \"undefined\"){
                var script = document.createElement(\"script\");
                script.type = \"text/javascript\";
                script.src = \"https://cdn.plot.ly/plotly-1.58.1.min.js\";
                script.onload = function () { show_plot() };
                document.body.appendChild(script);} else { show_plot() }</script>")
            .replace("show_plot", &plot_handle)));

We can now see two plots, with this notebook currently weighing in at a file size of only 16 KB. You may also notice that I have surrounded the HTML for our plot with a <div>. This was needed to ensure the full plot is visible in a notebook cell.
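The println! calls above rely on the evcxr kernel treating anything printed between EVCXR_BEGIN_CONTENT text/html and EVCXR_END_CONTENT as HTML output. A minimal sketch of that framing, including the surrounding <div>, could look like this. The name evcxr_html is hypothetical.

```rust
/// Formats `html` as the text block that the evcxr Jupyter kernel renders as
/// HTML output when it is printed to stdout, wrapped in a <div>.
/// `evcxr_html` is a hypothetical helper.
fn evcxr_html(html: &str) -> String {
    format!(
        "EVCXR_BEGIN_CONTENT text/html\n<div>{}</div>\nEVCXR_END_CONTENT",
        html
    )
}
```

In a cell, the result would be printed with println!("{}", evcxr_html(&fragment)), where fragment is the processed plot HTML.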

Finally, let's clean up by deleting our temporary HTML file.

fs::remove_file(plotly_file)?;

Conclusion

In this section, we've improved our workaround for data visualisation with Plotly for Rust in Jupyter notebooks. We achieved this by stripping out excess JavaScript to reduce the file size and generating random IDs to allow multiple plots. In the next section, we'll implement all of this into a single function so that we can visualise our data easily in the upcoming sections.

ISBN

978-1-915907-10-3

Cite

Rostami, S. (2020). Data Analysis with Rust Notebooks. Polyra Publishing.