## Data Analysis with Rust Notebooks

A practical book on Data Analysis with Rust Notebooks that teaches you the concepts and how they’re implemented in practice.

# Multidimensional Arrays and Operations with NDArray

## Preamble

In [2]:
:dep ndarray = {version = "0.13.1"}
extern crate ndarray;

The ndarray::prelude module contains the most used types, type aliases, traits, and functions, which you can import as a group:

In [3]:
use ndarray::prelude::*;

This gives us access to the following: ArrayBase, Array, RcArray, ArrayView, ArrayViewMut, Axis, Dim, Dimension, Array0, Array1, Array2, Array3, Array4, Array5, Array6, ArrayD, ArrayView0, ArrayView1, ArrayView2, ArrayView3, ArrayView4, ArrayView5, ArrayView6, ArrayViewD, ArrayViewMut0, ArrayViewMut1, ArrayViewMut2, ArrayViewMut3, ArrayViewMut4, ArrayViewMut5, ArrayViewMut6, ArrayViewMutD, Ix0, Ix1, Ix2, Ix3, Ix4, Ix5, Ix6, IxDyn, arr0, arr1, arr2, aview0, aview1, aview2, aview_mut1, ShapeBuilder, NdFloat, and AsArray. Dim and the index types Ix0 to IxDyn are each exported twice, once as a type and once as a constructor function of the same name.

## Introduction

The ndarray crate provides us with a multidimensional container that can hold general or numerical elements. If you're familiar with Python, you can consider it to be similar to the numpy package. With ndarray we get $n$-dimensional arrays, slicing, views, mathematical operations, and more. We'll need these in later sections to load our datasets into containers that we can operate on and analyse.

## Creating Arrays

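### From a Vector

Let's take a look at how we can create a two-dimensional ndarray Array with the arr2() function. Note that arr2() takes a slice of fixed-size rows (here, a reference to a nested array literal) rather than a Vec of Vecs.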

In [4]:
arr2(&[[1.,2.,3.],
[4.,5.,6.]])
Out[4]:
[[1.0, 2.0, 3.0],
[4.0, 5.0, 6.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

It's as easy as that. This has given us a 2 by 3 array with our desired floating-point values. We can also use the array! macro as a shorthand for creating an array.

In [5]:
array![[1.,2.,3.],
[4.,5.,6.]]
Out[5]:
[[1.0, 2.0, 3.0],
[4.0, 5.0, 6.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2
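
The output above reports shape=[2, 3] and strides=[3, 1]. As a rough plain-Rust illustration of what those numbers mean (a sketch, not ndarray's actual implementation), a row-major (C) layout stores the rows back to back, and each stride says how many elements to skip per step along that axis. The `offset` helper is a hypothetical name used only for this example.

```rust
/// Flat offset of element [row, col] for the given strides.
/// Hypothetical helper for illustration; it is not part of ndarray.
fn offset(index: [usize; 2], strides: [usize; 2]) -> usize {
    index[0] * strides[0] + index[1] * strides[1]
}

fn main() {
    // The 2 x 3 array from above, stored row by row (C layout).
    let flat = [1.0_f64, 2.0, 3.0, 4.0, 5.0, 6.0];
    let strides = [3, 1];

    // Element [1, 2] sits at 1 * 3 + 2 * 1 = 5 in the flat buffer.
    let i = offset([1, 2], strides);
    println!("offset = {}, value = {}", i, flat[i]);
}
```

With strides of [3, 1], moving one step along the first axis skips a whole row of three elements, while moving one step along the second axis moves to the neighbouring element.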

### Filled with Zeros

We can also construct an array filled with zeros by calling the zeros() function with our desired shape.

In [6]:
Array2::<f64>::zeros((4,4))
Out[6]:
[[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0],
[0.0, 0.0, 0.0, 0.0]], shape=[4, 4], strides=[4, 1], layout=C (0x1), const ndim=2

### Filled with Ones

Similarly, we can construct an array filled with ones by calling the ones() function with our desired shape.

In [7]:
Array2::<f64>::ones((4,4))
Out[7]:
[[1.0, 1.0, 1.0, 1.0],
[1.0, 1.0, 1.0, 1.0],
[1.0, 1.0, 1.0, 1.0],
[1.0, 1.0, 1.0, 1.0]], shape=[4, 4], strides=[4, 1], layout=C (0x1), const ndim=2

Let's create variables to store a 1D array and a 2D array for use in the following subsections.

In [8]:
let data_1D: Array1<f32> = array![1.,2.,3.];

let data_2D: Array2<f32> = array![[1.,2.,3.],
[4.,5.,6.]];

## Dimensions

It's often the case that we need to find out the dimensionality of our arrays. There are many ways to do this, and the following contains some of the common approaches.

### From Length

We can use Array.len() to return the total number of elements in an array.

In [9]:
data_1D.len()
Out[9]:
3

This is simple enough for a one-dimensional array, where the total is the length of its only axis. For higher dimensions, however, len() returns the flattened length, i.e. the total number of elements across all axes.

In [10]:
data_2D.len()
Out[10]:
6

If we want to get the length along one of the axes instead, e.g. the second one, we can use Array.len_of(Axis(n)).

In [11]:
data_2D.len_of(Axis(1))
Out[11]:
3

### From Shape

Another approach is to use Array.shape(), which returns the length of every axis at once.

In [12]:
data_2D.shape()
Out[12]:
[2, 3]

We can see it has returned a slice that holds the length along each of our axes. This can be indexed to get the length along a specific axis.

In [13]:
data_2D.shape()[1]
Out[13]:
3
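
To tie the two approaches together: the flattened length reported by len() is the product of the axis lengths reported by shape(). The following plain-Rust sketch shows that relationship on a bare shape slice; `total_len` is a hypothetical helper named for this example, not an ndarray function.

```rust
/// Total number of elements for a given shape.
/// Hypothetical helper for illustration; it is not part of ndarray.
fn total_len(shape: &[usize]) -> usize {
    shape.iter().product()
}

fn main() {
    // The shape of data_2D from the cells above.
    let shape: [usize; 2] = [2, 3];

    println!("flattened length: {}", total_len(&shape));
    println!("length of axis 1: {}", shape[1]);
}
```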

## Indexing

Like most data structures, the indexing starts at $0$. To access the first element in our one-dimensional arrays we can do the following.

In [14]:
data_1D[0]
Out[14]:
1.0

For higher dimensions, we index with a primitive array that holds one index per axis.

In [15]:
data_2D[[0,0]]
Out[15]:
1.0

Likewise, to access the second element in our one-dimensional arrays we need to index with $1$.

In [16]:
data_1D[1]
Out[16]:
2.0

Again, for our higher dimensions, we use a primitive array.

In [17]:
data_2D[[0,1]]
Out[17]:
2.0

To select the last element in our one-dimensional arrays we can index with Array.len() - 1.

In [18]:
data_1D[data_1D.len() - 1]
Out[18]:
3.0

But for our multidimensional arrays we need to use a primitive array and Array.len_of(Axis(n)) - 1 for the axis in question.

In [19]:
data_2D[[0, data_2D.len_of(Axis(1)) - 1]]
Out[19]:
3.0

Alternatively, we could use Array.shape()[n].

In [20]:
data_2D[[0, data_2D.shape()[1] - 1]]
Out[20]:
3.0

## Mathematics

Let's look at some common mathematical operations that can operate on our arrays.

### Summing Array Elements

All elements in an array can be summed with sum().

In [21]:
data_2D.sum()
Out[21]:
21.0

We may instead wish to sum all elements along a specific axis in an array, e.g. the first axis.

In [22]:
data_2D.sum_axis(Axis(0))
Out[22]:
[5.0, 7.0, 9.0], shape=[3], strides=[1], layout=CF (0x3), const ndim=1

Or the second axis:

In [23]:
data_2D.sum_axis(Axis(1))
Out[23]:
[6.0, 15.0], shape=[2], strides=[1], layout=CF (0x3), const ndim=1
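
It can help to see which direction each axis collapses. The plain-Rust sketch below reproduces the two results above on a nested array: summing along axis 0 collapses the rows and leaves one total per column, while summing along axis 1 collapses the columns and leaves one total per row. The function names are hypothetical and the fixed 2 x 3 size is an assumption made to keep the example short; this is not how ndarray implements sum_axis().

```rust
/// Sum along axis 0: collapse the rows, giving one total per column.
fn sum_axis0(a: &[[f32; 3]; 2]) -> [f32; 3] {
    let mut out = [0.0_f32; 3];
    for row in a.iter() {
        for (j, v) in row.iter().enumerate() {
            out[j] += *v;
        }
    }
    out
}

/// Sum along axis 1: collapse the columns, giving one total per row.
fn sum_axis1(a: &[[f32; 3]; 2]) -> [f32; 2] {
    let mut out = [0.0_f32; 2];
    for (i, row) in a.iter().enumerate() {
        out[i] = row.iter().sum();
    }
    out
}

fn main() {
    let data = [[1.0_f32, 2.0, 3.0], [4.0, 5.0, 6.0]];
    println!("{:?}", sum_axis0(&data));
    println!("{:?}", sum_axis1(&data));
}
```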

### Element-wise Operations

It's quite common to apply mathematical operations to each element of an array. Let's have a look at some examples.

#### Addition

We can add values, e.g. $1.0$, to every element.

In [24]:
&data_2D + 1.0
Out[24]:
[[2.0, 3.0, 4.0],
[5.0, 6.0, 7.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

We can also add the elements of one array to another.

In [25]:
&data_2D + &data_2D
Out[25]:
[[2.0, 4.0, 6.0],
[8.0, 10.0, 12.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

Finally, we can add a one-dimensional array to a two-dimensional array.

In [26]:
&data_2D + &data_1D
Out[26]:
[[2.0, 4.0, 6.0],
[5.0, 7.0, 9.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

Warning

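When adding two arrays together they don't need to have the same shape, but their shapes must be compatible: the right-hand array must be broadcastable to the shape of the left-hand array. Comparing the shapes from the last axis backwards, each pair of axis lengths must either be equal or the length in the array being broadcast must be 1. Here data_1D, with shape [3], is repeated across each row of data_2D, with shape [2, 3].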
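
To make shape compatibility concrete, here is a small plain-Rust sketch of a NumPy-style broadcast check. It assumes a one-way broadcast in which the right-hand shape is stretched to the left-hand shape, comparing axis lengths from the last axis backwards. `can_broadcast` is a hypothetical helper written for this example and is not ndarray's own code.

```rust
/// Returns true if an array of shape `from` can be broadcast to shape `to`.
/// Hypothetical helper sketching the NumPy-style rule; not part of ndarray.
fn can_broadcast(from: &[usize], to: &[usize]) -> bool {
    // The array being broadcast cannot have more axes than the target.
    if from.len() > to.len() {
        return false;
    }
    // Compare axis lengths from the last axis backwards: each pair must
    // be equal, or the length being broadcast must be 1.
    from.iter()
        .rev()
        .zip(to.iter().rev())
        .all(|(&f, &t)| f == t || f == 1)
}

fn main() {
    // Shape [3] fits onto [2, 3]: the row is repeated for each of the 2 rows.
    println!("{}", can_broadcast(&[3], &[2, 3]));
    // Shape [2] does not fit: the trailing axis lengths 2 and 3 disagree.
    println!("{}", can_broadcast(&[2], &[2, 3]));
}
```

Note that the second case fails even though both shapes contain an axis of length 2, because the comparison starts from the trailing axis.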

#### Subtraction

We can subtract values, e.g. $1.0$, from every element.

In [27]:
&data_2D - 1.0
Out[27]:
[[0.0, 1.0, 2.0],
[3.0, 4.0, 5.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

We can also subtract elements of one array from another.

In [28]:
&data_2D - &data_2D
Out[28]:
[[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

Finally, we can subtract a one-dimensional array from a two-dimensional array.

In [29]:
&data_2D - &data_1D
Out[29]:
[[0.0, 0.0, 0.0],
[3.0, 3.0, 3.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

#### Multiplication

We can multiply every element by a value, e.g. by $2.0$.

In [30]:
&data_2D * 2.0
Out[30]:
[[2.0, 4.0, 6.0],
[8.0, 10.0, 12.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

We can also multiply every element of one array by another.

In [31]:
&data_2D * &data_1D
Out[31]:
[[1.0, 4.0, 9.0],
[4.0, 10.0, 18.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

#### Division

We can divide every element by a value, e.g. by $2.0$.

In [32]:
&data_2D / 2.0
Out[32]:
[[0.5, 1.0, 1.5],
[2.0, 2.5, 3.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

We can also divide every element of one array by another.

In [33]:
&data_2D / &data_1D
Out[33]:
[[1.0, 1.0, 1.0],
[4.0, 2.5, 2.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

#### Power

We can raise the elements in an array to a power, e.g. $3$, by mapping powi(), which takes an integer exponent, over every element.

In [34]:
data_2D.mapv(|x| x.powi(3))
Out[34]:
[[1.0, 8.0, 27.0],
[64.0, 125.0, 216.0]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2

#### Square root

We can calculate the square root of elements in an array. The function we pass must match the element type of the array, so we use f32::sqrt for our f32 array.

In [35]:
data_2D.mapv(f32::sqrt)
Out[35]:
[[1.0, 1.4142135, 1.7320508],
[2.0, 2.236068, 2.4494898]], shape=[2, 3], strides=[3, 1], layout=C (0x1), const ndim=2
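
Both the power and square root examples rely on the same idea: apply a function to each element and collect the results into a new array, leaving the original untouched. The plain-Rust sketch below mimics that idea on a fixed 2 x 3 nested array. `map_elements` is a hypothetical stand-in written for this example, and the fixed size is an assumption to keep it short; it is not how mapv() is implemented.

```rust
/// Apply `f` to every element of a 2 x 3 array and return a new array,
/// leaving the input untouched. Hypothetical stand-in for mapv().
fn map_elements(a: &[[f32; 3]; 2], f: impl Fn(f32) -> f32) -> [[f32; 3]; 2] {
    let mut out = *a; // nested arrays of f32 are Copy, so this duplicates the data
    for row in out.iter_mut() {
        for v in row.iter_mut() {
            *v = f(*v);
        }
    }
    out
}

fn main() {
    let data = [[1.0_f32, 2.0, 3.0], [4.0, 5.0, 6.0]];

    // powi() takes an integer exponent; powf() would take a floating-point one.
    let cubed = map_elements(&data, |x| x.powi(3));
    // A function path such as f32::sqrt works as long as it matches the element type.
    let roots = map_elements(&data, f32::sqrt);

    println!("{:?}", cubed);
    println!("{:?}", roots);
}
```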

## Conclusion

In this section, we've introduced ndarray as a crate that gives us multidimensional containers and operations. We demonstrated how to create arrays, find out their dimensionality, index them, and invoke some basic mathematical operations.

