Example: Using the Standard ESN#
This tutorial shows how to use the ESN class, covering:
building and training the network
making sample predictions and comparing the predictions to test data
storing the network weights and re-reading the network to an ESN
[1]:
import numpy as np
import matplotlib.pyplot as plt
[2]:
plt.style.use("./xesn.mplstyle")
Create training and testing data#
Here we generate a time series using the 6 Dimensional version of the Lorenz 96 model [Lorenz, 1996]. First we spinup the model dynamics before making the training data, then discard a brief transient period, and finally create the test dataset.
This uses a local module here in the documentation directory.
[3]:
from lorenz import Lorenz96
[4]:
model = Lorenz96(N=6)
[5]:
n_spinup = 2_000
n_train = 40_000
n_transient = 1_000
n_test = 1_000
n_total = n_spinup+n_train+n_transient+n_test
rs = np.random.RandomState(0)
trajectory = model.generate(n_steps=n_total, x0=rs.normal(size=(model.N,)))
[6]:
trainer = trajectory.isel(time=slice(n_spinup,n_spinup+n_train+1))
tester = trajectory.isel(time=slice(-n_test-1, None))
Normalization#
Here we normalize the data very simply, with the training mean and standard deviation. Note that we do not normalize each spatial dimension separately, following [Platt et al., 2022] and [Smith et al., 2023].
[7]:
bias = trainer.mean()
scale = trainer.std()
[8]:
trainer = (trainer - bias)/scale
tester = (tester - bias)/scale
Create the ESN#
Here we create a network that fully emulates the 6D Lorenz96 system, so it has input and output dimension equal to 6. The other parameters have been obtained from previous experiments, and are used just as an example.
[9]:
from xesn import ESN
[10]:
esn = ESN(
n_input=model.N,
n_output=model.N,
n_reservoir=500,
leak_rate=0.87,
tikhonov_parameter=6.9e-7,
input_kwargs={
"factor": 0.86,
"normalization": "svd",
"distribution": "uniform",
"random_seed": 0,
},
adjacency_kwargs={
"factor":0.71,
"normalization": "eig",
"distribution": "uniform",
"is_sparse": True,
"connectedness": 5,
"random_seed": 1,
},
bias_kwargs={
"factor": 1.8,
"distribution": "uniform",
"random_seed": 2,
},
)
And note that we can always get a printed view of the values set in the ESN
[11]:
esn
[11]:
ESN
n_input: 6
n_output: 6
n_reservoir: 500
---
leak_rate: 0.87
tikhonov_parameter: 6.9e-07
---
Input Matrix:
factor 0.86
normalization svd
distribution uniform
random_seed 0
---
Adjacency Matrix:
factor 0.71
normalization eig
distribution uniform
is_sparse True
connectedness 5
random_seed 1
---
Bias Vector:
factor 1.8
distribution uniform
random_seed 2
Note: It is usually a good idea to use the random_seed to control the randomly generated matrices, so that results are consistent. Once reasonable performance is obtained, the seed can be varied to determine how dependent the results are on the exact matrices and vector chosen.
Build the network and train the readout matrix#
The .build() method creates the two random matrices and the random bias vector. Once these are generated, we can run .train() to learn the readout matrix weights.
[12]:
esn.build()
[13]:
%%time
esn.train(trainer, batch_size=5_000)
CPU times: user 15.8 s, sys: 862 ms, total: 16.7 s
Wall time: 1.8 s
Note: the batch_size parameter should be chosen so that batch_size x n_reservoir can fit comfortably in memory. Here we choose a somewhat conservative value.
Test the network#
Here we use .test() to make a sample prediction, and package that prediction together with the original test data. In order to only make a sample prediction, and not return the test data as well, see .predict().
[14]:
xds = esn.test(tester, n_steps=500, n_spinup=500)
xds
[14]:
<xarray.Dataset>
Dimensions: (x: 6, ftime: 501)
Coordinates:
* x (x) int64 0 1 2 3 4 5
* ftime (ftime) float64 0.0 0.01 0.02 0.03 0.04 ... 4.97 4.98 4.99 5.0
time (ftime) float64 435.0 435.0 435.0 435.0 ... 440.0 440.0 440.0
Data variables:
prediction (x, ftime) float64 -0.09683 -0.08236 -0.06422 ... -1.203 -1.164
truth (x, ftime) float64 -0.09683 -0.08235 -0.06419 ... -1.321 -1.287
Attributes:
n_input: 6
n_output: 6
n_reservoir: 500
leak_rate: 0.87
tikhonov_parameter: 6.9e-07
input_kwargs: {'factor': 0.86, 'normalization': 'svd', 'distributi...
adjacency_kwargs: {'factor': 0.71, 'normalization': 'eig', 'distributi...
bias_kwargs: {'factor': 1.8, 'distribution': 'uniform', 'random_s...
esn_type: ESN
description: Contains a test prediction and matching truth trajec...Perform the data normalization inverse
[15]:
for key in xds.data_vars:
xds[key] = xds[key]*scale + bias
Plot the results
[16]:
nrows = len(xds.x)
fig, axs = plt.subplots(nrows, 1, figsize=(8, nrows*1), constrained_layout=True, sharex=True, sharey=True)
for i, ax in enumerate(axs):
xds["truth"].isel(x=i).plot(ax=ax, color="k", label="Truth")
xds["prediction"].isel(x=i).plot(ax=ax, label="Prediction")
ax.set(ylabel="", xlabel="", title="")
ax.set(xlabel="Forecast Time (MTU)")
axs[0].legend(loc=(.25,.85),ncol=2)
[16]:
<matplotlib.legend.Legend at 0x182c6e490>
Store the Trained Readout Weights#
First, we store the weights for future testing by creating an xarray.Dataset from the ESN with .to_xds().
[17]:
eds = esn.to_xds()
eds
[17]:
<xarray.Dataset>
Dimensions: (ir: 500, iy: 6)
Coordinates:
* ir (ir) int64 0 1 2 3 4 5 6 7 8 ... 492 493 494 495 496 497 498 499
* iy (iy) int64 0 1 2 3 4 5
Data variables:
Wout (iy, ir) float64 0.1931 0.3658 0.3793 ... -0.8165 0.04196 -0.1727
Attributes:
n_input: 6
n_output: 6
n_reservoir: 500
leak_rate: 0.87
tikhonov_parameter: 6.9e-07
input_kwargs: {'factor': 0.86, 'normalization': 'svd', 'distributi...
adjacency_kwargs: {'factor': 0.71, 'normalization': 'eig', 'distributi...
bias_kwargs: {'factor': 1.8, 'distribution': 'uniform', 'random_s...Note: the adjacency matrix, input matrix, and bias vector are not stored, but rather all the options needed to recreate them are. This means that true reproducibility is only possible by setting the random_seed options, as noted earlier.
[18]:
eds.to_zarr("esn-weights.zarr")
[18]:
<xarray.backends.zarr.ZarrStore at 0x182d21620>
Read in the Network and Show Reproducibility#
Here we use the from_zarr() function to read a zarr store with the ESN weights. Note that this creates the ESN and runs the .build method to create the network.
[19]:
from xesn import from_zarr
[20]:
esn2 = from_zarr("esn-weights.zarr")
[21]:
y2 = esn2.predict(tester, n_steps=500, n_spinup=500)
y2 = y2*scale + bias
[22]:
np.abs(xds["prediction"]-y2).max()
[22]:
<xarray.DataArray ()> array(6.49610588e-10)
Cleanup#
[23]:
from shutil import rmtree
[24]:
rmtree("esn-weights.zarr")
[ ]: