Saving Your Network: The Persistence API

Andrew Dillon edited this page May 8, 2016 · 11 revisions

The PersistenceAPI Explained

Introduction

In this tutorial you will learn how to:

  • Save and load your HTM.java Networks
  • Write and read other HTM.java objects separately
  • Convert (serialize) them to an intermediate data structure (byte array)
  • CheckPoint them (quick snapshot while running)
  • Write / read them to and from an IOStream (using the lower-level API: SerializerCore.java)

The PersistenceAPI describes a set of methods you can use to read and write your HTM.java Networks to and from a file or IOStream. This code is specialized to handle the saving and loading of HTM.java Network API objects (Networks, Regions, Layers, Connections, Parameters, etc.). Almost all HTM.java objects are Persistable, a subtype of Serializable that adds pre- and post-serialization operations for easy setup.
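To make the idea of pre and post operations concrete, here is a minimal sketch that uses only JDK serialization. The HookedSerializable interface, its beforeWrite() and afterRead() methods, and the Counter class are hypothetical stand-ins invented for this illustration; they are not the HTM.java Persistable API, whose method names and behavior may differ. The sketch also shows the round trip through a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a Persistable-style interface (not the HTM.java type). */
interface HookedSerializable extends Serializable {
    /** Called just before the object is written. */
    default void beforeWrite() {}
    /** Called just after the object is read back. */
    default void afterRead() {}
}

/** Toy object with a transient field that has to be rebuilt after loading. */
class Counter implements HookedSerializable {
    private static final long serialVersionUID = 1L;
    int count;
    transient String label; // skipped by serialization, rebuilt in afterRead()

    Counter(int count) {
        this.count = count;
        this.label = "count=" + count;
    }

    @Override
    public void afterRead() {
        label = "count=" + count;
    }
}

class HookDemo {
    /** Runs the pre hook, then serializes the object to a byte array. */
    static byte[] toBytes(HookedSerializable obj) {
        obj.beforeWrite();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Deserializes the object from a byte array, then runs the post hook. */
    static HookedSerializable fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            HookedSerializable obj = (HookedSerializable) in.readObject();
            obj.afterRead();
            return obj;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter restored = (Counter) fromBytes(toBytes(new Counter(3)));
        System.out.println(restored.label); // prints "count=3"
    }
}
```

Without the post hook, the transient label field would be null after loading; the hook is what restores it.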

Note: If you are going to use the PersistenceAPI on a Network with an ObservableSensor, you must use a PublisherSupplier when setting up your sensor (more info here).


Usage

Store and Load

The PersistenceAPI is an object you get back from a static factory method on Persistence like this:

PersistenceAPI api = Persistence.get();

or

SerialConfig config = new SerialConfig(); // See constructors & class for defaults to change. More on this below
PersistenceAPI api = Persistence.get(config);

After obtaining a reference to the API, you can use it in the ways listed above. First, there are two convenience methods for saving and loading a Network specifically:

Network network = ...
api.store(network);

//...later

Network loaded = api.load(); // This is the simplest form. You can also specify a file or path using:
...
Network loadedFromFile = api.load("myFileName"); // Could also be "/path/to/myFileName"

Note: Keep in mind that calling "store()" will shut down and halt your Network, so it is meant for when you intend to stop working. You can always call "restart()" on the Network afterwards if needed.


Checkpoints

Your Network can be saved as often as you like. To CheckPoint your Network, call checkPointer() to receive a CheckPointOp (check point operator), an object you can subscribe to (an rx.Observable) as many times as you like. Each subscription triggers a check point operation and notifies you, via the Observer you subscribe with, when the operation is complete. The notification carries a byte array containing the serialized Network, which you can use or discard as you see fit (the Network is also saved to disk).

Network network = ...
PersistenceAPI api = Persistence.get();
...
CheckPointOp operator = api.checkPointer(network);
operator.checkPoint(
    new Observer<byte[]>() {                                                 // Create an Observer in-line
        @Override public void onCompleted() {}
        @Override public void onError(Throwable e) { e.printStackTrace(); }
        @Override public void onNext(byte[] bytes) {
            assertTrue(bytes != null && bytes.length > 10);                  // Do "whatever" with the byte[]
        }
    });

Long-running Networks can produce potentially large files, but this can be managed. Remember the SerialConfig object mentioned earlier? With it you can choose to keep only one check point file at a time (as opposed to keeping all of them, which is the default).

SerialConfig config = new SerialConfig(); // Construct using the defaults
config.setOneCheckPointOnly(true);

PersistenceAPI api = Persistence.get(config);
api.checkPointer(network).checkPoint(
    new Observer<byte[]>() {                                                 // Create an Observer in-line
        @Override public void onCompleted() {}
        @Override public void onError(Throwable e) { e.printStackTrace(); }
        @Override public void onNext(byte[] bytes) {
            assertTrue(bytes != null && bytes.length > 10);                  // Do "whatever" with the byte[]
        }
    });
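The effect of keeping a single check point file can be sketched with plain java.nio.file calls. The SingleCheckPointWriter class below is a hypothetical helper written for this illustration, not HTM.java code, and the order it uses (write the new file first, then delete the older one) is an assumption about a reasonable policy rather than a description of what the library does internally.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper illustrating a "one check point only" policy. */
class SingleCheckPointWriter {
    private final Path directory;
    private Path previous; // most recent check point written by this instance

    SingleCheckPointWriter(Path directory) {
        this.directory = directory;
    }

    /** Writes the new check point, then removes the one written before it. */
    Path write(String fileName, byte[] bytes) {
        try {
            Files.createDirectories(directory);
            Path current = directory.resolve(fileName);
            Files.write(current, bytes);
            if (previous != null && !previous.equals(current)) {
                Files.deleteIfExists(previous);
            }
            previous = current;
            return current;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("checkpoints");
        SingleCheckPointWriter writer = new SingleCheckPointWriter(dir);
        Path first = writer.write("Network_Checkpoint_1", new byte[] {1, 2, 3});
        Path second = writer.write("Network_Checkpoint_2", new byte[] {4, 5, 6});
        // Only the newest file remains on disk.
        System.out.println(Files.exists(first) + " " + Files.exists(second)); // prints "false true"
    }
}
```

Writing before deleting means a failed write never leaves you without any check point on disk.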

Custom Configuration

Each PersistenceAPI object is given a default SerialConfig, which you can "override" by adding your own. The SerialConfig allows you to specify:

  • File name of the Network save file (defaults to Network.ser)
  • The directory to which the Network file is saved (defaults to /yourhomedir/HTMNetwork)
  • File name of the check point file: Consists of 2 parts, the file name and the date.
  • The date format of the date portion of the check point file name.
  • setOneCheckPointOnly flag: When set, the previous check point file is removed automatically
  • StandardOpenOptions for the regular Network save. Controls file handling... (default is fine, only use this if custom behavior desired)

The SerialConfig holds separate settings for regular Network saving and for CheckPointing. The Network settings are supplied through the SerialConfig constructor, whose arguments apply only to regular Network saving and not to the check point file name or anything else. The CheckPoint settings are read and modified via getters and setters on the SerialConfig object.

An example of the default format of the CheckPoint file is: Network_Checkpoint_2016-04-12_14-38-00.232

...and uses the following default format specifier string: "YYYY-MM-dd_HH-mm-ss.SSS"

...which you can change to anything you like. The date portion is separated from the name portion by an underscore and consists of year-month-day_hours-minutes-seconds.milliseconds.

These settings can be changed by calling the two methods:

config.setCheckPointFileName("Network_Checkpoint_");
config.setCheckPointTimeFormatString("YYYY-MM-dd_HH-mm-ss.SSS");
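As a rough illustration of how a prefix and a date format combine into a check point file name, the sketch below reproduces the example name shown earlier using java.time. The CheckPointNames class is hypothetical, and using java.time is an assumption made only so the example runs without extra libraries; this page does not say which date library HTM.java uses to interpret the format string. Note that the sketch uses lowercase "yyyy", because in java.time an uppercase "Y" means week-based year rather than calendar year.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper showing how a check point file name could be assembled. */
class CheckPointNames {
    // java.time equivalent of the date pattern shown above (assumption: lowercase 'yyyy').
    static final String JAVA_TIME_PATTERN = "yyyy-MM-dd_HH-mm-ss.SSS";

    /** Appends the formatted timestamp to the file name prefix. */
    static String fileName(String prefix, LocalDateTime timestamp) {
        return prefix + DateTimeFormatter.ofPattern(JAVA_TIME_PATTERN).format(timestamp);
    }

    public static void main(String[] args) {
        // 2016-04-12 14:38:00 plus 232 milliseconds
        LocalDateTime t = LocalDateTime.of(2016, 4, 12, 14, 38, 0, 232_000_000);
        System.out.println(fileName("Network_Checkpoint_", t));
        // prints "Network_Checkpoint_2016-04-12_14-38-00.232"
    }
}
```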

Generic PersistenceAPI Methods

Reading / Writing other HTM.java Persistables

Should you need to move other objects around, for example sending Inference objects through a distributed framework such as Flink, Akka, or Spark, the PersistenceAPI offers generic methods to read and write HTM.java objects. These are:

  • read() - Reifies a Persistable from the file and location configured by the SerialConfig passed in at construction time.
  • read(String fileName) - Reifies a Persistable from the specified file in the location configured by the SerialConfig passed in at construction time.
  • read(byte[] serializedBytes) - Loads a Persistable from the specified serialized byte array and returns the de-serialized Persistable.
  • write(T instance) - Persists the Persistable subclass to the file system using the pre-configured SerialConfig specified at construction time.
  • write(T instance, String fileName) - Persists the Persistable subclass to the file system using the pre-configured SerialConfig specified at construction time, and using the fileName passed in.

Usage of these methods is the same as the Network store() and load() methods above.

Writing and Reading to/from IOStreams

The SerializerCore object can be obtained from the PersistenceAPI. From it you can get HTMObjectOutputs and HTMObjectInputs, which write and read HTM.java Persistables to and from a stream you supply. For example:

// Write to your own stream

Inference inf = ...
OutputStream output = new FileOutputStream(new File("myfile"));
HTMObjectOutput writer = Persistence.get().serializer().getObjectOutput(output);
try {
    writer.writeObject(inf, inf.getClass());
    writer.flush();
} catch(IOException e) {
    throw e;
} finally {
    writer.close();
}

// Read from your own stream

InputStream input = new FileInputStream(new File("myfile"));
HTMObjectInput reader = Persistence.get().serializer().getObjectInput(input);
try {
    Class<?> aClass = ...  // Persistable subclass
    T t = (T) reader.readObject(aClass); // Where T is the Persistable subclass type (HTM.java object).
    return t;
} catch(Exception e) {
    throw e;
} finally {
    reader.close();
}