The LSTM Reber grammar example

The Reber grammar is a simple string generator that will be used to showcase the LSTM. In this case it will be the more complex embedded version of the Reber grammar, which has long-term dependencies. The above is the string generator. In the examples above, make_reber_set is dispatched to generate 3000 random unique strings and then…
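The post's F# generator is not shown in this excerpt, so here is a sketch of the same idea in Python. The transition table is the standard Reber grammar graph; the name make_reber_set follows the excerpt, while make_reber and make_embedded_reber are hypothetical helper names:

```python
import random

# Transition table for the basic Reber grammar:
# state -> list of (symbol, next_state) choices. State 5 is accepting.
REBER = {
    0: [("T", 1), ("P", 2)],
    1: [("S", 1), ("X", 3)],
    2: [("T", 2), ("V", 4)],
    3: [("X", 2), ("S", 5)],
    4: [("P", 3), ("V", 5)],
}

def make_reber(rng=random):
    """Generate one string from the basic Reber grammar."""
    state, out = 0, ["B"]
    while state != 5:
        sym, state = rng.choice(REBER[state])
        out.append(sym)
    out.append("E")
    return "".join(out)

def make_embedded_reber(rng=random):
    """Embedded version: B, then T or P, a full Reber string, the SAME
    T or P again, then E. Predicting that second symbol is the
    long-term dependency the LSTM has to learn."""
    wrapper = rng.choice("TP")
    return "B" + wrapper + make_reber(rng) + wrapper + "E"

def make_reber_set(n, rng=random):
    """Keep drawing until we have n unique embedded Reber strings."""
    strings = set()
    while len(strings) < n:
        strings.add(make_embedded_reber(rng))
    return strings
```

The inner loops (S on state 1, T on state 2) are what make the strings arbitrarily long, so the gap between the two wrapper symbols can grow without bound.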

The pretrained net

Using the WTA function one can pretrain the layers of a net and then fine-tune them together with the sigmoid layer added on top. This does not require any additions to the library apart from the BlockReverse() type. First we create all the layers individually, and then we create arrays of such layers grouped together,…
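The F# layer types themselves aren't shown in this excerpt, but the training schedule is the classic greedy layerwise one: train each autoencoder layer on the codes of the layer below, then stack the encoders for joint fine-tuning. A numpy sketch under those assumptions, with the per-unit lifetime sparsity of the WTA autoencoder as the sparsifier (all function names here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def wta(h, rate=0.1):
    """Lifetime sparsity: for each hidden unit (column), keep only its
    top activations across the minibatch and zero the rest."""
    k = max(1, int(rate * h.shape[0]))
    thresh = np.partition(h, -k, axis=0)[-k][None, :]
    return np.where(h >= thresh, h, 0.0)

def pretrain_layer(x, n_hidden, lr=0.01, steps=200):
    """Train one linear WTA autoencoder on x; return its encoder."""
    enc = rng.normal(0, 0.1, (x.shape[1], n_hidden))
    dec = rng.normal(0, 0.1, (n_hidden, x.shape[1]))
    for _ in range(steps):
        h = wta(x @ enc)
        err = h @ dec - x                          # reconstruction error
        dec -= lr * h.T @ err / len(x)
        # gradient flows only through the surviving (nonzero) activations
        enc -= lr * x.T @ ((err @ dec.T) * (h != 0)) / len(x)
    return enc

def pretrain_stack(x, sizes):
    """Greedy schedule: each layer trains on the codes of the one below.
    The returned encoders would then be fine-tuned jointly."""
    encoders = []
    for n_hidden in sizes:
        enc = pretrain_layer(x, n_hidden)
        encoders.append(enc)
        x = wta(x @ enc)                           # codes feed the next layer
    return encoders
```

After pretraining, the encoders are stacked with a sigmoid classification layer on top and the whole net is fine-tuned end to end, which is where something like BlockReverse() comes in on the F# side.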

The WTA autoencoder

A few months ago, I spent an enormous amount of time implementing the k-sparse autoencoder as practice for machine learning. In fact, for the new year, I wanted to make it the subject of the autoencoder tutorial, but the method in the paper suffers from some drawbacks. It was fun playing around with it, but to…
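The core operation of the k-sparse autoencoder is simple enough to state in a few lines: in the hidden layer, keep only the k largest activations for each sample and zero everything else, backpropagating only through the survivors. A numpy sketch of that activation (the function name is mine, not the post's):

```python
import numpy as np

def k_sparse(h, k):
    """k-sparse activation: keep the k largest activations in each row,
    zero out the rest. Ties at the threshold all survive."""
    if k >= h.shape[1]:
        return h.copy()
    # k-th largest value per row, used as the cutoff
    thresh = np.partition(h, -k, axis=1)[:, -k][:, None]
    return np.where(h >= thresh, h, 0.0)
```

One of the drawbacks alluded to above is well known: if k is set too aggressively, hidden units that lose early stop receiving gradient and die, which is exactly the failure mode the WTA autoencoder's lifetime sparsity was designed to avoid.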

The Mnist Feedforward net example

A few posts ago I showed a feedforward pass for a neural net on the XOR example, but no reverse pass. This is pretty much that same neural net, except with a reverse pass, using the Spiral library. These are the contents of load_mnist_fsx: The code just loads the Mnist dataset from the given file…
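The loader's contents are cut off in this excerpt, but the Mnist files themselves are distributed in the well-documented IDX format: a big-endian header (magic number, then the dimension sizes) followed by raw bytes. A Python sketch of reading it (function names hypothetical; the magic numbers 2051/2049 are from the IDX format itself):

```python
import gzip
import struct
import numpy as np

def _open(path):
    # the files are often shipped gzipped
    return gzip.open(path, "rb") if path.endswith(".gz") else open(path, "rb")

def load_idx_images(path):
    """Read an IDX image file: magic, count, rows, cols, then pixel bytes.
    Returns floats in [0, 1], one flattened image per row."""
    with _open(path) as f:
        magic, n, rows, cols = struct.unpack(">IIII", f.read(16))
        assert magic == 2051, "not an IDX image file"
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(n, rows * cols).astype(np.float32) / 255.0

def load_idx_labels(path):
    """Read an IDX label file: magic, count, then one byte per label."""
    with _open(path) as f:
        magic, n = struct.unpack(">II", f.read(8))
        assert magic == 2049, "not an IDX label file"
        return np.frombuffer(f.read(), dtype=np.uint8)
```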

Main types, the tape and a few function examples

The following should be familiar from the basics of AD. There are some differences, the main one being that for some nodes we do not want to calculate the adjoints (because they might be input or target nodes), so we need a flag for that. The rest is boilerplate. At the time of writing, there…
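The idea of a per-node "skip the adjoint" flag can be sketched in a few lines. This is a minimal scalar tape in Python, not the library's F# types; the class and field names are my own:

```python
tape = []  # records every op node in execution order

class Node:
    """A value on the AD tape. Nodes with needs_adjoint=False (inputs,
    targets) never accumulate a gradient during the reverse pass."""
    def __init__(self, value, needs_adjoint=True):
        self.value = value
        self.adjoint = 0.0
        self.needs_adjoint = needs_adjoint
        self.parents = []  # (parent_node, local_derivative) pairs

def mul(a, b):
    out = Node(a.value * b.value)
    out.parents = [(a, b.value), (b, a.value)]
    tape.append(out)
    return out

def add(a, b):
    out = Node(a.value + b.value)
    out.parents = [(a, 1.0), (b, 1.0)]
    tape.append(out)
    return out

def backprop(out):
    """Walk the tape backwards, pushing adjoints to flagged parents only."""
    out.adjoint = 1.0
    for node in reversed(tape):
        for parent, d in node.parents:
            if parent.needs_adjoint:  # the flag: skip inputs/targets
                parent.adjoint += d * node.adjoint
```

The flag saves the wasted work of accumulating gradients into nodes whose values will never be updated.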

Get and set slice modules

The following is just going to be a big code dump and there is no need to think about it too deeply. Even though it is 200 lines long, all the above does is let us access a matrix like a 2D array. With this extension it can be read and set using .[1..3,2..5] or something…
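The shape of what those 200 lines provide can be sketched compactly in Python: slice-based read and write access over flat row-major storage. Note that F#'s `.[1..3,2..5]` ranges are inclusive on both ends, whereas the Python slices below are end-exclusive; the class itself is a hypothetical stand-in, not the library's type:

```python
class Matrix:
    """Row-major flat storage with 2D slice read/write access, mimicking
    what F# GetSlice/SetSlice extension members provide."""
    def __init__(self, rows, cols, fill=0.0):
        self.rows, self.cols = rows, cols
        self.data = [fill] * (rows * cols)

    def _ranges(self, key):
        r, c = key
        rr = range(*r.indices(self.rows)) if isinstance(r, slice) else [r]
        cc = range(*c.indices(self.cols)) if isinstance(c, slice) else [c]
        return rr, cc

    def __getitem__(self, key):
        rr, cc = self._ranges(key)
        return [[self.data[i * self.cols + j] for j in cc] for i in rr]

    def __setitem__(self, key, value):
        rr, cc = self._ranges(key)
        for a, i in enumerate(rr):
            for b, j in enumerate(cc):
                self.data[i * self.cols + j] = value[a][b]
```

The F# version is much longer mainly because it has to supply a separate GetSlice/SetSlice overload for each combination of range and scalar arguments.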

Basics of automatic differentiation

In the last post I gave an example of a true feedforward net, if a small one. All neural nets are really just a class of functions R^n -> R. The XOR network rewritten in more mathematical notation would be: y = sum((targets - sigmoid(W2 * tanh(W * input + bias) + bias2))^2). The challenge of optimizing this neural net is…
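That loss can be written out directly. A numpy sketch that batches the four XOR cases as columns (shapes and variable names follow the formula; this is an illustration, not the post's F# code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def xor_loss(W, bias, W2, bias2, inputs, targets):
    """y = sum((targets - sigmoid(W2 @ tanh(W @ input + bias) + bias2))^2)"""
    hidden = np.tanh(W @ inputs + bias)          # 2 hidden units per case
    preds = sigmoid(W2 @ hidden + bias2)         # one output per case
    return np.sum((targets - preds) ** 2)        # summed squared error
```

With W of shape (2, 2), W2 of shape (1, 2), and the four XOR inputs stacked as a (2, 4) matrix, this evaluates the whole function R^n -> R in one call; gradient descent on its parameters is the optimization challenge the excerpt refers to.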