
LSTM examples? #109

Closed
jrobinson01 opened this issue Dec 30, 2017 · 44 comments
@jrobinson01

jrobinson01 commented Dec 30, 2017

Following on from the other issue I created (#108), I'm trying to teach an LSTM network to write a simple children's book. I'm getting odd behavior, but I really don't know what I'm doing to begin with. I'd love to get this example working and added to the readme for others to follow, but I'm hitting lots of little roadblocks. Here's my code:

const brain = require('brain.js');

const words = new Map();
words.set('00001', 'Jane');
words.set('00010', 'Spot');
words.set('00100', 'Doug');
words.set('01000', 'saw');
words.set('10000', '.');

const trainingData = [
  {input: [0,0,0,0,1], output: [0,1,0,0,0]},// Jane -> saw
  {input: [0,0,0,1,0], output: [0,1,0,0,0]},// Spot -> saw
  {input: [0,0,1,0,0], output: [0,1,0,0,0]},// Doug -> saw
  {input: [0,1,0,0,0], output: [0,0,1,0,0]},// saw -> Doug
  {input: [0,1,0,0,0], output: [0,0,0,1,0]},// saw -> Spot
  {input: [0,1,0,0,0], output: [0,0,0,0,1]},// saw -> Jane
  {input: [0,0,0,1,0], output: [1,0,0,0,0]},// Spot -> .
  {input: [0,0,0,0,1], output: [1,0,0,0,0]},// Jane -> .
  {input: [0,0,1,0,0], output: [1,0,0,0,0]},// Doug -> .
  {input: [1,0,0,0,0], output: [0,0,0,0,1]},// . -> Jane
  {input: [1,0,0,0,0], output: [0,0,0,1,0]},// . -> Spot
  {input: [1,0,0,0,0], output: [0,0,1,0,0]},// . -> Doug
];

const lstm = new brain.recurrent.LSTM();
const result = lstm.train(trainingData);
const run1 = lstm.run([0,0,0,0,1]);// Jane
const run2 = lstm.run(run1);
const run3 = lstm.run(run2);
const run4 = lstm.run(run3);

console.log(words.get('00001'));// start with 'Jane'
console.log(words.get(run1));// saw
console.log(words.get(run2));// Jane, Doug or Spot
console.log(words.get(run3));// .
console.log(words.get(run4));// Jane, Doug or Spot

console.log('run 1:', run1, typeof run1);
console.log('run 2:', run2, typeof run2);
console.log('run 3:', run3, typeof run3);
console.log('run 4:', run4, typeof run4);

The results in the console:

Jane
.
Jane
.
Jane
run 1: 10000 string
run 2: 00001 string
run 3: 10000 string
run 4: 00001 string

Some observations:

  • The output of run is a string; I was expecting an array.

It obviously isn't working as I expected. Since I can't find a good example, I really have no idea what I'm doing wrong. I'm wondering if I'm training it wrong. I'm starting to think I should be giving it example sequences as input, like {input: "Jane saw Spot", output: "."}, but I can't wrap my head around how to express that as valid input.

@robertleeplummerjr
Contributor

const trainingData = [
  'Jane saw Doug.',
  'Spot saw himself.',
  'Doug saw Jane.'
];

const lstm = new brain.recurrent.LSTM();
const result = lstm.train(trainingData, { iterations: 1000 });
const run1 = lstm.run('Jane');
const run2 = lstm.run('Spot');
const run3 = lstm.run('Doug');

console.log('run 1: Jane' + run1);
console.log('run 2: Spot' + run2);
console.log('run 3: Doug' + run3);

https://jsfiddle.net/robertleeplummerjr/x3cna8rn/2/

I actually just discovered a bug: using .toFunction doesn't output the character converter, so we'll need to fix that before v1. But the net seems to run just fine. I got:

run 1: Jane saw Doug.
run 2: Spot saw himself.
run 3: Doug saw Jane.

😎

@robertleeplummerjr
Contributor

Notice, too, that after each name there are spaces. That isn't magic, nor was the net specifically told to insert spaces; it's just math. The net saw them during training and learned to add them.

@robertleeplummerjr
Contributor

robertleeplummerjr commented Dec 31, 2017

Now I'm starting to have fun with this: https://jsfiddle.net/robertleeplummerjr/x3cna8rn/4/

run 1: Jane saw Doug.
run 2: Spot saw Doug and Jane looking at each other.
run 3: Doug saw Jane.

@robertleeplummerjr
Contributor

lol, this is actually addictive: https://jsfiddle.net/robertleeplummerjr/x3cna8rn/5/

run 1: Jane saw Doug.
run 2: Spot saw Doug and Jane looking at each other.
run 3: Doug saw Jane.
run 4: It was love at first sight, and Spot had a frontrow seat. It was a very special moment for all.

@robertleeplummerjr
Contributor

https://twitter.com/robertlplummer/status/947275642365730816

Now we just need a publisher...

@jrobinson01
Author

Amazing!! So it seems I was completely wrong about how LSTMs expect their training data? I was thinking the time series was each item in the array of training data, but it seems that's not it. In this case, it's using each character of each item?

This is really fun! https://codepen.io/anon/pen/xpdmdN?editors=1010

I'm sometimes seeing some weird behavior where, instead of words, it gives output like '19,19,19,19...19'. Not sure if that is the bug you mentioned on the other issue.

So now that I can feed it text, and I think I understand how it expects text input, I'm lost as to how to feed it anything else. Suppose we wanted to try to predict the weather. Our input data might look like this:

{
   date:'10-12-17',
   temperature: 45,
   humidity: 30
},
{
   date:'10-13-17',
   temperature: 40,
   humidity: 10
},

How would we convert that into something the LSTM can try to make sense of, both as training data, and then once trained, as input?

@robertleeplummerjr
Contributor

Actually, you are more correct than you think. The net does provide some syntactic sugar in some scenarios. Objects have not yet been worked out, unless you convert them to a string or number.
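
For illustration, here is a minimal sketch of that string-conversion idea applied to the weather objects above (the serialization format is made up, and since the net works character by character it learns text patterns rather than numeric relationships):

const brain = require('brain.js');

// serialize each record so the LSTM sees it as a character sequence
const records = [
  { date: '10-12-17', temperature: 45, humidity: 30 },
  { date: '10-13-17', temperature: 40, humidity: 10 },
];

const trainingData = records.map(
  r => `temp:${r.temperature} hum:${r.humidity}`
);

const lstm = new brain.recurrent.LSTM();
lstm.train(trainingData, { iterations: 1000 });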

@robertleeplummerjr
Contributor

To be clearer: things like numbers aren't so good with recurrent nets, as they have been defined to me and as I've seen them implemented. Part of the reason is that every character is an input to the net, so numbers are seen more like characters (honestly, more like buttons that are pushed in a sequence, which is translated to another sequence when predicting). This mentality is borrowed from recurrentjs, and is part of the reason I started working on v2. I want the ability to feed data into a recurrent net, not just "push buttons".

@robertleeplummerjr
Contributor

I found the related issue when playing around with the toFunction method: #110. I'll be pushing a PR soon; I have it fixed locally.

@robertleeplummerjr
Contributor

released: https://www.npmjs.com/package/brain.js

@robertleeplummerjr
Contributor

I'd like to use this children's book example in the release of v1 on the brain.js.org homepage. Are you ok with that?

@jrobinson01
Author

Ok yeah, that makes sense (the pushed-buttons analogy)! That's what I've been observing. The LSTM currently throws an error, for example, if you try to run it with a word it has never seen before.

I'd love to see the example in the readme! That was a big motivator for this exercise: to help others make sense of it. I got the idea for the children's book here, btw: https://www.youtube.com/watch?v=WCUNPb-5EYI

@robertleeplummerjr
Contributor

Are you cool with using it as a demo? It's just simple enough and fun enough that I feel it'd be great as a primer on recurrent nets.

@jrobinson01
Author

Yeah, go for it! I think it does a good job of giving newcomers an idea of how the different types of networks work and what types of projects they're a good fit for.

@robertleeplummerjr
Contributor

Fixed the output function: https://jsfiddle.net/robertleeplummerjr/x3cna8rn/10/

@robertleeplummerjr
Contributor

Published a new release; I believe this one is v1-ready.

@jrobinson01
Author

Awesome! I’m going to close this issue.

@DaveBben

DaveBben commented Jan 6, 2018

Hey, how did you ever resolve this problem? I am facing the same situation. I have a time series with objects that look like this:

{
  Date: 'Apr 29, 2013',
  Open: 134.44,
  High: 147.49,
  Low: 134,
  Close: 144.54,
  Volume: '-',
  Cap: 1491160000,
}

My goal was to predict the closing price for a future date. I am trying to implement an LSTM. I am not sure what to send as the input and output, so I tried a couple of things and got different outputs.

First I pushed only the closing price into the training data like so:

const bPrices = bitcoinPrices.reverse();
const trainingData = [];

for (let i = 0; i < 20; i += 1) {
  trainingData.push(bPrices[i].Close);
}

const lstm = new brain.recurrent.LSTM();
const result = lstm.train(trainingData, { iterations: 1000 });
console.log(result);
const run1 = lstm.run([2]);

I then receive the following error: Cannot read property 'runBackpropagate' of undefined

I then tried using the index as the input and the price as the output:

for (let i = 0; i < 20; i += 1) {
  trainingData.push({input: [i], output:[bPrices[i].Close]});
}

Now if I run it with an input between 0 and 20, it gives me the same values as the training data, but if I run it with anything outside that range (22, for example) I get: unrecognized character "22"

Finally, I tried pushing them in as strings:

for (let i = 0; i < 20; i += 1) {
  trainingData.push(''+bPrices[i].Close);
}

And my output is a string like "112.3". However, I am not sure if that's supposed to be the expected value. The index-as-input option seems like it would make the most sense: passing the position as input and the close as output.

Thanks again! This library is truly awesome!

@robertleeplummerjr
Contributor

robertleeplummerjr commented Jan 6, 2018

TL;DR
Let me see if I can get a demo together. Can you supply me with more input data? The recurrent neural networks as they currently exist aren't very good with numbers; they are better at translation.

TS;DR
The existing recurrent neural network operates like pushing buttons (inputs).

Let that sink in for a minute.

The values that you are training on aren't actually sent into the neural net (I want to change this).
The buttons are defined by training, and this is done "by value". So if you have an input value of, say, "1.025876", that "value" (singular) isn't directly sent into the net; rather, the buttons that represent the "values" (plural) of "1", ".", "0", "2", "5", "8", "7", & "6" are run. That is why LSTM networks are useful in translation: their inputs are distinct concepts. However, there are many variations that skirt this issue, and it is something that I want to do as well, because in the end this is all just math. There are many variations of recurrent neural networks, and I'd love to see one implemented so you can just feed it data directly.
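
As a rough illustration of the "buttons" idea (this is the mental model, not brain.js's actual internals):

// the value "1.025876" is not fed in as a number; it becomes a
// sequence of distinct character tokens
const tokens = '1.025876'.split('');
console.log(tokens); // ['1', '.', '0', '2', '5', '8', '7', '6']
// the net learns transitions between tokens, not the number's magnitude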

Part of my frustration with javascript neural networks is that they are treated as "just hobby architectures", for learning... As in "don't take it seriously"... Please.

When I implemented the recurrent neural network from recurrent.js, I took a step back and thought "how can I fix this for javascript?", for it not to be a hobby, but for it to have that awakening, much like when a superhero realizes their powers.

I did a lot of research and arrived at http://gpu.rocks/, but it was not very stable, and not yet released as v1. If you want things done right...

After some initial pull requests and fantastic conversations with the architects, I was added to the team, where I helped improve the architecture and enabled it to run on node with a CPU-only fallback, while I was working on brain.js to have an architecture that could use gpu.js. Our plan is to have opencl running in it soon, which is where we go from hobby to enterprise.

Long story short, a few days before the new year, v1 was released.
The architecture for FeedForward is far past planning and is being built and tested. I would like to get the library near 100% test coverage, and am taking small steps to achieve that.

The Recurrent neural network concept is being planned, and will match the FeedForward neural network very closely, allowing an easy means of adapting networks to time series data, translation, math, audio, or video, because the architecture is all about resembling very simple math.

So, getting back to why this response is so long: I could have started experimenting with more interesting concepts for how to use LSTMs and recurrent neural networks, but I chose to build the foundation for making javascript a superhero.

I talked with a fellow developer yesterday and said the following: "In javascript, if you want to train some serious neural networks, it could take a year of training. But if you had the right foundation, you could raise the performance a couple orders of magnitude and calculate those same values in a day. I'd take a year to build that foundation. Why? Because then I could do four lifetimes of work in two years."

@robertleeplummerjr
Contributor

Thanks again! This library is truly awesome!

TY, that means a lot.

@robertleeplummerjr
Contributor

Here is something that may be more what you are looking for to train over time data: https://jsfiddle.net/9t2787k5/4/

This uses https://github.com/wagenaartje/neataptic

@RobertLowe

@robertleeplummerjr here's some end-of-day data for IBM from 2014 to 2016 in JSON: https://gist.github.com/RobertLowe/63642523f227b15c6616a1d89f5b489f (although a less stable stock might be more interesting)

I've only seen LSTMs implemented in coinpusher with neataptic for bitcoin prediction, and it's still a learning curve (I'll get there). It would be nice to have a breakdown of an LSTM/net's core components when applied to numerical data (I think that's what a lot of people want to use nets for, but any door opening would be awesome) 😄 Thanks for being responsive on a non-issue issue :D

@robertleeplummerjr
Contributor

@DaveBben can you take a look at #193? I believe it'll likely be what helps you.

@robertleeplummerjr
Contributor

robertleeplummerjr commented Apr 14, 2018

@RobertLowe the api for time series is now built and ready for review. It strongly reuses the recurrent classes, and should be simple enough for anyone who understands lists. Said differently: a very low learning curve.

The api is proposed here: #192
The work is committed here: https://github.com/BrainJS/brain.js/tree/192-time-series-baseline

The api:

import brain from 'brain.js';

const net = new brain.recurrent.RNNTimeSeries(options);
// or
const net = new brain.recurrent.LSTMTimeSeries(options);
// or
const net = new brain.recurrent.GRUTimeSeries(options);

net.train([
  [1,2,3,4,5],
  [5,4,3,2,1],
]);

net.run([1,2,3,4]); // -> 5
net.run([5,4,3,2]); // -> 1
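
As a sketch of how this could apply to the earlier closing-price question (assuming the proposed API above; the 0..1 scaling step is an extra suggestion, since recurrent nets generally train better on small values):

const closes = bPrices.map(p => p.Close); // bPrices from the earlier comment
const max = Math.max(...closes);
const scaled = closes.map(c => c / max);  // scale into 0..1

const net = new brain.recurrent.LSTMTimeSeries();
net.train([scaled]);

// predict the next close from the last few known values, then un-scale
const next = net.run(scaled.slice(-4)) * max;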

@RobertLowe

@robertleeplummerjr wow, this is great! I'll have fun giving it a run with some larger datasets. Thanks for the ping! 👍

@Chidan

Chidan commented Jul 10, 2018

Hi @robertleeplummerjr,
thanks for LSTMTimeSeries. As I understand it, LSTMTimeSeries currently gives the next predicted value, but what if I have thousands of training samples and thousands of test samples, and I would like to predict the next 10 or 100 values? How can I do this?
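
For example, would feeding each prediction back in as input work? A sketch, assuming the time-series API above (forecastN is a hypothetical helper, and presumably the errors compound as predictions are fed back in):

function forecastN(net, seed, steps) {
  const values = seed.slice();   // copy the known values
  const predictions = [];
  for (let i = 0; i < steps; i++) {
    const next = net.run(values);
    predictions.push(next);
    values.shift();              // drop the oldest value
    values.push(next);           // append the prediction as the newest input
  }
  return predictions;
}

// e.g. forecastN(net, [1, 2, 3, 4], 10) would give the next 10 predicted values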

@anActualCow

anActualCow commented Feb 11, 2019

Hmmm. I'm experiencing a very strange error here.

I have this LSTM network that I'm trying to train on two numbers and two strings.

I have a pretty large set of data to pull from, and I use one variable, whether or not it's set to one, as the output.

I've modified the data so that it should definitely return the right results, but the results are constantly incorrect. I've increased the number of training iterations, and pulled objects directly from the list, but I still get wrong answers. What's going on?

nvm. increased iterations and it works

@robertleeplummerjr
Contributor

Can you share your config and/or data?

@anActualCow

anActualCow commented Feb 22, 2019

Nevermind, I got everything working. It seems pretty accurate when I set the hidden layers to [50,50] and provide more data for the network to consume, which is sort of blowing my mind. I'm using it at work in a backend, too. Does this have anything to do with tensorflow?
What are the chances that you'd rewrite something like this in C++ and make javascript bindings for it?

@robertleeplummerjr
Contributor

Sharing your example may help others, if you are up for it.

I'm working on the GPU counterpart for node right now (GPU.js). So I'm seeing your C++ and raising you GPU. It will be soon; 99% of it works, but it needs multiple render targets to cover all our use cases, and it will have that in the coming days.

@robertleeplummerjr
Contributor

does this have anything to do with tensorflow?

No.

@anActualCow

anActualCow commented Feb 22, 2019

Cool, it would be amazing if you could get this working at a very performant level. I'm excited to see it grow and maybe contribute once I actually understand it.

My problem before was this:

var net = new brain.recurrent.LSTM({ hiddenLayers: [50, 50] });

const trainOpts = { iterations: 200 };

net.train([
  { input: 'I feel great about the world!', output: 'happy' },
  { input: 'Maybe Im a duck', output: 'mentally ill' },
  { input: 'The world is a terrible place!', output: 'sad' },
], trainOpts);

Instead of putting my inputs and outputs together in an object and then in an array, I had tried this:

[{input: [{var1: val1}, {var2: val2}], output: {var3: val3}}]

Yup. Just set it up completely wrong.

@robertleeplummerjr
Contributor

What are the chances that you'd rewrite something like this in C++ and make javascript bindings for it?

The whole idea in brain.js, and what we are proving, is that:

  1. You don't have to write something complex for it to be useful.
  2. It doesn't have to be rewritten for use elsewhere. JavaScript is simple enough and so composable that we can use it as the main means of writing and testing. We can then provide a means of execution so it can take full advantage of your hardware.

It is so easy to get caught up in cloning work and not focus on innovation. We are laying the groundwork to surpass the bigger libraries in innovation because we won't have the tech debt.

@anActualCow

anActualCow commented Feb 27, 2019

Any update on the GPU version? Also, why would a GPU shader version be better than rewriting the code in C++?

@robertleeplummerjr
Contributor

The underlying engine (GPU.js) is in very active development, has been proven to work in node, and will be released soon, which will then spark the completion of v2 of brain.js, including the LSTM bits.

@SamanthaClarke1

I get really locked down with the current LSTM, mostly because of the gate size.

It keeps repeating simple patterns like 'the cab the cab the', etc.

Now, I'm by no means an RNN genius, so...
From what I understand, the RNN has a certain space that it can look behind whilst it traverses, yes?
Now, what I think is causing my issue is a (relatively) small gate size, right?
But I can't find anything referencing anything like this in the documentation.

I found clipsize and outputsize in the rnn.defaults in the codebase, but changing them doesn't seem to affect anything. Any ideas?

@robertleeplummerjr
Contributor

@Samuel-Clarke123 Can you post a script example to build context from?

@SamanthaClarke1

// all the good requires and shebangs up here, all standard, dont worry
const fs = require('fs');
const path = require('path');
const brain = require('brain.js');

let MAX_ITERATIONS = 30;
let MAX_SONGS = 5;

// Fisher-Yates shuffle
function shuffle(a) {
    for (let i = a.length - 1; i > 0; i--) {
        const j = Math.floor(Math.random() * (i + 1));
        [a[i], a[j]] = [a[j], a[i]];
    }
    return a;
}

let trainingData = [];
let files = fs.readdirSync(path.join(__dirname, 'songs'));
files = shuffle(files);

files.forEach(function (fname) {
    if (--MAX_SONGS > 0) {
        trainingData.push(fs.readFileSync(path.join(__dirname, 'songs', fname), 'utf-8'));
    }
});

//console.log(trainingData[0])
console.log(trainingData.length + ' rap songs loaded');

const net = new brain.recurrent.LSTM({ hiddenLayers: [30, 30], activation: 'sigmoid' });

net.train(trainingData, { iterations: MAX_ITERATIONS, errorThresh: 0.072, log: true, logPeriod: 1 });

const tnetJSON = net.toJSON();
fs.writeFileSync('rapgodnet.json', JSON.stringify(tnetJSON));

console.log('done training!!!');

// seed with a random 100-character slice of the training data, then keep
// feeding each output back in as the next input
let txmple = trainingData[Math.floor(Math.random() * trainingData.length)];
let txmpleRI = (txmple.length - 100) * Math.random();
let seed = txmple.slice(txmpleRI, txmpleRI + 100);
console.log('seed: ' + seed);
setTimeout(function pred(net, forecast) {
    const out = net.run(forecast);
    console.log(out);
    setTimeout(pred, 1000, net, out);
}, 1000, net, seed);

This is my current method for training my LSTM.
I'm able to get all this working in TensorFlow by manually creating sequence lengths and such, all that AI goodness, but I can't for the life of me seem to get it working in BrainJS.

Currently each file is approx 30 lines, about 45 chars per line.
The sequence length I can get working with in TF is 100, but I don't know how to control it over here in BrainJS.

So I guess my question about the LSTM is this:
How does BrainJS control the LSTM's sequence length? Is it automatically tuned to the input, or can it be controlled by a parameter?

Thanks for your help 😄

[attached screenshot of the net's output]

@robertleeplummerjr
Contributor

What is your tensorflow code to get this working?

@SamanthaClarke1

A bunch of long Keras and Python stuff.

The relevant model definition:

# standard Keras imports; X, y, and callbacks_list are defined elsewhere
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2]), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(256))
model.add(Dropout(0.2))
model.add(Dense(y.shape[1], activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X, y, epochs=25, batch_size=80, callbacks=callbacks_list)

with the sequences being manually controlled by me (ie: X.shape[1])

@SamanthaClarke1

And I mean, the output there is a lot more... ahem, creative.

[attached screenshot of the TensorFlow model's output]

@taylormcclenny

@robertleeplummerjr any development here on @Samuel-Clarke123's issue/question? Thanks!

@robertleeplummerjr
Contributor

@robertleeplummerjr any development here on @Samuel-Clarke123's issue/question? Thanks!

Ty for the prodding!

TL;DR
There was a bug before the last release that made the LSTM operate just like the RNN. Things should be substantially better now, though not yet running on the GPU; the bug was resolved in a recent alpha release for v2, and things are continuing to move along. The Roadmap has a v2 Status section to help keep everyone up to date.

TS;DR

What we have now:
Node support for the FeedForward class and the Recurrent class, both passing unit tests very nicely. I'm currently working on finishing up layers to give us convolutional neural networks (Convolution, Pool, FullyConnected, SoftMax), and I was FINALLY able to get to this point thanks to a ton of work finished in GPU.js.

Further reading on why this has taken so long: Javascript for machine learning? Get a real language. If you've followed this thread, you must give that story a read. Also, please tweet it if you feel inclined.

In addition to fixing and finalizing all that came with getting GPU.js working on node (almost a year of work): two weeks ago, when I was trying to perform code coverage on brain.js, it would fail in a fireball of death. I had to find a way to hook into istanbul.js (the code coverage tool for js) to bind code coverage in CPU mode, as well as a way to strip it out in GPU mode (under https://github.com/gpujs/gpu.js#gpu-settings, look at onIstanbulCoverageVariable and removeIstanbulCoverage).

It ended up turning, as you would imagine, into its own (simple) project called istanbul-spy, so others may use it if needed. It turns out, the istanbul.js community met this with some praise.

So now code coverage works, node support works, and we can get on with our lives, finish this backlog of work, and get v2 out the door!

@taylormcclenny

Thanks for the speedy reply @robertleeplummerjr AND all the work!! I'm really enjoying working on NNs in JS.

I see your notes on the convolutional NN, amazing! Any update on that timeline?

Sorry for being all 'gimmie, gimmie, gimmie'..

mubaidr closed this as completed Feb 16, 2020