added reset() function to sequential model #1908
Conversation
Ideally this pull request fixes #609.
Hmm, but how is this different from just re-calling `compile()` if you are recompiling anyway?
For me, compiling didn't reset the weights; I'm not entirely sure why that's the case. I needed both: calling `layer.build()` for each layer (which does the actual weight randomizing) and `model.compile()`. Otherwise it was still reusing the old weights. And to add to that: after rebuilding each layer, compiling would fail, claiming that the layers were ill-specified. So after resetting the layer parameters I additionally had to restore the model structure (layer hyperparameters) from the config. The new `model = model.reset()` does all that in one go. But if there is any easier / cleaner way of doing that, please completely disregard my changes and create a better pull request. My solution feels a bit hacky.
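In outline, the reset described above amounts to something like the following sketch. This assumes the Keras 0.x API of the time and the `reset` flag this pull request adds to `model_from_config` (see the diff below); the loss and optimizer here are placeholders for whatever the original model was compiled with:

```python
from keras.models import model_from_config

def reset(model):
    # Rebuild the model from its own config. With reset=True, model_from_config
    # calls layer.build() on every layer, re-randomizing its weights.
    new_model = model_from_config(model.get_config(), reset=True)
    # Recompile, since the rebuilt model has no compiled train/predict functions.
    new_model.compile(loss='mse', optimizer='sgd')  # placeholder loss/optimizer
    return new_model
```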
```diff
@@ -181,6 +181,9 @@ def model_from_config(config, custom_objects={}):
     elif model_name == 'Sequential':
         model.__class__ = Sequential
     model.name = model_name
+    if reset:
+        for layer in model.layers:
+            layer.build()
```
This will break if `layer` is a container.
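One hedged way to handle the container problem (a sketch; it assumes containers expose their children via a `layers` attribute, which may not hold for every container type):

```python
def rebuild(layer):
    # Containers hold other layers; recurse into them instead of
    # calling build() on the container itself.
    if hasattr(layer, 'layers'):
        for inner in layer.layers:
            rebuild(inner)
    elif hasattr(layer, 'build'):
        layer.build()  # leaf layer: re-randomize its weights
```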
You shouldn't need to re-compile, because not having to recompile is maybe the single most important aspect of a reset function. What a reset function should do is actually the equivalent of `set_weights` with freshly re-initialized weights.
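A minimal sketch of that idea, assuming the initial weights are captured once, right after `compile()` and before any training (`X_train`, `y_train`, and the epoch count are placeholders):

```python
initial_weights = model.get_weights()  # snapshot of the freshly initialized weights

model.fit(X_train, y_train, nb_epoch=10)  # ...train...

model.set_weights(initial_weights)  # back to the initial state, no recompile needed
```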
@fchollet I tried the following: at the end of `model.Sequential.compile()` I added a line to store the weights of each layer in a list (`self.stored_weights`). Then:

```python
def reset(self, use_stored=False):
    if use_stored:
        print("using soft reset")
        for i in range(len(self.stored_weights)):
            self.layers[i].set_weights(self.stored_weights[i])
```

Now this works, as in: it resets the weights (tested with a few examples) and it doesn't need to recompile. BUT: if I try to actually randomize the weights again, which is, to my understanding, done in each layer's `build()` function, it doesn't work. To verify this, I added a 'hard reset' function that goes through each layer I marked as resettable (like Dense, but not Activation) and calls its build function. I looked at the weights before and after `layer.build()` and they are completely different: the weights are randomized again and, in the case of Dense, the biases are set to zero. To see what I mean:

```python
def reset(self, use_stored=False):
    if use_stored:
        print("using soft reset")  # this works
        for i in range(len(self.stored_weights)):
            self.layers[i].set_weights(self.stored_weights[i])
    else:
        print("using hard reset")  # this doesn't work
        for i in range(len(self.layers)):
            print(self.layers[i].get_weights()[0])  # these are the old weights
            self.layers[i].build()
            print(self.layers[i].get_weights()[0])  # entirely different from the old weights above
            self.layers[i].set_weights(self.layers[i].get_weights())  # tried with and without this line... no difference
            # get_weights() gives different weights after build()...
            # how does this not work while the soft reset above does?
```
....ahhh, I think I get it. The Keras backend isn't using the actual objects, but pointers to the specific objects that were used in the function. Therefore the soft reset works, because both the old weights and the stored weights point to the same variable, which was used in the loss function. But in the case of the hard reset I am creating an entirely new variable in memory that no function is using. I'll try that out.
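A tiny Theano illustration of that pointer behavior (a sketch, assuming Theano is installed; the variable names are made up):

```python
import numpy as np
import theano

w = theano.shared(np.ones(3), name='w')  # this object is baked into the graph
f = theano.function([], w * 2)

w.set_value(np.zeros(3))            # in-place update of the SAME shared variable
print(f())                          # [0. 0. 0.] -- the compiled function sees it

w = theano.shared(np.full(3, 5.0))  # rebinding the name creates a NEW variable
print(f())                          # still [0. 0. 0.] -- f keeps using the old one
```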
okay, yes, that was the issue. I solved it for the Dense layer (as an example) like this (not taking into account initial weights, if there were any):

```python
def reinit_weights(self):
    input_dim = self.input_shape[1]
    new_weights = self.init((input_dim, self.output_dim), name='{}_W'.format(self.name))
    self.trainable_weights[0].set_value(new_weights.get_value())
```

This will write the new weights into the same shared Theano variable that the loss function is using. Only problem: now I have to write this for each layer, because each layer has a different init. I'll do this in the next few days and then submit a new pull request.
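A hedged sketch of how that per-layer work might be generalized in the same Keras 0.x style (`reinit_all` is a made-up name, and the sketch assumes every trainable weight can be rebuilt from `layer.init` plus its current shape, which does not hold for, e.g., zero-initialized biases):

```python
def reinit_all(model):
    for layer in model.layers:
        if not hasattr(layer, 'init'):
            continue  # layers like Activation have nothing to reset
        for w in layer.trainable_weights:
            shape = w.get_value().shape
            # write fresh values into the SAME shared variable the graph uses
            w.set_value(layer.init(shape).get_value())
```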
@Kielland After @fchollet's input I created another pull request at #2079.
@fgolemo, thanks for that. Based on your comments above, I gave it a shot at solving this in my model code. It didn't succeed, however, so I've asked the community. Feel free to add any insights :-) Thanks!
@fgolemo, just to follow up. I found a practical solution by @jkleint here:
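For reference, a commonly cited fix in this family (a sketch, not necessarily @jkleint's exact code; it assumes Keras 2 on the TensorFlow 1.x backend) re-runs each variable's initializer op in the backend session:

```python
from keras import backend as K

def reset_weights(model):
    session = K.get_session()         # TF1-style session behind the Keras backend
    for layer in model.layers:
        if hasattr(layer, 'kernel'):  # e.g. Dense / Conv layers
            layer.kernel.initializer.run(session=session)
        if getattr(layer, 'bias', None) is not None:
            layer.bias.initializer.run(session=session)
```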
Why is this closed? There is no reset function for Graph models. Is the solution to place the model-building code within the for loop, to ensure weights and optimizer states get re-initialized?
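A sketch of that rebuild-in-the-loop pattern (`build_model` is a hypothetical helper, and `splits`, `X`, `y` are placeholder cross-validation data):

```python
from keras.models import Sequential
from keras.layers import Dense

def build_model():
    model = Sequential()
    model.add(Dense(64, activation='relu', input_dim=20))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    return model

for train_idx, val_idx in splits:  # placeholder CV splits
    model = build_model()          # fresh weights AND fresh optimizer state
    model.fit(X[train_idx], y[train_idx],
              validation_data=(X[val_idx], y[val_idx]))
```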
There is a great solution posted here. You can use |
Once the model is created and trained, it can be reset like this:

@fchollet: if this isn't a clean way of doing it, please suggest a different solution.