
added reset() function to sequential model #1908

Closed
wants to merge 3 commits

Conversation

@fgolemo commented Mar 7, 2016

Once the model is created and trained, it can be reset like this:

model = Sequential()
model.add(Dense(...))  # build the model as usual
...
model.fit(X, Y)
...
# reset() re-initializes all weights & biases randomly and recompiles the model (necessary)
model = model.reset()
model.fit(X, Y)

@fchollet:
If this isn't a clean way of doing it, please suggest a different solution.

@fgolemo commented Mar 7, 2016

Ideally, this pull request fixes #609.
There seems to be no way around recompiling after resetting the weights, but here is a convenient function that does both in one go.

@pasky commented Mar 7, 2016

Hmm, but how is this different from just re-calling compile() if you are recompiling anyway?

@fgolemo commented Mar 7, 2016

For me, compiling alone didn't reset the weights; I'm not entirely sure why. I needed both: calling layer.build() for each layer (which does the actual weight randomization) and model.compile(). Otherwise the model was still reusing the old weights.

And to add to that: after rebuilding each layer, compiling would fail, claiming that the layers were ill-specified. So after resetting the layer parameters I additionally had to restore the model structure (the layer hyperparameters) from the config. The new model = model.reset() does all of that in one go.

But if there is an easier or cleaner way of doing this, please completely disregard my changes and create a better pull request. My solution feels a bit hacky.
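
(In outline, a minimal sketch of what this PR's model = model.reset() amounts to, assuming the Keras 1.x model_from_config API; the loss and optimizer arguments are illustrative:)

    from keras.models import model_from_config

    def reset(model, loss, optimizer):
        # Rebuild the architecture from its own config; the rebuilt layers
        # are built from scratch, so their weights are randomly re-initialized.
        new_model = model_from_config(model.get_config())
        # Recompiling is required before the rebuilt model can be trained again.
        new_model.compile(loss=loss, optimizer=optimizer)
        return new_model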

@@ -181,6 +181,9 @@ def model_from_config(config, custom_objects={}):
     elif model_name == 'Sequential':
         model.__class__ = Sequential
         model.name = model_name
+        if reset:
+            for layer in model.layers:
+                layer.build()
A reviewer (Contributor) commented on the diff:

This will break if layer is a container.

@fchollet commented Mar 7, 2016

You shouldn't need to re-compile, because not having to recompile is maybe the single most important aspect of a reset function.

What a reset function should do is actually the equivalent of set_weights (value assignment for all weight tensors), but with weights that would be randomly initialized from the init function of the layer. You think you can do that?
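
(For reference, a minimal sketch of that idea against the Keras 1.x backend API, assuming every entry of layer.trainable_weights is a backend variable and that layer.init is the layer's initializer; it glosses over per-weight initializers such as zero-initialized biases:)

    import keras.backend as K

    def reinitialize(model):
        # Assign freshly initialized values to the *existing* weight tensors,
        # so functions already compiled against them keep working: no recompile.
        for layer in model.layers:
            for w in layer.trainable_weights:
                shape = K.get_value(w).shape
                # layer.init returns a new variable; copy its value over.
                K.set_value(w, K.get_value(layer.init(shape)))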

@fgolemo commented Mar 22, 2016

@fchollet I tried the following: at the end of Sequential.compile() I added a line to store the weights of each layer in a list:

    self.stored_weights = [l.get_weights() for l in self.layers]

Then I added a "reset to stored" function to Sequential that goes through each of those stored weights and restores it for the corresponding layer, like this:

def reset(self, use_stored=False):
    if use_stored:
        print("using soft reset")
        for i in range(len(self.stored_weights)):
            self.layers[i].set_weights(self.stored_weights[i])

Now this works, as in: it resets the weights (tested with a few examples) and it doesn't need to recompile.

BUT. If I instead try to actually re-randomize the weights, which is (to my understanding) done in each layer's build() function, it doesn't work, for some reason I don't understand. Just calling each layer's build() function should reset the weights to random values, right (assuming there are no extra regularizers, no initial weights, and no other special stuff)?

To verify this, I added a 'hard reset' function that goes through each layer I marked as resettable (like Dense, but not Activation) and calls its build() function. I looked at the weights before and after layer.build(), and they are completely different: the weights are randomized again and, in the case of Dense, the biases are set to zero.
But why does this have no effect on the computation? How is calling layer.build() + layer.set_weights(layer.get_weights()) different from layer.set_weights(previous_weights)?

To see what I mean:

def reset(self, use_stored=False):
    if use_stored:
        print("using soft reset")  # this works
        for i in range(len(self.stored_weights)):
            self.layers[i].set_weights(self.stored_weights[i])
    else:
        print("using hard reset")  # this doesn't work
        for i in range(len(self.layers)):
            print(self.layers[i].get_weights()[0])  # these are the old weights
            self.layers[i].build()
            print(self.layers[i].get_weights()[0])  # these weights are entirely different from the old weights above
            # I tried with and without the next line: no difference.
            # get_weights() gives different weights after build(), so how
            # does this not work while the soft reset above does?
            self.layers[i].set_weights(self.layers[i].get_weights())

@fgolemo commented Mar 23, 2016

....ahhh, I think I get it. The Keras backend isn't using the objects themselves, but pointers to the specific objects that were in scope when the function was compiled. The soft reset works because the old weights and the stored weights point to the same variable, which was used in the loss function. But with the hard reset I am creating an entirely new variable in memory that no compiled function is using. I'll try that out.
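
(A tiny Theano illustration of that pointer behaviour, assuming the Theano backend:)

    import numpy as np
    import theano

    w = theano.shared(np.ones(3))
    f = theano.function([], w * 2)      # f captures this specific shared variable

    w.set_value(np.zeros(3))            # 'soft reset': same variable, new value
    print(f())                          # [0. 0. 0.], f sees the change

    w = theano.shared(np.full(3, 5.0))  # 'hard reset': a brand-new variable
    print(f())                          # still [0. 0. 0.], f never sees it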

@fgolemo commented Mar 23, 2016

Okay, yes, that was the issue.

I solved it for the Dense layer (as an example) like this (not taking into account any initial weights that may have been passed):

def reinit_weights(self):
    input_dim = self.input_shape[1]
    # self.init builds a fresh, randomly initialized shared variable
    new_weights = self.init((input_dim, self.output_dim), name='{}_W'.format(self.name))
    # copy its value into the *existing* shared variable, which the
    # compiled loss function holds a pointer to
    self.trainable_weights[0].set_value(new_weights.get_value())

This will write the new weights into the same shared Theano variable that the loss function is using.

The only problem: now I have to write this for each layer, because each layer has a different init. I'll do that in the next few days and then submit a new pull request.

@fgolemo closed this Mar 23, 2016
@Kielland commented Nov 8, 2016

@fchollet or @fgolemo, it would be interesting to hear whether there are any plans to finish up a performant way to randomize/reset weights. I'm too much of a novice to try it myself. Regards.

@fgolemo commented Nov 9, 2016

@Kielland After @fchollet's input I created another pull request, #2079.
But because it took too long to merge, it became outdated, and it would now take serious effort to merge it into the main branch again (and also to adapt it to the functional API, not just the Sequential model). I currently don't have the time for that, but I might in a few weeks, or over Christmas at the latest.

@Kielland commented Nov 9, 2016

@fgolemo, thanks for that. Based on your comments above, I took a shot at solving this in my own model code, but it didn't succeed, so I've asked the community. Feel free to add any insights :-) Thanks!

@Kielland commented
@fgolemo, just to follow up. I found a practical solution by @jkleint here:
https://gist.github.com/jkleint/eb6dc49c861a1c21b612b568dd188668. Thanks!

@pGit1 commented Dec 14, 2017

Why is this closed? There is no reset function for graph models.

Is the solution to place the model-building code inside the for loop, to ensure that the weights and the optimizer state get re-initialized?
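
(That is the common workaround; a minimal sketch, where build_model is a hypothetical factory and X, Y, n_runs stand in for your data and loop count:)

    from keras.models import Sequential
    from keras.layers import Dense

    def build_model():
        # hypothetical factory: construct and compile a fresh model each call
        model = Sequential()
        model.add(Dense(64, input_dim=10, activation='relu'))
        model.add(Dense(1))
        model.compile(optimizer='adam', loss='mse')
        return model

    for run in range(n_runs):
        model = build_model()  # fresh random weights and a fresh optimizer state
        model.fit(X, Y)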

@NEGU93 commented Sep 15, 2021

There is a great solution posted here.

You can use tf.keras.models.clone_model, which returns a new model with randomly initialized weights. Exactly what I needed.
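
(A minimal sketch of that approach; note that clone_model copies only the architecture, not the compile state, so the clone is recompiled here with illustrative settings:)

    import tensorflow as tf

    fresh = tf.keras.models.clone_model(model)   # same architecture, newly initialized weights
    fresh.compile(optimizer='adam', loss='mse')  # the clone is not compiled; settings are illustrative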
