Dropout is a technique that helps prevent overfitting in neural networks by randomly dropping (deactivating) nodes of a layer during training. As a result, the trained model behaves like an ensemble of multiple smaller neural networks.
From previous experiments we got the best values: a learning rate of 0.001 and an inner layer size of 100. We'll use these values for the next experiment, along with different values of the dropout rate:
```python
from tensorflow import keras
from tensorflow.keras.applications.xception import Xception

# Function to define the model, adding a new dense layer and dropout
def make_model(learning_rate=0.01, size_inner=100, droprate=0.5):
    base_model = Xception(weights='imagenet',
                          include_top=False,
                          input_shape=(150, 150, 3))

    base_model.trainable = False

    #########################################

    inputs = keras.Input(shape=(150, 150, 3))
    base = base_model(inputs, training=False)
    vectors = keras.layers.GlobalAveragePooling2D()(base)
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors)
    drop = keras.layers.Dropout(droprate)(inner)  # add dropout layer
    outputs = keras.layers.Dense(10)(drop)
    model = keras.Model(inputs, outputs)

    #########################################

    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    loss = keras.losses.CategoricalCrossentropy(from_logits=True)

    # Compile the model
    model.compile(optimizer=optimizer,
                  loss=loss,
                  metrics=['accuracy'])

    return model
```
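As a quick sanity check (our own snippet, not from the video), we can build one model and print its architecture to confirm that the `Dropout` layer sits between the inner dense layer and the output layer:

```python
model = make_model(droprate=0.2)
model.summary()
```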
```python
# Create checkpoint to save the best model for version 3
filepath = './xception_v3_{epoch:02d}_{val_accuracy:.3f}.h5'

checkpoint = keras.callbacks.ModelCheckpoint(filepath=filepath,
                                             save_best_only=True,
                                             monitor='val_accuracy',
                                             mode='max')
```
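Once the training below has run, the callback will have written files matching this pattern to disk. They can be loaded back with `keras.models.load_model`; a minimal sketch, assuming a hypothetical filename (the real epoch and accuracy in the name depend on the run):

```python
from tensorflow import keras

# Hypothetical filename -- substitute the actual checkpoint from your run
model = keras.models.load_model('./xception_v3_28_0.842.h5')
```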
```python
# Set the best values of learning rate and inner layer size based on previous experiments
learning_rate = 0.001
size = 100

# Dict to store results
scores = {}

# List of dropout rates to try
droprates = [0.0, 0.2, 0.5, 0.8]

for droprate in droprates:
    print(droprate)

    model = make_model(learning_rate=learning_rate,
                       size_inner=size,
                       droprate=droprate)

    # Train for longer (epochs=30) because of dropout regularization
    history = model.fit(train_ds, epochs=30, validation_data=val_ds, callbacks=[checkpoint])
    scores[droprate] = history.history

    print()
    print()
```
Note: Because we introduce dropout in the neural network, we need to train the model for longer, hence the number of epochs is set to 30.
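To compare the runs, we can plot the validation accuracy stored in `scores` for each dropout rate. A minimal plotting sketch (our own, assuming the training loop above has populated `scores`):

```python
import matplotlib.pyplot as plt

# One validation-accuracy curve per dropout rate
for droprate, hist in scores.items():
    plt.plot(hist['val_accuracy'], label='droprate=%s' % droprate)

plt.xlabel('epoch')
plt.ylabel('val_accuracy')
plt.legend()
plt.show()
```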
Classes, functions, attributes:

- `tf.keras.layers.Dropout()`: dropout layer that randomly sets input units (i.e., nodes) to 0 with a frequency of `rate` at each step during training
- `rate`: argument that sets the fraction of the input units to drop; a float between 0 and 1
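To see the `rate` argument in action, here is a small standalone sketch (our own example, not from the video). With `training=False` the layer is a no-op; with `training=True` it zeroes roughly a `rate` fraction of the units and scales the rest by `1 / (1 - rate)`:

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dropout(rate=0.5)
x = np.ones((1, 10), dtype='float32')

print(layer(x, training=False).numpy())  # all ones -- dropout inactive
print(layer(x, training=True).numpy())   # ~half zeros, the rest scaled to 2.0
```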
Add notes from the video (PRs are welcome)
- A neural network might learn false patterns: e.g., if it repeatedly sees a certain logo on t-shirts, it might learn that the logo defines the t-shirt, which is wrong since the same logo might also appear on a hoodie.
- To prevent this, we can hide parts of the images from the network during training so it cannot rely on any single feature.
- Dropout achieves a similar effect by randomly "freezing" (dropping) a fraction of a layer's nodes at each training step.
- We compare the model's performance while changing the dropout rate used for regularization.
The notes are written by the community. If you see an error here, please create a PR with a fix.