Dropout is a technique that helps prevent overfitting in neural networks by randomly dropping (deactivating) nodes of a layer during training. As a result, the trained model behaves like an ensemble of multiple smaller neural networks.
From previous experiments we got the best values: a learning rate of 0.001 and an inner layer size of 100. We'll use these values for the next experiment, along with different values of the dropout rate:
```python
from tensorflow import keras
from tensorflow.keras.applications.xception import Xception

# Function to define the model, adding a new dense layer and dropout
def make_model(learning_rate=0.01, size_inner=100, droprate=0.5):
    base_model = Xception(weights='imagenet',
                          include_top=False,
                          input_shape=(150, 150, 3))

    base_model.trainable = False

    #########################################

    inputs = keras.Input(shape=(150, 150, 3))
    base = base_model(inputs, training=False)
    vectors = keras.layers.GlobalAveragePooling2D()(base)
    inner = keras.layers.Dense(size_inner, activation='relu')(vectors)
    drop = keras.layers.Dropout(droprate)(inner)  # add dropout layer
    outputs = keras.layers.Dense(10)(drop)
    model = keras.Model(inputs, outputs)

    #########################################

    optimizer = keras.optimizers.Adam(learning_rate=learning_rate)
    loss = keras.losses.CategoricalCrossentropy(from_logits=True)

    # Compile the model
    model.compile(optimizer=optimizer,
                  loss=loss,
                  metrics=['accuracy'])

    return model
```
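As a quick sanity check (our own snippet, not from the video), we can build one model and print its architecture to confirm that the `Dropout` layer sits between the inner dense layer and the output layer:

```python
model = make_model(droprate=0.2)
model.summary()
```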
```python
# Create checkpoint to save the best model for version 3
filepath = './xception_v3_{epoch:02d}_{val_accuracy:.3f}.h5'

checkpoint = keras.callbacks.ModelCheckpoint(filepath=filepath,
                                             save_best_only=True,
                                             monitor='val_accuracy',
                                             mode='max')
```
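Once the training below has run, the callback will have written files matching this pattern to disk. They can be loaded back with `keras.models.load_model`; a minimal sketch, assuming a hypothetical filename (the real epoch and accuracy in the name depend on the run):

```python
from tensorflow import keras

# Hypothetical filename -- substitute the actual checkpoint from your run
model = keras.models.load_model('./xception_v3_28_0.842.h5')
```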
```python
# Set the best values of learning rate and inner layer size based on previous experiments
learning_rate = 0.001
size = 100

# Dict to store results
scores = {}

# List of dropout rates to try
droprates = [0.0, 0.2, 0.5, 0.8]

for droprate in droprates:
    print(droprate)

    model = make_model(learning_rate=learning_rate,
                       size_inner=size,
                       droprate=droprate)

    # Train for longer (epochs=30) because of dropout regularization
    history = model.fit(train_ds, epochs=30, validation_data=val_ds, callbacks=[checkpoint])
    scores[droprate] = history.history

    print()
    print()
```
Note: Because we introduce dropout in the neural network, we need to train the model for longer, hence the number of epochs is set to 30.
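To compare the runs, we can plot the validation accuracy stored in `scores` for each dropout rate. A minimal plotting sketch (our own, assuming the training loop above has populated `scores`):

```python
import matplotlib.pyplot as plt

# One validation-accuracy curve per dropout rate
for droprate, hist in scores.items():
    plt.plot(hist['val_accuracy'], label='droprate=%s' % droprate)

plt.xlabel('epoch')
plt.ylabel('val_accuracy')
plt.legend()
plt.show()
```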
Classes, functions, attributes:

- `tf.keras.layers.Dropout()`: dropout layer that randomly sets input units (i.e., nodes) to 0 with a frequency of `rate` at each step during training
- `rate`: argument that sets the fraction of the input units to drop; a float between 0 and 1
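To see the `rate` argument in action, here is a small standalone sketch (our own example, not from the video). With `training=False` the layer is a no-op; with `training=True` it zeroes roughly a `rate` fraction of the units and scales the rest by `1 / (1 - rate)`:

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dropout(rate=0.5)
x = np.ones((1, 10), dtype='float32')

print(layer(x, training=False).numpy())  # all ones -- dropout inactive
print(layer(x, training=True).numpy())   # ~half zeros, the rest scaled to 2.0
```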
Add notes from the video (PRs are welcome)
- A neural network might learn false patterns: e.g., if it repeatedly sees a certain logo on t-shirts, it might learn that the logo defines the t-shirt, which is wrong since the same logo might also appear on a hoodie.
- To prevent this, we can hide parts of the images from the network during training so it cannot rely on any single feature.
- Dropout achieves a similar effect by randomly "freezing" (dropping) a fraction of a layer's nodes at each training step.
- We compare the model's performance while changing the dropout rate used for regularization.
The notes are written by the community. If you see an error here, please create a PR with a fix.