TensorFlow Keras: freeze / unfreeze a specific layer or multiple layers based on layer name.
Freezing a layer keeps its current weights fixed so they are not updated when the model is trained (during model fitting). Simply put, frozen layers are not trainable.
Freezing layers is a technique used predominantly in transfer learning and fine-tuning, where we want to retain the weights of layers from an already trained model such as ResNet or MobileNet.
Alternatively, we may want to train only certain layers of the current model rather than the whole model; we can freeze and unfreeze layers according to our needs and then start the fitting process.
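For instance, freezing an entire pretrained base in a transfer-learning setup can be sketched as below. This is a minimal illustration, not the model built later in this post: the tiny stand-in "base" and the layer names ("feature_0", "new_head", etc.) are hypothetical, standing in for a real pretrained network.

```python
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

# a tiny stand-in for a pretrained feature extractor
# (in practice this could be ResNet or MobileNet from tf.keras.applications)
base_input = Input(shape=(3,), name="base_input")
x = Dense(8, activation="relu", name="feature_0")(base_input)
x = Dense(8, activation="relu", name="feature_1")(x)
base = Model(inputs=base_input, outputs=x, name="base")

# freeze the whole base in one step:
# setting trainable on a model propagates to every layer inside it
base.trainable = False

# attach a fresh, trainable head for the new task
head = Dense(1, name="new_head")(base.output)
model = Model(inputs=base.input, outputs=head)
```

Only the new head's weights will be updated when this model is fitted; the base keeps its pretrained weights.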
Create a new TensorFlow model using the functional API; execute the given code (listed in full at the end of this post) to generate the following model.
The generated model is stored in a variable called "model".
# viewing the model info
# pass "show_trainable=True"
# to see whether a layer is trainable
model.summary(show_trainable=True)
Model: "model_1"
____________________________________________________________________________
 Layer (type)                 Output Shape              Param #   Trainable
============================================================================
 input_layer (InputLayer)     [(None, 3)]               0         Y
 dense_0 (Dense)              (None, 8)                 32        Y
 dense_1 (Dense)              (None, 8)                 72        Y
 dense_2 (Dense)              (None, 8)                 72        Y
 output_layer (Dense)         (None, 1)                 9         Y
============================================================================
Total params: 185 (740.00 Byte)
Trainable params: 185 (740.00 Byte)
Non-trainable params: 0 (0.00 Byte)
____________________________________________________________________________
The created model has one input layer, three dense layers, and one output layer. Each layer has a "trainable" attribute that accepts a boolean value, letting us make the layer trainable or freeze it. If trainable is set to True, the layer weights are updated during the model training process. All layers are initialized with trainable set to True.
We can freeze a layer, retaining its weights and preventing them from updating during training, by simply setting the trainable attribute to False. The model summary also displays the totals of trainable and non-trainable parameters.
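The trainable flags and parameter counts shown by model.summary() can also be read programmatically. As a quick sketch (a small stand-in model mirroring the structure described above is rebuilt here so the snippet is self-contained):

```python
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

# small stand-in model mirroring the structure described above
inputs = Input(shape=(3,), name="input_layer")
x = Dense(8, activation="relu", name="dense_0")(inputs)
x = Dense(8, activation="relu", name="dense_1")(x)
x = Dense(8, activation="relu", name="dense_2")(x)
outputs = Dense(1, name="output_layer")(x)
model = Model(inputs=inputs, outputs=outputs)

# every layer exposes its trainable flag directly
for layer in model.layers:
    print(layer.name, layer.trainable)

# the counts reported by model.summary() come from these weight lists
trainable_params = sum(int(tf.size(w)) for w in model.trainable_weights)
non_trainable_params = sum(int(tf.size(w)) for w in model.non_trainable_weights)
print(trainable_params, non_trainable_params)  # 185 0
```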
Freeze a layer if the current layer name exactly matches a string. Here, the layer named "dense_0" is frozen, that is, its trainable attribute is set to False, so its weights are not updated during the training process.
for layer in model.layers:
    # freeze layer if current layer name is same as string
    if layer.name == "dense_0":
        layer.trainable = False
model.summary(show_trainable=True)
Model: "model_1"
____________________________________________________________________________
 Layer (type)                 Output Shape              Param #   Trainable
============================================================================
 input_layer (InputLayer)     [(None, 3)]               0         Y
 dense_0 (Dense)              (None, 8)                 32        N
 dense_1 (Dense)              (None, 8)                 72        Y
 dense_2 (Dense)              (None, 8)                 72        Y
 output_layer (Dense)         (None, 1)                 9         Y
============================================================================
Total params: 185 (740.00 Byte)
Trainable params: 153 (612.00 Byte)
Non-trainable params: 32 (128.00 Byte)
____________________________________________________________________________
Unfreeze a layer by setting the trainable attribute to True. Here, if the current layer name exactly matches the string "dense_0", the layer is unfrozen, that is, trainable is set to True. The weights of this layer will be updated during the training process.
for layer in model.layers:
    # unfreeze layer if current layer name is same as string
    if layer.name == "dense_0":
        layer.trainable = True
model.summary(show_trainable=True)
Model: "model_1"
____________________________________________________________________________
 Layer (type)                 Output Shape              Param #   Trainable
============================================================================
 input_layer (InputLayer)     [(None, 3)]               0         Y
 dense_0 (Dense)              (None, 8)                 32        Y
 dense_1 (Dense)              (None, 8)                 72        Y
 dense_2 (Dense)              (None, 8)                 72        Y
 output_layer (Dense)         (None, 1)                 9         Y
============================================================================
Total params: 185 (740.00 Byte)
Trainable params: 185 (740.00 Byte)
Non-trainable params: 0 (0.00 Byte)
____________________________________________________________________________
Freeze a layer if its name ends with a specific string. Here, any layer whose name ends with "_1" is frozen and will not be updated during the training process. Only the "dense_1" layer matches the criteria, so only that layer is frozen.
for layer in model.layers:
    # freeze layers if current layer name ends with "_1"
    if layer.name.endswith("_1"):
        layer.trainable = False
model.summary(show_trainable=True)
Model: "model_1"
____________________________________________________________________________
 Layer (type)                 Output Shape              Param #   Trainable
============================================================================
 input_layer (InputLayer)     [(None, 3)]               0         Y
 dense_0 (Dense)              (None, 8)                 32        Y
 dense_1 (Dense)              (None, 8)                 72        N
 dense_2 (Dense)              (None, 8)                 72        Y
 output_layer (Dense)         (None, 1)                 9         Y
============================================================================
Total params: 185 (740.00 Byte)
Trainable params: 113 (452.00 Byte)
Non-trainable params: 72 (288.00 Byte)
____________________________________________________________________________
We can also use other string methods such as "startswith" to freeze layers whose names start with "dense_". Here, three layers match the criteria and are frozen.
for layer in model.layers:
    # freeze layers if current layer name starts with "dense_"
    if layer.name.startswith("dense_"):
        layer.trainable = False
model.summary(show_trainable=True)
Model: "model_1"
____________________________________________________________________________
 Layer (type)                 Output Shape              Param #   Trainable
============================================================================
 input_layer (InputLayer)     [(None, 3)]               0         Y
 dense_0 (Dense)              (None, 8)                 32        N
 dense_1 (Dense)              (None, 8)                 72        N
 dense_2 (Dense)              (None, 8)                 72        N
 output_layer (Dense)         (None, 1)                 9         Y
============================================================================
Total params: 185 (740.00 Byte)
Trainable params: 9 (36.00 Byte)
Non-trainable params: 176 (704.00 Byte)
____________________________________________________________________________
We can also write conditional logic that checks whether the current layer name exists in a list and then freezes or unfreezes layers of the TensorFlow model. Here we have two lists containing the names of three layers. The layer names "dense_0" and "dense_2" are in the unfreeze_layer_list, so those layers become trainable (unfrozen). The freeze_layer_list contains "output_layer", so the output layer is frozen and becomes non-trainable.
unfreeze_layer_list = ["dense_0", "dense_2"]
freeze_layer_list = ["output_layer"]
for layer in model.layers:
    # unfreeze layers if current layer name exists in unfreeze_layer_list
    if layer.name in unfreeze_layer_list:
        layer.trainable = True
    # freeze layers if current layer name exists in freeze_layer_list
    elif layer.name in freeze_layer_list:
        layer.trainable = False
model.summary(show_trainable=True)
Model: "model_1"
____________________________________________________________________________
 Layer (type)                 Output Shape              Param #   Trainable
============================================================================
 input_layer (InputLayer)     [(None, 3)]               0         Y
 dense_0 (Dense)              (None, 8)                 32        Y
 dense_1 (Dense)              (None, 8)                 72        N
 dense_2 (Dense)              (None, 8)                 72        Y
 output_layer (Dense)         (None, 1)                 9         N
============================================================================
Total params: 185 (740.00 Byte)
Trainable params: 104 (416.00 Byte)
Non-trainable params: 81 (324.00 Byte)
____________________________________________________________________________
Make sure to compile the model again after freezing or unfreezing layers; the change in trainable status only takes effect for training once the model is recompiled.
# recompiling the model
# pass the same optimizer, loss, and metrics used earlier
model.compile()
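As a self-contained sketch of what recompiling picks up (the tiny two-layer model here is illustrative, not the one built in this post): model.trainable_weights reflects a freeze immediately, but the compiled training step is built from that list at compile time, so recompile before fitting again.

```python
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

# tiny stand-in model with two weighted layers
inputs = Input(shape=(3,), name="input_layer")
x = Dense(8, activation="relu", name="dense_0")(inputs)
outputs = Dense(1, name="output_layer")(x)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="sgd", loss="mse")

n_before = len(model.trainable_weights)  # kernel + bias of both layers

# freeze one layer, then recompile so training uses the new trainable set
model.get_layer("dense_0").trainable = False
model.compile(optimizer="sgd", loss="mse")
n_after = len(model.trainable_weights)
print(n_before, n_after)  # 4 2
```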
Full code to create a TensorFlow model, together with the code to freeze or unfreeze layers.
# importing tensorflow
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input
# function to create dense layers
def dense_layer_creator(units, name_for_layer, last_layer, activation=None):
    # Dense treats activation=None as "no activation",
    # so one call covers layers with and without an activation
    result_output_layer = Dense(
        units=units, name=name_for_layer, activation=activation
    )(last_layer)
    return result_output_layer
# defining the input shape
# this would depend on "X" data shape
input_shape = (3,)
# defining the count of dense layers
dense_layer_count = 3
# function to create a new model
# uses the functional api of tensorflow
# this function creates a single-output regression model
def build_model():
    # defining and creating an input layer
    input_layer = Input(shape=input_shape, name="input_layer")
    previous_layer = input_layer
    dense_layers = None
    # creating multiple dense layers in a loop
    for i in range(dense_layer_count):
        if i != 0:
            previous_layer = dense_layers
        dense_layers = dense_layer_creator(
            units=8,
            name_for_layer=f"dense_{i}",
            last_layer=previous_layer,
            activation="relu",
        )
    # not passing the activation argument for the last layer
    # output layers of regression models usually have no activation
    # use an activation if it is required
    output_layer = dense_layer_creator(
        units=1, name_for_layer="output_layer", last_layer=dense_layers
    )
    # the created model has to be compiled before fitting / training
    model = Model(inputs=input_layer, outputs=output_layer)
    return model
# generating the model and storing
model = build_model()
# using stochastic gradient descent
# tensorflow ships with many other optimizers
optimizer = tf.keras.optimizers.SGD(learning_rate=0.001)
loss_strategy = tf.keras.losses.mean_squared_error
metric_watch = tf.keras.metrics.RootMeanSquaredError()
# compiling the model with the required options
model.compile(
    optimizer=optimizer,
    loss=loss_strategy,
    metrics=[metric_watch],
)
# viewing the model info
# pass "show_trainable=True"
# to see whether a layer is trainable
model.summary(show_trainable=True)
for layer in model.layers:
    # freeze layer if current layer name is same as string
    if layer.name == "dense_0":
        layer.trainable = False
model.summary(show_trainable=True)
for layer in model.layers:
    # unfreeze layer if current layer name is same as string
    if layer.name == "dense_0":
        layer.trainable = True
model.summary(show_trainable=True)
for layer in model.layers:
    # freeze layers if current layer name ends with "_1"
    if layer.name.endswith("_1"):
        layer.trainable = False
model.summary(show_trainable=True)
for layer in model.layers:
    # freeze layers if current layer name starts with "dense_"
    if layer.name.startswith("dense_"):
        layer.trainable = False
model.summary(show_trainable=True)
unfreeze_layer_list = ["dense_0", "dense_2"]
freeze_layer_list = ["output_layer"]
for layer in model.layers:
    # unfreeze layers if current layer name exists in unfreeze_layer_list
    if layer.name in unfreeze_layer_list:
        layer.trainable = True
    # freeze layers if current layer name exists in freeze_layer_list
    elif layer.name in freeze_layer_list:
        layer.trainable = False
model.summary(show_trainable=True)