Adv. PyTorch: Modifying the Last Layer
The pre-trained models provided in the torchvision package in PyTorch are trained on the ImageNet dataset and can be used out of the box on that dataset. But oftentimes you want to use these models on other image datasets, or even on your own custom dataset. This usually requires modifying and fine-tuning the model to work with the new dataset. Changing the output dimension of the last layer in the model is usually among the first changes you need to make, and that's the focus of this post.
Let's start by loading a pre-trained model from the torchvision package. We use the VGG16 model, pre-trained on the ImageNet dataset with 1000 object categories. Let's take a look at the modules of this model:
import torch
import torch.nn as nn
import torchvision.models as models

# Load VGG16 with ImageNet weights. (On newer torchvision releases, pretrained=True
# is deprecated in favor of models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).)
vgg16 = models.vgg16(pretrained=True)
print(vgg16._modules.keys())
odict_keys(['features', 'avgpool', 'classifier'])
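These submodules are also exposed as attributes on the model, so vgg16.classifier refers to the same object as vgg16._modules['classifier']:

# Attribute access and _modules access return the same submodule object.
print(vgg16.classifier is vgg16._modules['classifier'])  # True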
We are only interested in the last layer, so let’s print the layers in the ‘classifier’ module:
print(vgg16._modules['classifier'])
Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
As expected, the output dimension of the last layer is 1000. Let's assume we are going to use this model on the COCO dataset, which has 80 object categories. To change the output dimension of the model to 80, we simply replace the last sub-layer with a new Linear layer. The Linear layer takes two required arguments: in_features and out_features. The in_features stays the same as before, and out_features is going to be 80:
in_features = vgg16._modules['classifier'][-1].in_features
out_features = 80
vgg16._modules['classifier'][-1] = nn.Linear(in_features, out_features, bias=True)
print(vgg16._modules['classifier'])
Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=80, bias=True)
)
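As a quick sanity check (a minimal sketch, not part of the original steps), we can pass a dummy batch through the modified model and confirm that the output now has 80 columns:

# Dummy forward pass: VGG16 expects 3x224x224 inputs.
vgg16.eval()
with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224)
    out = vgg16(dummy)
print(out.shape)  # torch.Size([1, 80])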
That's it! The output dimension is now 80. Keep in mind that by replacing the last layer we discarded its learned parameters, so you need to fine-tune the model on the new dataset to learn these parameters again.
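As a rough sketch of what that fine-tuning could look like (assuming a single-label classification setup with a DataLoader named train_loader and a device variable, neither of which is defined in this post), a common recipe is to freeze the pre-trained layers and train only the newly added layer:

# Minimal fine-tuning sketch; train_loader and device are assumed to exist.
for param in vgg16.features.parameters():
    param.requires_grad = False  # freeze the convolutional backbone

criterion = nn.CrossEntropyLoss()
# Only the parameters of the newly added 80-way layer are optimized here.
optimizer = torch.optim.SGD(vgg16.classifier[-1].parameters(), lr=1e-3, momentum=0.9)

vgg16 = vgg16.to(device)
vgg16.train()
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    outputs = vgg16(images)            # shape: (batch_size, 80)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

Depending on how different the new dataset is from ImageNet, you may later unfreeze more layers and continue training with a smaller learning rate.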