How to have different learning rates for different parts of a network in PyTorch?
I've got a segmentation network, which contains an encoder and multiple decoders. I want to have different learning rates for each part. How can I make it happen?
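First, the parts need to be separate submodules registered under distinct attribute names (for example self.encoder and self.decoder1), because those attribute names become the prefixes of the parameter names returned by named_parameters(). After that, you just filter and collect the parameters by name.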
encoder = []
decoder = []
for name, param in net.named_parameters():
    if 'encoder' in name:
        encoder.append(param)
    elif 'decoder' in name:
        decoder.append(param)
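One thing to watch with name filtering: a parameter whose name matches neither string is silently left out of both lists and will never be updated. A minimal sketch of a stricter version follows. The helper name split_params_by_name is hypothetical (it is not part of PyTorch), and it only assumes an iterable of (name, param) pairs such as the one net.named_parameters() yields.

```python
def split_params_by_name(named_params, keys):
    """Group (name, param) pairs by the first key that occurs in the name.

    Raises ValueError when a name matches no key, so that no parameter is
    silently left out of the optimizer.
    """
    groups = {key: [] for key in keys}
    for name, param in named_params:
        for key in keys:
            if key in name:
                groups[key].append(param)
                break
        else:
            # The inner loop finished without a match.
            raise ValueError(f"parameter {name!r} matches none of {keys}")
    return groups

# Intended use (assumes `net` is your model):
# groups = split_params_by_name(net.named_parameters(), ['encoder', 'decoder'])
# encoder, decoder = groups['encoder'], groups['decoder']
```

If some parameters are deliberately excluded from training, catching the error or adding a third key for them keeps that choice explicit.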
Then you pass each list to the optimizer as its own parameter group. Every group keeps its own hyperparameters (such as lr), so you can set them independently.
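# nesterov=True requires momentum > 0; the value 0.9 is an assumption, pick your own
optimizer = torch.optim.SGD([{'params': encoder}, {'params': decoder}], lr=DEFAULT_LRATE, momentum=0.9, nesterov=True)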
...
optimizer.param_groups[0]['lr'] = L_RATE1
optimizer.param_groups[1]['lr'] = L_RATE2
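As an alternative to assigning the rates afterwards, each group dict can carry its own 'lr' key when the optimizer is built, and any group without one falls back to the default lr. The following is only a sketch: TinySegNet is a hypothetical stand-in for your network (one encoder, two decoders), and the learning rates and momentum are made-up values.

```python
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Hypothetical stand-in: one encoder shared by two decoders."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.decoder1 = nn.Conv2d(8, 2, kernel_size=1)
        self.decoder2 = nn.Conv2d(8, 2, kernel_size=1)

    def forward(self, x):
        features = torch.relu(self.encoder(x))
        return self.decoder1(features), self.decoder2(features)

net = TinySegNet()

optimizer = torch.optim.SGD(
    [
        {"params": net.encoder.parameters(), "lr": 1e-4},   # slow encoder
        {"params": net.decoder1.parameters(), "lr": 1e-3},  # faster decoder
        {"params": net.decoder2.parameters()},              # uses the default lr
    ],
    lr=1e-2,        # default for groups that do not set their own lr
    momentum=0.9,   # required because nesterov=True
    nesterov=True,
)

# One learning rate per group, in the order the groups were passed in.
lrs = [group["lr"] for group in optimizer.param_groups]
```

Here the submodules are addressed directly (net.encoder.parameters()), which avoids string matching altogether when the parts are available as attributes.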