StepLR
Decays the learning rate of each parameter group by a fixed multiplicative factor every N epochs.
StepLR scales the initial learning rate by a multiplicative factor (gamma). The decay is applied every N epochs, or every N evaluation periods when iteration-based training is used, where N is set by the user.

Major Parameters

Step Size

The learning rate is decayed every N epochs; this "N" is the step size.

Gamma

It is the multiplicative factor by which the learning rate is decayed.
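
Taken together, the two parameters give the learning rate at any epoch in closed form:

lr_{epoch} = lr_{base} \cdot gamma^{\lfloor epoch / step\_size \rfloor}

where lr_{base} is the initial learning rate.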

Mathematical Demonstration

Let us suppose the step size is set to 30, gamma is 0.1, and the base learning rate is 0.05. Then:

for 0 \le epoch < 30, lr = 0.05
for 30 \le epoch < 60, lr = 0.05 \cdot 0.1 = 0.005
for 60 \le epoch < 90, lr = 0.05 \cdot 0.1^2 = 0.0005
... and so on
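
This schedule can also be checked directly in code. The short sketch below drives StepLR with a throwaway parameter and prints the learning rate at a few epoch boundaries; the dummy parameter and the chosen epochs are only for illustration.

import torch

# Throwaway parameter and optimizer, used only to drive the scheduler
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.05)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):
    if epoch in (0, 29, 30, 60):
        # prints 0.05, 0.05, 0.005, 0.0005 respectively
        print(epoch, optimizer.param_groups[0]["lr"])
    optimizer.step()   # a real run would compute a loss and call backward() first
    scheduler.step()   # advance the schedule once per epoch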

Code Implementation

import torch

# Toy model, loss function, and dataset so the example is self-contained
model = torch.nn.Linear(2, 2)
loss_fn = torch.nn.MSELoss()
dataset = [(torch.randn(2), torch.randn(2)) for _ in range(10)]

learning_rate = 0.05
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=0.01, amsgrad=False)
# Decay the learning rate by gamma=0.1 every 30 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1, last_epoch=-1)

for epoch in range(90):
    for input, target in dataset:
        optimizer.zero_grad()
        output = model(input)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()
    # Step the scheduler once per epoch, after the optimizer updates
    scheduler.step()
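
If training is driven by iterations rather than epochs, the scheduler can instead be stepped once per iteration, in which case step_size counts iterations. A minimal sketch, reusing the objects from the example above (step_size=3000 here is a hypothetical value):

# Per-iteration stepping: step_size is now measured in iterations
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3000, gamma=0.1)
for iteration, (input, target) in enumerate(dataset):
    optimizer.zero_grad()
    loss = loss_fn(model(input), target)
    loss.backward()
    optimizer.step()
    scheduler.step()  # called every iteration, so the lr drops every 3000 iterations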