Gradient Accumulation with Dual (optimizer, scheduler) Training #14999
Unanswered
celsofranssa asked this question in code help: NLP / ASR / TTS
Hello, Lightning community,
I am using a dual (optimizer, scheduler) training setup, as shown in the code snippet below:
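(A minimal sketch of the configuration; the encoder/decoder modules, the Adam optimizers, the StepLR schedulers, and all hyperparameters below are placeholders for illustration, not the actual model.)

```python
import torch
import pytorch_lightning as pl


class DualOptModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # Hypothetical sub-modules standing in for the real model parts.
        self.encoder = torch.nn.Linear(128, 64)
        self.decoder = torch.nn.Linear(64, 128)

    def configure_optimizers(self):
        optimizer_1 = torch.optim.Adam(self.encoder.parameters(), lr=1e-3)
        optimizer_2 = torch.optim.Adam(self.decoder.parameters(), lr=1e-3)
        scheduler_1 = torch.optim.lr_scheduler.StepLR(optimizer_1, step_size=10)
        scheduler_2 = torch.optim.lr_scheduler.StepLR(optimizer_2, step_size=10)
        # Two (optimizer, scheduler) pairs, each stepping with "frequency": 1,
        # so the trainer alternates between them on consecutive steps.
        return (
            {"optimizer": optimizer_1, "lr_scheduler": scheduler_1, "frequency": 1},
            {"optimizer": optimizer_2, "lr_scheduler": scheduler_2, "frequency": 1},
        )
```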
With `"frequency": 1` on both optimizers, the trainer calls `optimizer_1` in step `i` and `optimizer_2` in step `i+1`. Therefore, is there an approach to combine gradient accumulation with this optimization setup, so that `optimizer_1` uses the accumulated gradient from steps `i-1` and `i` while `optimizer_2` uses the accumulated gradient from steps `i` and `i+1`?
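For concreteness, here is one way the desired overlap could be sketched with Lightning's manual optimization: each batch's gradients stay in `.grad` until the optimizer that owns those parameters steps and zeroes them, so the two optimizers consume two-batch windows offset by one step. The module below (the even/odd split on `batch_idx`, the encoder/decoder parameter groups, and the placeholder loss) is an illustrative assumption, not a confirmed recipe.

```python
import torch
import pytorch_lightning as pl


class OverlappingAccumulationModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.automatic_optimization = False  # take over the optimization loop
        self.encoder = torch.nn.Linear(128, 64)  # hypothetical parameter group for optimizer_1
        self.decoder = torch.nn.Linear(64, 128)  # hypothetical parameter group for optimizer_2

    def configure_optimizers(self):
        optimizer_1 = torch.optim.Adam(self.encoder.parameters(), lr=1e-3)
        optimizer_2 = torch.optim.Adam(self.decoder.parameters(), lr=1e-3)
        return optimizer_1, optimizer_2

    def training_step(self, batch, batch_idx):
        optimizer_1, optimizer_2 = self.optimizers()

        loss = self._compute_loss(batch)  # hypothetical loss touching both parameter groups
        self.manual_backward(loss)        # adds this batch's gradients to whatever is already in .grad

        if batch_idx % 2 == 0:
            # optimizer_1 consumes the gradients accumulated since its last zero_grad(),
            # i.e. batches (i - 1) and i (only batch 0 on the very first step).
            optimizer_1.step()
            optimizer_1.zero_grad()
        else:
            # optimizer_2's parameters were not zeroed on the previous step, so it
            # consumes the gradients of batches i and (i + 1).
            optimizer_2.step()
            optimizer_2.zero_grad()

    def _compute_loss(self, batch):
        # Placeholder objective: reconstruct the input through encoder + decoder.
        x = batch
        return torch.nn.functional.mse_loss(self.decoder(self.encoder(x)), x)
```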