How to compute per-sample gradients in DeepSpeed #7491
Unanswered
vermouthdky
asked this question in
Q&A
Replies: 0 comments
Hi there, I was wondering how I can compute per-sample gradients in DeepSpeed for multi-GPU training.
It seems that `model_engine.backward()` automatically performs an all-reduce, and `deepspeed.utils.safe_get_full_grad()` returns the gradients after gradient synchronisation. How can I get the local per-sample gradients after the backward pass? Thanks in advance for the kind help!
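For reference, here is a minimal sketch of what I mean by "local per-sample gradients", written in plain PyTorch with `torch.func` rather than any DeepSpeed API. It bypasses `model_engine.backward()` entirely, so nothing is all-reduced. The helper name `per_sample_grads` and the toy `Linear` model are hypothetical, used only for illustration. I am assuming that with DeepSpeed one could pass `model_engine.module` as `model`, and that this only makes sense when each rank holds the full parameters (i.e. not partitioned by ZeRO stage 3); I have not verified how this interacts with DeepSpeed's mixed-precision or optimizer state.

```python
import torch
from torch.func import functional_call, grad, vmap


def per_sample_grads(model, loss_fn, inputs, targets):
    """Hypothetical helper: returns {param_name: tensor of shape (batch, *param.shape)}."""
    # Detached copies so the functional call does not touch the .grad fields.
    params = {name: p.detach() for name, p in model.named_parameters()}
    buffers = {name: b.detach() for name, b in model.named_buffers()}

    def sample_loss(params, buffers, x, y):
        # vmap strips the batch dimension, so add it back for the model.
        out = functional_call(model, (params, buffers), (x.unsqueeze(0),))
        return loss_fn(out, y.unsqueeze(0))

    # grad differentiates w.r.t. the first argument (params);
    # vmap maps over dim 0 of inputs and targets only.
    batched_grad = vmap(grad(sample_loss), in_dims=(None, None, 0, 0))
    return batched_grad(params, buffers, inputs, targets)


# Toy stand-in for the local micro-batch on one rank (assumption: with
# DeepSpeed, `model` would be `model_engine.module`).
torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))
grads = per_sample_grads(model, torch.nn.functional.cross_entropy, x, y)
```

Each entry of `grads` has a leading dimension equal to the local batch size, so `grads["weight"][i]` is the gradient for sample `i` on this rank. What I am looking for is the equivalent of this that works alongside the DeepSpeed engine.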