sar.gather_grads

gather_grads(model: Module)

Sum the parameter gradients from all workers. This must be called after the backward pass and before optimizer.step().

Parameters:

model (torch.nn.Module) – The model whose parameter gradients will be synchronized (summed) across all workers. The model architecture must be the same on all workers.
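The semantics can be illustrated with a minimal pure-Python sketch: each worker holds one gradient per parameter, and synchronization replaces every worker's gradients with the elementwise sum across all workers. This is illustrative only; the function names, parameter names, and data layout below are hypothetical, and the real gather_grads uses collective communication rather than an in-memory loop.

```python
def sum_gradients(worker_grads):
    """Sum per-parameter gradients elementwise across workers.

    worker_grads: list with one entry per worker, each a
    {param_name: [float, ...]} mapping (hypothetical layout).
    Returns a list of the same length where every worker holds
    the identical summed gradients.
    """
    summed = {}
    for grads in worker_grads:
        for name, g in grads.items():
            acc = summed.setdefault(name, [0.0] * len(g))
            for i, v in enumerate(g):
                acc[i] += v
    # After synchronization, every worker sees the same sums.
    return [dict(summed) for _ in worker_grads]

# Two hypothetical workers, one parameter "w" with a 2-element gradient.
workers = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
synced = sum_gradients(workers)
```

In actual training code the call sits between the backward pass and the optimizer update: compute the loss, call loss.backward(), then sar.gather_grads(model), then optimizer.step().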