SAR configuration

class Config

General configuration for the SAR library.

disable_sr : bool

If True, disables sequential re-materialization of the computational graph during the backward pass; the computational graph is then constructed normally during the forward pass. Default: False.
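A minimal sketch of how such a flag might be toggled, assuming the settings are plain class-level attributes. The Config class below is a hypothetical stand-in that only mirrors the documented field names and defaults. It is not the real sar import, and whether the library expects class-level assignment is an assumption.

```python
class Config:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    disable_sr: bool = False
    max_collective_size: int = 0
    pipeline_depth: int = 1


# Opt out of sequential re-materialization before training starts.
Config.disable_sr = True

print(Config.disable_sr, Config.max_collective_size, Config.pipeline_depth)
```

With the real library, the equivalent assignment would presumably be made once, before any model or communication setup reads the configuration.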

max_collective_size : int

Limits the size of the data in each torch.distributed.all_to_all collective call. If non-zero, the sar.comms.all_to_all wrapper splits the collective into multiple torch.distributed.all_to_all calls so that the size of the data in each call is below max_collective_size. With the default of 0, the call is not split. Default: 0.
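The splitting described above can be illustrated with a small, library-independent sketch. The helper name split_ranges is hypothetical, and the sketch assumes the limit is counted in elements and treated as an inclusive upper bound. The documentation does not state the unit, and this is not the actual sar.comms.all_to_all implementation.

```python
from typing import List, Tuple


def split_ranges(total_size: int, max_collective_size: int) -> List[Tuple[int, int]]:
    """Return (start, end) ranges that together cover total_size elements.

    A limit of 0 means "do not split", matching the documented default.
    """
    if max_collective_size <= 0 or total_size <= max_collective_size:
        return [(0, total_size)]
    return [
        (start, min(start + max_collective_size, total_size))
        for start in range(0, total_size, max_collective_size)
    ]


print(split_ranges(10, 0))  # [(0, 10)]
print(split_ranges(10, 4))  # [(0, 4), (4, 8), (8, 10)]
```

Each range would correspond to one smaller collective call, so a lower limit trades more calls for a smaller payload per call.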

pipeline_depth : int

Sets the communication pipeline depth for sequential aggregation and sequential re-materialization. In a separate thread, SAR pre-fetches up to pipeline_depth remote partitions into a data queue, which the compute thread then processes. Higher values increase memory consumption but may hide communication latency. Default: 1.
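The pre-fetching pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not SAR's implementation: run_pipeline, fetch, and process are hypothetical names, and integers stand in for remote partitions.

```python
import queue
import threading
from typing import Callable, List


def run_pipeline(
    num_partitions: int,
    pipeline_depth: int,
    fetch: Callable[[int], int],
    process: Callable[[int], int],
) -> List[int]:
    # A bounded queue caps how many fetched partitions wait to be processed.
    data_queue: queue.Queue = queue.Queue(maxsize=pipeline_depth)
    sentinel = object()  # marks the end of the stream

    def prefetch() -> None:
        for idx in range(num_partitions):
            data_queue.put(fetch(idx))  # blocks while the queue is full
        data_queue.put(sentinel)

    worker = threading.Thread(target=prefetch, daemon=True)
    worker.start()

    results = []
    while True:
        item = data_queue.get()
        if item is sentinel:
            break
        results.append(process(item))  # compute overlaps with the next fetch
    worker.join()
    return results


results = run_pipeline(4, 1, fetch=lambda i: i * 10, process=lambda x: x + 1)
print(results)  # [1, 11, 21, 31]
```

In this sketch a larger pipeline_depth lets the fetch thread run further ahead of the compute loop, at the cost of holding more fetched items in memory. The fetch thread may also hold one extra item while it waits for space in the queue.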