initialize_comms(_rank: int, _world_size: int, master_ip_address: str, backend: str, _comm_device: device | None = None, master_port_number: int = 12345)

Initialize PyTorch's distributed communication library.

  • _rank (int) – Rank of the current worker

  • _world_size (int) – Number of workers; this is the same as the number of graph partitions

  • master_ip_address (str) – IP address of the master worker (worker with rank 0)

  • backend (str) – Backend to use. Can be ccl, nccl, mpi, or gloo

  • _comm_device (torch.device, optional) – The device on which tensors must reside in order to be transmitted through the backend. If not provided, the device is inferred from the backend type

  • master_port_number (int) – The port number on the master worker
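The device inference described for _comm_device typically maps the GPU-based backend to CUDA and the CPU-oriented backends to the CPU. A minimal sketch of such logic (the helper name infer_comm_device is hypothetical and not part of this API):

```python
def infer_comm_device(backend: str) -> str:
    """Hypothetical sketch: pick a tensor device based on the backend.

    NCCL communicates CUDA tensors, while ccl, mpi, and gloo
    commonly communicate CPU tensors.
    """
    backend = backend.lower()
    if backend == "nccl":
        return "cuda"
    if backend in ("ccl", "mpi", "gloo"):
        return "cpu"
    raise ValueError(f"unknown backend: {backend}")
```

In actual use, the returned device name would be wrapped in torch.device before tensors are moved onto it for transmission.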