Any recommended ways to make PyTorch DataLoader (torch.utils.data.DataLoader) work in a distributed environment, both single machine and multiple machines? Can it be done without DistributedDataParallel?
1 Answer
You may want to clarify your question. DistributedDataParallel (abbreviated DDP) is what you use to train a model in a distributed environment; this question seems to be asking how to arrange dataset loading for distributed training.
First of all, torch.utils.data.DataLoader works for both distributed and non-distributed training; usually there is nothing special you need to do to the DataLoader itself.
However, the sampling strategy differs between these two modes. For distributed training, you need to pass a sampler to the DataLoader (the sampler argument), and using torch.utils.data.distributed.DistributedSampler is the simplest way.
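A minimal sketch of how the pieces fit together, assuming the process group has already been initialized (e.g. launched with torchrun); the dataset here is just a stand-in:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Stand-in dataset; replace with your own.
dataset = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))

# DistributedSampler splits the dataset across processes so each rank
# sees a disjoint shard of the data. It infers world size and rank from
# the default process group.
sampler = DistributedSampler(dataset, shuffle=True)

loader = DataLoader(
    dataset,
    batch_size=32,
    sampler=sampler,   # do not also pass shuffle=True to the DataLoader
    num_workers=2,
    pin_memory=True,
)

for epoch in range(10):
    # set_epoch reseeds the shuffle so each epoch produces different shards.
    sampler.set_epoch(epoch)
    for inputs, labels in loader:
        ...  # forward/backward pass, typically with a DDP-wrapped model
```

Because DistributedSampler gets the world size and rank from the process group, the same code works on a single multi-GPU machine or across several machines. In non-distributed runs you simply drop the sampler and use shuffle=True on the DataLoader instead.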