Kawactus

189 posts

@ho_andrew

ex PyTorch. Currently in SF, previously in Toronto and Vancouver.

San Francisco, CA · Joined February 2011
215 Following · 156 Followers

Kawactus @ho_andrew
Really great to see folks like @keunwoochoi experimenting with torchdata.nodes! Looking forward to hearing more feedback from the community and improving on what we're building, cc @_scottas
Keunwoo Choi @keunwoochoi

doing any data loading in Torch? i wrote a beginner's guide (as a beginner) to Torchdata 0.10.1, the first version with data Nodes. (cc @ho_andrew) i already started using it. it's great! this is also @PrescientDesign's first blog post :D medium.com/@prescientdesign/tutorial-scalable-and-modular-data-loading-2025-edition-with-torchdata-0-10-1-35ef774d09ba

Kawactus @ho_andrew
The company with the largest high-quality, long-form video dataset in the world came out with the best video gen model, who would’ve guessed it

Kawactus @ho_andrew
We just introduced Multi-Threading (tested with #NoGIL Python) and Multi-Dataset support to #pytorch dataloading! Check out torchdata.nodes, a beta-release of our new stateful, extensible, and composable dataloading library for #PyTorch! github.com/pytorch/data/r….

Ville @ville_ka
@marksaroufim @soumithchintala Dataloader state saving/loading by having the dataloader, datasets, and samplers conform to the state_dict/checkpointable interface. This would bring harmony with nn.Module. We do this in internal implementations to support granular checkpointing of trainer state.
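
To make the state_dict idea concrete, here is a minimal pure-Python sketch of a sampler that can be checkpointed mid-epoch. `ResumableSampler` and its fields are hypothetical names chosen for illustration; this is not the internal implementation mentioned in the post, nor torchdata's.

```python
import random


class ResumableSampler:
    """Shuffled index sampler that can be checkpointed mid-epoch (hypothetical)."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0
        self.position = 0  # number of indices already yielded this epoch

    def _order(self):
        # Re-derive the epoch's permutation from (seed, epoch) so the
        # checkpoint only needs to store two integers.
        rng = random.Random(self.seed + self.epoch)
        order = list(range(self.num_samples))
        rng.shuffle(order)
        return order

    def __iter__(self):
        order = self._order()
        while self.position < self.num_samples:
            index = order[self.position]
            self.position += 1  # count the index as consumed before yielding
            yield index
        # Epoch finished: reset for the next one.
        self.epoch += 1
        self.position = 0

    def state_dict(self):
        return {"epoch": self.epoch, "position": self.position}

    def load_state_dict(self, state):
        self.epoch = state["epoch"]
        self.position = state["position"]
```

A fresh sampler built with the same seed and given the saved state continues from the next unseen index, which is the same contract nn.Module checkpoints follow.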

Mark Saroufim @marksaroufim
If you could change one thing about PyTorch what would it be?

Kawactus @ho_andrew
@clashluke @cloneofsimo Ah I see what you mean by shared memory now. What you've described sounds roughly correct: since the data is already a tensor in shared memory, sending it over the queue this way would probably work fine

Lucas Nestler @Clashluke
@ho_andrew @cloneofsimo Right, I explicitly set the batch size to 1 and generated more samples in the dummy dataset, where the "dataset" was just tensor[index] for a tensor that resided in shared memory. (or `indices` if batches)

Simo Ryu @cloneofsimo
I wrote a custom C++ dataloader and I'm getting 2000 images/sec with 4 threads. I'm very frustrated to see my reference PyTorch implementation so slow, to the point that I feel like something is straight up wrong with my implementation, which I've been doing for the past 5 years. Is there a hidden config for torch.utils.data.DataLoader that I'm missing that can fill this gap?
[3 images]

Kawactus @ho_andrew
@clashluke @cloneofsimo whoops this screenshot above had incorrect outputs due to my notebook shenanigans, this looks better:
[image]

Kawactus @ho_andrew
@clashluke @cloneofsimo torch.utils.data.DataLoader hasn't changed in some time now... hard to say where the speedup is from without seeing the whole setup, but x[0] will reduce the amount of data being sent over the queue. Lambdas are finicky but _do_ work with multiprocessing under the "fork" context, e.g.
[image]
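
The screenshot is not reproduced here. As a rough stand-in, the sketch below uses only the standard library to show why lambdas are finicky: a lambda cannot be pickled (which the "spawn" start method needs), yet a forked child can call it because it inherits the parent's memory. The helper names are hypothetical, and the "fork" context is assumed to be available (it is not on Windows).

```python
import multiprocessing as mp
import pickle

# Same shape as the collate_fn discussed in this thread.
collate = lambda batch: batch[0]


def lambda_is_picklable(fn):
    """Return True if `fn` survives pickling, which "spawn" workers require."""
    try:
        pickle.dumps(fn)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


def run_lambda_in_forked_worker(batch):
    """Call `collate` inside a forked child and return its result."""
    ctx = mp.get_context("fork")  # assumption: a POSIX platform
    queue = ctx.Queue()
    # The child inherits `collate` through fork, so it is never pickled.
    worker = ctx.Process(target=lambda: queue.put(collate(batch)))
    worker.start()
    result = queue.get(timeout=10)  # read before join so the child can exit
    worker.join()
    return result
```

Only the result travels back through the queue as a pickle; the lambda itself never crosses the process boundary.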

Kawactus @ho_andrew
@rohitgUCF @clashluke @cloneofsimo If you’re using torch.utils.data.DataLoader with num_workers > 0, torch.multiprocessing automatically copies tensors to shared mem before batches get sent from worker to main proc. This is done to get around python/MP, but NoGIL Python could help remove extra copies!
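
The general idea of passing a handle instead of the data can be sketched with the standard library's `multiprocessing.shared_memory`. This is an analogy under stated assumptions, not how torch.multiprocessing is implemented: `send_via_shared_memory` is a hypothetical helper, and both "sides" run in one process to keep the example short.

```python
import pickle
from multiprocessing import shared_memory


def send_via_shared_memory(payload: bytes):
    """Place `payload` in shared memory and pass only a small handle."""
    producer = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        # Sending side (a worker): one copy into the shared segment.
        producer.buf[: len(payload)] = payload
        # Only this (name, size) handle would travel over the queue.
        message = pickle.dumps((producer.name, len(payload)))

        # Receiving side (the main process): attach by name and read.
        name, size = pickle.loads(message)
        consumer = shared_memory.SharedMemory(name=name)
        received = bytes(consumer.buf[:size])
        consumer.close()
    finally:
        producer.close()
        producer.unlink()  # free the segment once both sides are done
    return received, len(message)
```

The pickled message stays tiny regardless of payload size, which is the point of moving the data into shared memory before it is sent.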

Kawactus @ho_andrew
@clashluke @cloneofsimo This is almost correct: worker processes generate batches, run collate_fn, and _then_ send batches to the main process over mp.Queue. The reason this is helpful in most cases is that the default collate function will concatenate/stack tensors, reducing the pickle overhead
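
A small standard-library sketch of the pickle-overhead point, using `array.array` as a stand-in for tensors. The sizes and helper names are hypothetical, and real tensors serialize differently, so treat this as an illustration of the per-object cost only.

```python
import pickle
from array import array

SAMPLE_LEN = 4    # floats per sample (hypothetical)
BATCH_SIZE = 256  # samples per batch (hypothetical)

samples = [array("f", [float(i)] * SAMPLE_LEN) for i in range(BATCH_SIZE)]


def bytes_sent_per_sample(samples):
    """Total pickle size if every sample is serialized on its own."""
    return sum(len(pickle.dumps(s)) for s in samples)


def bytes_sent_collated(samples):
    """Pickle size if the worker first concatenates samples into one buffer."""
    batch = array("f")
    for s in samples:
        batch.extend(s)
    return len(pickle.dumps(batch))
```

Each separate pickle carries its own header and type information, so collating first pays that fixed cost once per batch instead of once per sample.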

Lucas Nestler @Clashluke
torch.utils.data does a lot of stuff that isn't ideal for speed but is necessary for better compatibility. For example, it has workers generate ONE image, which they pickle and send to the main process through a pipe. The main process unpickles them individually, puts them into a buffer, and concatenates the individual samples into batches as needed. One simple speedup is to generate batches in each worker and use `lambda x: x[0]` as collate_fn. Another speedup is to use shared memory where possible. Using a pure Python dataloader you can get crazy speeds. I'm not bottlenecked, even in my video model.
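
The "generate batches in each worker" trick can be sketched without PyTorch. `PreBatchedDataset` and `toy_loader` are hypothetical stand-ins: the dataset returns a whole batch per index, and the toy loader mimics a loader that groups `batch_size` items and then calls `collate_fn`.

```python
class PreBatchedDataset:
    """Each index returns a whole batch rather than one sample (hypothetical)."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, rounding up for a final partial batch.
        return (len(self.data) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, index):
        start = index * self.batch_size
        return self.data[start:start + self.batch_size]


def toy_loader(dataset, batch_size, collate_fn):
    """Stand-in for a loader: group `batch_size` items, then call collate_fn."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        yield collate_fn([dataset[i] for i in range(start, stop)])


unwrap = lambda x: x[0]  # the collate_fn suggested in the post
batches = list(toy_loader(PreBatchedDataset(list(range(10)), 4), 1, unwrap))
```

With torch.utils.data.DataLoader the analogous setup would pass `batch_size=1` and `collate_fn=unwrap`, so the loader's own batching becomes a no-op around the dataset's pre-made batch.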

Kawactus @ho_andrew
@cloneofsimo hi! I work on the PyTorch Dataloader and would be curious to learn more about your torch dataloader and hardware setup. On relatively modest machines I’ve seen ~1-2k img/sec for ImageNet on disk. Also heads up: we are retooling the dataloader to leverage NoGIL/FTPy

Kawactus @ho_andrew
Very excited about our work on torchdata's StatefulDataLoader being made available to @huggingface Datasets and Accelerate users through the efforts of @TheZachMueller and @qlhoest!
Zach Mueller @TheZachMueller

In yet another @PyTorch x @huggingface collaboration, we're ecstatic to announce support for torchdata's new StatefulDataLoader. With this, you can now instantaneously continue iterating through your dataloader when resuming training/checkpointing. Especially valuable for streaming datasets. To enable it, set `use_stateful_dataloader=True`. Check out more about it here: github.com/pytorch/data?t…

Kawactus reposted
Soumith Chintala @soumithchintala
No More GIL! the Python team has officially accepted the proposal. Congrats @colesbury on his multi-year brilliant effort to remove the GIL, and a heartfelt thanks to the Python Steering Council and Core team for a thoughtful plan to make this a reality. discuss.python.org/t/a-steering-c…

Kawactus @ho_andrew
In the name of the father, son, and holey spirit