Ali Athar
@aliathar94
56 posts
Applied Scientist at Amazon. Prev: Research Scientist at ByteDance; PhD from RWTH Aachen; MSc from TUM.
Joined February 2010
272 Following · 258 Followers
Ali Athar @aliathar94:
For those of you looking to extend your Video-LLMs with spatial intelligence capabilities, this dataset is a potential game-changer. ViCaS is the largest human-annotated video dataset that provides both captions and grounded segmentation masks. (3/4)
Ali Athar @aliathar94:
In our CVPR'25 paper, we introduced the ViCaS dataset, which contains 20,000+ videos with detailed video captions as well as pixel-precise, phrase-grounded masks for selected objects. (1/4)
Ali Athar @aliathar94:
@giffmana Never said it was a perfect solution😅 Although given PyTorch's popularity, the average grad student these days is probably quite familiar with PyTorch API mechanics.
Ali Athar @aliathar94:
@CVPR @_vztu Any ETA (even approximate) on when the results will be out?
Zhengzhong Tu @_vztu:
😟Hey friends, star/reply to this tweet if you're also waiting for @CVPR decisions
Ali Athar @aliathar94:
@AljosaOsep @Pandoro_o I think the confusion arose because the deadline was written as 15 Nov, 2 AM CT (the conference venue's timezone), and people just assumed it ran until the end of the day Pacific time without thinking much about the timezone.
Aljosa @AljosaOsep:
@Pandoro_o Damn, that happened? 😱 I should have tweeted this yesterday!
Aljosa @AljosaOsep:
For everyone stressed with #CVPR2025 deadline: imagine learning yesterday that the deadline is today, and not Friday (true story).
Ali Athar retweeted
Jonathon Luiten @JonathonLuiten:
📣📣 Hiring a PhD-Intern 📣📣 Work with me on Dynamic 3D Gaussians at the Meta Boston office for 6 months in summer 2025! Apply here: metacareers.com/jobs/105497412… + write me your questions / link your most relevant work via email or twitter.
Quoted tweet (Jonathon Luiten @JonathonLuiten):
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis dynamic3dgaussians.github.io We model the world as a set of 3D Gaussians that move & rotate over time. This extends Gaussian Splatting to dynamic scenes, with accurate novel-view synthesis and dense 3D trajectories.

Ali Athar retweeted
Karim Knaebel @karimknaebel:
Check out our work on fine-tuning image-conditional diffusion models for depth and normal estimation. Widely used diffusion models can be improved with single-step inference and task-specific fine-tuning, allowing us to gain better accuracy while being 200x faster! ⚡ 🧵 (1/6)
István Sárándi @Istvan_Sarandi:
It's been a pleasure to work on this with @GerardPonsMoll1. I think we found a really effective formulation for training strong, large-scale pose and shape models. Here are some more qualitative results on tough, in-the-wild YouTube dance videos.
Quoted tweet (Gerard Pons-Moll @GerardPonsMoll1):

For 3D pose, some use different keypoint sets, others SMPL and other body models. It's a mess! With Neural Localizer Fields, we can choose the output format at test time, allowing training on any of them. Results are real-time and SOTA across the board. arxiv.org/pdf/2407.07532 @Istvan_Sarandi

Ali Athar @aliathar94:
@gabriberton This is already the case for some codebases I've recently worked with: the image means and stds are saved as buffers with the model checkpoint and applied inside the forward pass.
Gabriele Berton @gabriberton:
I wish CV models took non-normalized images as input and normalization were part of forward(). It would avoid silent normalization bugs that are hard to detect, because we never expect normalization to cause bugs (and >90% of the time we use the ImageNet mean/std).
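The pattern discussed in this thread could be sketched as follows. This is a minimal hypothetical example (the `NormalizedBackbone` class and the tiny conv layer are illustrative, not from any codebase mentioned here): the ImageNet mean/std are registered as buffers, so they are saved in the checkpoint and applied inside `forward()`.

```python
import torch
import torch.nn as nn

class NormalizedBackbone(nn.Module):
    """Accepts raw [0, 1] images; normalization lives inside the model."""

    def __init__(self):
        super().__init__()
        # register_buffer: stored in the state_dict and moved by .to(device),
        # but not a trainable parameter.
        self.register_buffer("mean", torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1))
        self.register_buffer("std", torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1))
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)

    def forward(self, x):
        # Normalization is part of the forward pass, so it cannot be
        # silently forgotten or applied twice in a data pipeline.
        x = (x - self.mean) / self.std
        return self.conv(x)

model = NormalizedBackbone()
out = model(torch.rand(2, 3, 32, 32))
```

Because the mean/std are buffers, `model.state_dict()` carries them alongside the weights, which matches the checkpointing behavior described in the reply above.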
Ali Athar @aliathar94:
@gabriberton That's a big qualifier IMO. The computation graph will be largely shared if the inputs pass through a common encoder/backbone network, which is often the case. If the graph is only partially shared, is PyTorch smart enough to release only the part not needed by the other losses?
Gabriele Berton @gabriberton:
Limitations: this only works when you have more than one loss (with disentangled computation graphs). Bonus: the more losses you have, the more memory you'll save.
Gabriele Berton @gabriberton:
This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of summing the losses and then calling backward(), compute backward() on each loss separately (which frees its computational graph). Results will be exactly identical.
Jehanzeb Mirza @jmie_mirza:
09.04.2024: I managed to defend my PhD thesis, titled 'Unsupervised Adaptation to Distribution Shifts'. Extremely thankful to Prof. Horst Bischof and Prof. @SergeBelongie for making the trip to Graz and agreeing to serve on the committee. @BelongieLab
Ali Athar @aliathar94:
As one journey ends, another begins! For the next phase in life, I've moved to the Bay Area and taken up a Research Scientist position at @BytedanceTalk, where I'll continue to work on exciting research problems related to video understanding.
Ali Athar @aliathar94:
Successfully defended my PhD at the @RWTHVisionLab! I'm thankful to my supervisor, colleagues, family, and to God for this incredible 5-year experience. Aside from the professional and research experience, I'll cherish the personal bonds I made here for a long time to come.