Saba

156 posts

Saba banner
Saba
@Saba_A96

MSc @Mila_Quebec and @UMontrealDIRO

Joined November 2020
185 Following · 141 Followers
Saba retweeted
Kanishk Jain @kanji1011
Excited to share our new paper: "Discovering Failure Modes of Vision-Language Models using Reinforcement Learning"
2 replies · 11 retweets · 50 likes · 11.6K views
Saba retweeted
Ahmad Beirami @abeirami
- Iran is in a humanitarian crisis.
- Thousands are reported dead in 72 hours.
- We are past the point of solidarity. Empty words do not stop bullets. Action does.
- The world must intervene now.
1 reply · 249 retweets · 1.2K likes · 28.1K views
Saba retweeted
J.K. Rowling @jk_rowling
If you claim to support human rights yet can’t bring yourself to show solidarity with those fighting for their liberty in Iran, you’ve revealed yourself. You don’t give a damn about people being oppressed and brutalised so long as it’s being done by the enemies of your enemies.
J.K. Rowling tweet media
19.7K replies · 91.8K retweets · 386.9K likes · 15.9M views
Saba retweeted
Siva Reddy @sivareddyg
McGill University (@mcgillu) has many open faculty and postdoctoral positions with generous funding packages, thanks to Impact+ grants, which are investing $2 billion to attract global talent to Canada 🇨🇦🇨🇦🇨🇦.

Associate/Full Professor: $8 million startup package
Assistant Professor: $600K startup package
Postdoc: $70K (starting salary)

If you are interested and work in the space of AI/ML/NLP/LLMs, please reach out to me. #AI #NLProc #ML
Siva Reddy tweet media
45 replies · 296 retweets · 1.4K likes · 195K views
Saba retweeted
Yoshua Bengio @Yoshua_Bengio
OpenReview is a pillar of progress in the AI research community. Now it needs our support. Along with several of my colleagues, I have pledged to help, and I encourage anyone who can to do the same. openreview.net/donate
23 replies · 47 retweets · 355 likes · 61K views
Saba retweeted
Oscar Mañas @oscmansan
📣 Hiring Research Interns for Meta Superintelligence Labs in Zurich! Work on large-scale generative models (image/video gen, multimodal, world models) with real impact on products used by billions.

📍 Zurich | 🕒 6 months | 🎓 PhD students

metacareers.com/profile/job_de…
4 replies · 27 retweets · 290 likes · 23.7K views
Saba retweeted
Hugo Larochelle @hugo_larochelle
Really enjoying the smaller scale and more intimate experience of NeurIPS in Mexico City, along with my @Mila_Quebec crew! Including dinners that feature guacamole with crispy larva :-)
Hugo Larochelle tweet media
4 replies · 11 retweets · 206 likes · 21.5K views
Saba retweeted
Imade. @ImadeIyamu
OpenAI's Research Residency Program just opened (relocation assistance is available).

6-month program designed to identify, mentor, and develop exceptional individuals.

Compensation: $18,300 per month

openai.com/careers/reside…
66 replies · 296 retweets · 2.5K likes · 471.9K views
Saba retweeted
Mila - Institut québécois d'IA
Alongside @NeurIPSConf in San Diego, the satellite conference NeurIPS Mexico City is taking place, with several Mila student-researchers taking part. Two of them presented their research today.

Sahar Dastani (@sonia_dt98), PhD student at ETS/Mila, presented “TRUST: Test-Time Refinement using Uncertainty-Guided SSM Traverses,” and Saba Ahmadi (@Saba_A96), affiliated researcher at UdeM/Mila, presented “The Promise of RL for Autoregressive Image Editing.” Congratulations!
Mila - Institut québécois d'IA tweet media
0 replies · 18 retweets · 55 likes · 6.3K views
Saba retweeted
Sai Rajeswar @RajeswarSai
At NeurIPS in San Diego from today! Great to catch up with old friends, and happy to chat about multimodal reasoning and interactive RL environments. Some of us from the Apriel-15B-Thinker team will be around in person as well for ideas and feedback. Please come chat with us! @SathwikTejaswi @sagardavasam @Vikas_NLP_UA
0 replies · 2 retweets · 14 likes · 868 views
Saba retweeted
Arian Hosseini @arianTBD
Our team at GDM is hiring a Student Researcher (SR) next year 🧠 If you’re a PhD student working on LLMs please apply. I’d love to hear from you. Please fill out this form: forms.gle/bxTEkrDPacn6jS…
4 replies · 29 retweets · 281 likes · 65.1K views
Saba retweeted
Siva Reddy @sivareddyg
Honored to receive the Computer Science Canada Outstanding Early Career Researcher award 🏅. It is a recognition of the work carried out by my students and their courage to push fundamental ideas in natural language processing even in the era of LLMs.

Thanks to my mentors and nominators for making time in their incredibly busy schedules. And thanks to my colleagues at Mila, McGill and ServiceNow for fostering an intellectually stimulating environment and providing the resources to succeed!
Mila - Institut québécois d'IA @Mila_Quebec

Congratulations to Siva Reddy (@sivareddyg), Core Academic Member at Mila, who has received the prestigious Outstanding Early Career Computer Science Researcher Award from @CSCan_InfoCan, the leading organization for the computer science community in Canada. mila.quebec/en/news/siva-r…

14 replies · 11 retweets · 124 likes · 13.9K views
Saba retweeted
Aarash Feizi @ ICLR 🇧🇷 @aarashfeizi
🚀 Announcing GroundCUA, a high-quality dataset for grounding computer-use agents. With over 3M expert annotations spanning 87 desktop apps, we use our new dataset to train state-of-the-art grounding models, namely GroundNext-3B and GroundNext-7B. 👇 Thread
5 replies · 31 retweets · 89 likes · 22.3K views
Saba retweeted
Kyle Lo @kylelostat
why intern at Ai2?

🐟 interns own major parts of our model development, sometimes even leading whole projects
🐡 we're committed to open science & actively help our interns publish their work

reach out if u wanna build open language models together 🤝 links👇
Kyle Lo tweet media
13 replies · 47 retweets · 697 likes · 79.4K views
Saba retweeted
Amirhossein Kazemnejad @a_kazemnejad
After nearly 3 years since our NeurIPS paper, SOTA architectures are now adopting NoPE. Kimi Linear uses NoPE for all full-attention layers (not a RoPE hybrid).
Rohan Paul @rohanpaul_ai

The brilliant Kimi Linear paper. It's a hybrid attention that beats full attention while cutting the key-value cache by up to 75% and delivering up to 6x faster decoding at 1M-token context.

Full attention is slow because it compares every token with every other token and stores all past keys and values. Kimi Linear speeds this up by keeping a small fixed memory per head and updating it step by step like a running summary, so compute and memory stop growing with length.

Their new Kimi Delta Attention (KDA) adds a per-channel forget gate, which means each feature can separately decide what to keep and what to fade, so useful details remain and clutter goes away. They also add a tiny corrective update on every step, which nudges the memory toward the right mapping between keys and values instead of just piling on more data.

The model stacks 3 of these fast KDA layers then 1 full-attention layer, so it still gets occasional global mixing while cutting the key-value cache roughly by 75%. Full-attention layers run with no positional encoding, and KDA learns order and recency itself, which simplifies the stack and helps at long ranges.

Under the hood, a chunkwise algorithm plus a constrained diagonal-plus-low-rank design removes unstable divisions and drops several big matrix multiplies, so the kernels run much faster on GPUs.

With the same training setup, it scores higher on common tests, long-context retrieval, and math reinforcement learning, while staying fast even at 1M tokens. It drops into existing systems, saves memory, scales to 1M tokens, and improves accuracy without serving changes.

----
Paper: arxiv.org/abs/2510.26692
Paper title: "Kimi Linear: An Expressive, Efficient Attention Architecture"

7 replies · 34 retweets · 366 likes · 52K views
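The per-channel forget gate plus corrective (delta-rule) update described in the quoted thread can be sketched in a few lines. This is a minimal illustrative recurrence, not Kimi's actual kernel: the gate and learning-rate parameterizations, the chunkwise algorithm, and the diagonal-plus-low-rank structure are all simplified away, and the function name is mine.

```python
import numpy as np

def kda_step(S, q, k, v, alpha, beta):
    """One recurrent step of a per-channel gated delta-rule memory.

    S:     (d_k, d_v) fast-weight state (the fixed-size "running summary")
    q, k:  (d_k,) query and key;  v: (d_v,) value
    alpha: (d_k,) per-channel forget gate in [0, 1] -- each key feature
           decays independently, instead of one scalar decay per head
    beta:  scalar write strength (the corrective learning rate)
    """
    S = alpha[:, None] * S                 # per-channel decay of the memory
    pred = S.T @ k                         # what the memory currently returns for k
    S = S + beta * np.outer(k, v - pred)   # delta rule: correct toward the mapping k -> v
    o = S.T @ q                            # read out with the query
    return S, o
```

With `beta = 1` and a unit-norm key, a single step writes the pair exactly (afterwards `S.T @ k` equals `v`), which is the "nudge toward the right mapping" behavior the thread contrasts with simply piling on `S += outer(k, v)`.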
Saba retweeted
Rohan Paul @rohanpaul_ai
Stanford just published a huge 470-page study 📕 "The Principles of Diffusion Models"

Explains how diffusion models turn noise into data and ties their main ideas together. It starts from a forward process that adds noise over time, then learns the exact reverse. The reverse uses a time-dependent velocity field that tells how to move a sample at each step. Sampling becomes solving a time-based equation that carries noise to data along a trajectory.

There are 3 views of this idea, variational, score-based, and flow-based, and they describe the same thing. There are also 4 training targets, noise, clean data, score, and velocity, and these are equivalent.

The study also:
- Shows how guidance can steer outputs using a prompt or label without extra classifiers.
- Reviews fast solvers that cut steps while keeping quality stable.
- Explains distillation methods that shrink many sampling steps into a few by mimicking a teacher model.
- Introduces flow map models that learn direct jumps between times for fast generation from scratch.
Rohan Paul tweet media
14 replies · 186 retweets · 1.1K likes · 107.6K views
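The forward process and the equivalence of the training targets mentioned above can be made concrete with the standard variance-preserving parameterization. A small sketch in my own notation (not taken from the study): with cumulative signal level ᾱ_t, the forward marginal is x_t = √ᾱ_t·x₀ + √(1−ᾱ_t)·ε, and the velocity target is v = √ᾱ_t·ε − √(1−ᾱ_t)·x₀; from (x_t, v) one can recover both x₀ and ε in closed form, which is one reason predicting noise, clean data, or velocity amount to the same thing.

```python
import numpy as np

def forward_noise(x0, eps, abar_t):
    """Variance-preserving forward marginal q(x_t | x_0):
    one closed-form jump from clean data x0 to noise level t."""
    a, b = np.sqrt(abar_t), np.sqrt(1.0 - abar_t)
    return a * x0 + b * eps

def velocity_target(x0, eps, abar_t):
    """Velocity training target for the same (x0, eps, t) triple."""
    a, b = np.sqrt(abar_t), np.sqrt(1.0 - abar_t)
    return a * eps - b * x0

def x0_from_velocity(xt, v, abar_t):
    """Invert: recover clean data from (x_t, v), since a^2 + b^2 = 1."""
    a, b = np.sqrt(abar_t), np.sqrt(1.0 - abar_t)
    return a * xt - b * v
```

The noise is likewise recoverable as ε = √(1−ᾱ_t)·x_t + √ᾱ_t·v, so a model trained on any one of these targets implicitly predicts the others.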
Saba
Saba @Saba_A96
Delighted to share that my supervisor @aagrawalAA has been awarded the 2025 Mark Everingham Prize, one of the most prestigious honors in the field! Looking forward to seeing her work continue to inspire. 💫🎉
Aishwarya Agrawal @aagrawalAA

I am quite excited to share that our efforts in organizing and running "The VQA series of challenges" have been recognized with the 2025 Mark Everingham Prize -- thecvf.com/?page_id=529 -- for "stimulating a new strand of vision and language research". Thank you to the PAMI TC committee!

I feel quite fortunate to have contributed to the VQA effort. Thank you @deviparikh @DhruvBatra_ for giving me this opportunity and for being awesome mentors! And thanks to my then colleague and now husband @yashgoyal_ for being my pillar of support throughout my research career! Last but not least, thank you @ayshrv for leading the organization of the last 3 VQA challenges! It was fun working with you!

The vision-language landscape has evolved quite a bit since we first published our VQA paper! It's exciting to see how vision-language models are mainstream now! Back in those days, this line of work was a bit niche :) So, quite fortunate to have been a part of this exciting development!

0 replies · 0 retweets · 17 likes · 999 views
Saba retweeted
Oscar Mañas @oscmansan
🌺 Attending @ICCVConference in Honolulu this week! I'll be presenting our work on multimodal reward-guided decoding. Come check it out on October 21 (morning), poster #122. If you’re around, I’d love to connect and chat about multimodal models and real-time video generation!
Oscar Mañas @oscmansan

I’m happy to share that our paper "Controlling Multimodal LLMs via Reward-guided Decoding" has been accepted to #ICCV2025! 🎉 w/ @proceduralia, @koustuvsinha, @adri_romsor, @michal_drozdzal, and @aagrawalAA 🔗 Read more: arxiv.org/abs/2508.11616 🧵 Here's what we did:

0 replies · 6 retweets · 21 likes · 3K views