Iman Hosseini

1.6K posts

@iman2_718

https://t.co/EoME10h3rV

London, England · Joined May 2016
2.1K Following · 472 Followers
Lisan al Gaib @scaling01
just read the Titans and Nested Learning paper. Is this the figure that killed Hope?
5
5
98
13.3K
Ali Behrouz @behrouz_ali
Thank you very much. I was just trying to say that we tried to show both the good sides and the challenges in the paper, and that a model with good continual learning results might still not use its context perfectly. Honestly, we have had multiple follow-up works in this direction, like Atlas, Miras, ... (each targeting orthogonal aspects/challenges of the models), and hopefully more to share in the future 😀 Also, I remember that there was a PR to the claude-mem repo for Titans showing improvements. I think each model has a lot to improve and build upon, and these models are no exception. BTW, LLMs aside, there have been some public efforts to adapt Titans to different data modalities and tasks (e.g., video, EEG, remote sensing, ...), and hopefully more to come. Hope was also published only about 3-4 months ago, and there are already some papers using it.
1
0
17
545
Iman Hosseini @iman2_718
A while ago I was looking at a problem and thought, man, I wish I could ask @jonmasters. Then I realized I literally can. We are colleagues :)
0
0
4
1.3K
Iman Hosseini retweeted
NetBlocks @netblocks
⚠️ Update: #Iran has now been offline for 96 hours, limiting reporting and accountability over civilian deaths as Iranians protest and demand change; fixed-line internet, mobile data and calls are disabled, while other communication means are also increasingly being targeted ⌛️
288
2.1K
3.7K
340.6K
Egor Konovalov @foldll
built kernelscope, a CUDA kernel debugger that maps source lines to PTX + GPU events. Runs entirely on @modal btw. Analyzes warp state, memory coalescing, warp divergence, and SM occupancy, with hints for perf. More info & pics in next post. @charles_irl @can this is my job application part 2
13
18
351
24.5K
Akshay Kothari @akothari
@robhoeij Pause stops the conversation; what you really need is a mute button that allows Gemini to keep talking. Very important in a public setting.
3
0
38
4K
Akshay Kothari @akothari
Gemini's deep integration with Google products is quite impressive. For example: I was in Italy last week and tried a simple query (make me a day trip in Florence from the train station) in both ChatGPT and Gemini. The actual plans were somewhat similar, but the Google Maps integration in Gemini was incredibly handy. Take a look at the end of this conversation: gemini.google.com/share/4516b0af…
I actually followed that Google Maps loop for the rest of the day, and became a daily user right after that. I found Gemini to be a lot more accurate on travel/local info.
Things I still miss from ChatGPT:
- speed of q&a (chat feels way snappier)
- polish of iOS app (Gemini still has some rough edges)
- ability to mute my mic in voice mode (this feels like a surprising miss in Gemini)
38
35
716
135.5K
Kuter Dinel @KuterDinel
Quick life update. Moved to California to work at NVIDIA. Oh, I have so much to learn.
155
62
3.1K
261.7K
Iman Hosseini retweeted
Ali Behrouz @behrouz_ali
Excited to announce our work on Nested Learning, which was also recently accepted to NeurIPS 2025! Stay tuned for the full version on arXiv (in the next few days), and then I'll discuss more details about the intuition behind its design and why we believe it can help with continual learning!
Google Research @GoogleResearch

Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: goo.gle/47LJrzI @GoogleAI

24
56
699
90.8K
Chris Lattner @clattner_llvm
Thank you to folks at @metaai for publishing their independent perf analysis comparing CUDA and Mojo against Triton and TileLang DSLs, showing Mojo meeting and beating CUDA, and leaving DSLs in the dust.
27
79
691
137.8K
Iman Hosseini retweeted
Adam Paszke @apaszke
Want to improve GPU compute/comms overlap? We just published a new short tutorial for you! A few small changes to the Pallas:MGPU matmul kernel are all it takes to turn it into an all-gather collective matmul that overlaps NVLink comms with local compute: docs.jax.dev/en/latest/pall…
8
43
301
32.7K
Iman Hosseini retweeted
Adam Paszke @apaszke
Curious how to write SOTA-performance Blackwell matmul kernels using MGPU? We just published a short step-by-step tutorial: docs.jax.dev/en/latest/pall… At each step, we show exactly what (small) changes are necessary to refine the kernel, and the final kernel is just under 150 lines.
4
67
418
54.4K