A.G

236 posts

@AyGhriTweets

ML Research https://t.co/klG0EgFJSm

Boulder, CO · Joined January 2026
33 Following · 9 Followers
A.G retweeted
can @can
your db query is slow? just add this! boom, now you are ai-native instead of lazy!
[image]
A.G @AyGhriTweets
@bremen79 Consider zero-shot pruning or quantization of a neural network: the objective has the form tr((W-Z)^T H (W-Z)), so the problem is close to convex, but the constraints are non-convex (binary entries in W or projection onto discrete values). W is usually a large matrix, so an online approach might help.
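A minimal sketch of the objective mentioned in the tweet above, in plain Python. It assumes Z holds the original dense weights, W the constrained (quantized) weights, and H a symmetric positive semidefinite matrix; the tweet does not define these symbols, so that reading is an assumption. The helper names layer_objective and project_to_levels are hypothetical. Element-wise rounding is used here only to produce a feasible W; it minimizes the objective only when H is diagonal with non-negative entries.

```python
def layer_objective(W, Z, H):
    """Return tr((W - Z)^T H (W - Z)) for matrices given as lists of rows.

    W and Z are (rows x cols); H is (rows x rows). All names are illustrative.
    """
    rows, cols = len(W), len(W[0])
    D = [[W[i][k] - Z[i][k] for k in range(cols)] for i in range(rows)]
    # The k-th diagonal entry of D^T H D is sum_i sum_j D[i][k] * H[i][j] * D[j][k].
    return sum(
        D[i][k] * H[i][j] * D[j][k]
        for i in range(rows)
        for j in range(rows)
        for k in range(cols)
    )


def project_to_levels(Z, levels):
    """Round every entry of Z to the nearest value in `levels`.

    The set of matrices with entries in `levels` is the non-convex constraint:
    the average of two feasible matrices is generally not feasible.
    """
    return [[min(levels, key=lambda q: abs(q - z)) for z in row] for row in Z]


# Small example with H set to the identity (an assumption made for simplicity).
Z_example = [[0.2, 0.9], [-0.6, 0.4]]
H_example = [[1.0, 0.0], [0.0, 1.0]]
W_example = project_to_levels(Z_example, [-1.0, 0.0, 1.0])
error_example = layer_objective(W_example, Z_example, H_example)
```

With a non-diagonal H the rows of W interact through the cross terms, which is why rounding each entry independently stops being optimal and the problem becomes a genuine discrete optimization.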
Francesco Orabona @bremen79
My personal view is that this result is mainly interesting from a theoretical standpoint. To be fair, I feel the same about much of the current research on optimization for deep learning: the gap between the assumptions and the practical problem is often too large to draw any real conclusion. The better approach would be to first construct a model that captures the structure of the real problem. Only then does solving that model become practically useful. It strikes me as designing an airplane assuming it to be spherical: It might still result in interesting theory, but unlikely to be practical.
Francesco Orabona @bremen79
New blog post: "From Online Learning to Non-convex Non-smooth optimization". This is the last post in my series to show that online learning is more than just online learning. This reduction is surprising, but the proof is simple 🙂 Feedback is welcome! parameterfree.com/2026/04/06/fro…
A.G @AyGhriTweets
Now the Arab peninsula countries are gonna become poker chips: tossed around all day and all week while gaining nothing.
Will Schryver @imetatronink
🤔🌮 Color me dubious. This smells like another TACO.
[2 images]
A.G @AyGhriTweets
At this point, the most unlikely outcome on @Polymarket is the most likely one in US politics.
Stuart Hameroff @StuartHameroff
Anirban’s new book, perused by @JosephJacks_. ‘Sillicon’ is an apparent typo (for ‘Sillycon’) and will revert to Silicon. Over 100 hand-drawn illustrations by Anirban.
A.G @AyGhriTweets
@JamesTate121 Abu Musa narrated: "I visited the Prophet ﷺ with two men. One of them said, 'O Messenger of Allah, appoint us to a position of leadership,' and the other followed suit. The Prophet replied: 'Truly, we do not entrust this office to those who seek it, nor to those who crave it.'"
James Tate @JamesTate121
Like Douglas Adams said, "Anyone who wants the job is automatically ineligible for it".
[image]
A.G @AyGhriTweets
@bozavlado I got that. In this case (attention), the sqrt(d) scaling can change the function you target from convex to non-convex. It doesn't only affect the conditioning of the problem, but its convexity as well.
Vlado Boza @bozavlado
@AyGhriTweets It should not matter whether you do QK/sqrt(d) or do QK with Q and K having a much smaller init. The training dynamics should be the same, but with current optimizers they are not. And a better optimizer should pull you out of a sloppy init too.
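A small sketch of the reparameterization discussed in the two replies above, in plain Python. It assumes a single query/key pair of dimension d and compares the logit q·k/sqrt(d) with an unscaled logit whose q and k were both shrunk by d^(-1/4); the function names are hypothetical. The two forms give the same logit, but the gradient with respect to the query differs by a factor of d^(1/4), which is one reason plain gradient steps behave differently under the two parameterizations. The sketch does not test the convexity claim.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def logit_scaled(q, k):
    """Attention logit with the explicit 1/sqrt(d) factor."""
    return dot(q, k) / math.sqrt(len(q))


def shrink(v):
    """Scale a vector by d**-0.25, standing in for a 'much smaller init'."""
    s = len(v) ** -0.25
    return [s * x for x in v]


def logit_unscaled(q_small, k_small):
    """Attention logit with no explicit scaling; inputs are assumed pre-shrunk."""
    return dot(q_small, k_small)


def grad_q_scaled(k):
    """Gradient of logit_scaled with respect to q."""
    return [x / math.sqrt(len(k)) for x in k]


def grad_q_unscaled(k_small):
    """Gradient of logit_unscaled with respect to q_small."""
    return list(k_small)


# Illustrative values only.
q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 1.0]
same_logit = math.isclose(logit_scaled(q, k), logit_unscaled(shrink(q), shrink(k)))
grad_ratio = grad_q_unscaled(shrink(k))[0] / grad_q_scaled(k)[0]  # equals d**0.25
```

Because the forward pass is identical while the gradients are rescaled, an optimizer that is not invariant to this rescaling follows a different trajectory in the two setups.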
A.G @AyGhriTweets
@wildiris19 @amahury0 The point I'm trying to make is that when dealing with highly technical and computational claims, one needs to formulate concepts and steps rigorously. Hand-wavy arguments lead nowhere other than an endless loop of unsubstantiated claims.
A.G @AyGhriTweets
@wildiris19 @amahury0 I see many leaps. Let's start with the basics: the laws of physics are purely descriptive and don't explain anything. Have you seen a "proof" of the law of gravity? We know it correlates perfectly with observed phenomena (so far), but that explains nothing. Subsequent claims fail accordingly.
A.G retweeted
nostalgia @nostalgicfile
Never Skip Thai Ads 😭
A.G @AyGhriTweets
@wildiris19 @amahury0 "The computational theory of finite state automata is all you need to do everything that any living system can do." Is there a proof for this?
wildiris @wildiris19
No, I meant Turing complete. I’m completely frustrated with constant references to “Turing completeness.” You are correct, no finite system can be Turing complete. Which means it’s a completely useless metric when approaching any finite computational system; whether biological or digital. I only wish people would just stop using the word and move on. Every computational system that can be physically built will be nothing more than finite state machines, random-access memory elements, and sufficient combinational logic to glue it all together. The computational theory of finite state automata is all you need to do everything that any living system can do. Sigh!
A.G retweeted
Tochka__26 @Tochka__26
Me trying to ignore the negativity