Alex Shtoff

11.6K posts

@AlexShtf

Ph.D. Principal Scientist @ TII. Ex @YahooResearch. I do machine learning ∩ numerical methods ∩ SW dev. Author of https://t.co/MkW8DDKamf

Israel · Joined August 2012

288 Following · 1.4K Followers

Pinned Tweet

Alex Shtoff@AlexShtf·Jul 1

New post in my "Eigenvalues as models" series. The series explores a simple but weird predictive model: build a learned symmetric matrix from the input features, then use one of its eigenvalues as the non-linearity. This time I look at convergence speed during training, which, as we saw in previous posts, can occasionally be slow, and it turns out to be related to the expressiveness of the model. The middle eigenvalue is expressive, but when nearby eigenvalues collide, the eigenvectors can change abruptly. Since eigenvalue gradients come from these eigenvectors, training gets jumpy. Somewhat unexpectedly, a path forward is to first smooth the sharp eigenvalue objective using a tool that is well known to the optimization community but little known to many machine-learning practitioners - the Moreau envelope. alexshtf.github.io/2026/07/01/Spe…

4.2K

Alex Shtoff@AlexShtf·1d

@cremieuxrecueil Or, perhaps, the cause and effect are different - men who earn more tend to get married?

Crémieux@cremieuxrecueil·2d

The gender wage gap is mostly about married men doing an incredible job earning more than everybody else.

524

1.2K

17.6K

Alex Shtoff@AlexShtf·1d

@Bondisrael_ I would say that men who start earning more at a younger age are the ones who get married, and those who tend to earn less stay single.

Bondisrael@Bondisrael_·2d

Quite surprising. The group that earns the most, and by a wide margin, is married men. No arguing with the facts. Indeed. Very interesting. Is there an explanation?

Crémieux@cremieuxrecueil

The gender wage gap is mostly about married men doing an incredible job earning more than everybody else.

12.3K

Alex Shtoff@AlexShtf·2d

@gausts_pgs @testinprodcap I think it would be stupid to leave OpenAI to just do more of the same.

gausts (s/acc)@gausts_pgs·2d

@AlexShtf @testinprodcap because it would be stupid not to. chances are we get rolling self-attention and another encoder gets bolted on somewhere. why scrap something that almost works?

奶奶 capital@testinprodcap·3d

memory is going to 0. they invented log attention.

Ilya Sutskever@ilyasut

Time to scale that SSI:

1.5K

310.8K

Alex Shtoff@AlexShtf·2d

@jm_alexia What would be a reasonable time to "do research" until a release?

1.1K

Alexia Jolicoeur-Martineau@jm_alexia·2d

I hate to say this but a lab that hasn't released a single thing in 2 years while in stealth is not a good sign. I hope we'll see cool things from them soon since they got so much funding now.

Chris@ChrisGPT

You have to give it to Ilya and SSI
> no leaks for 2 years
> quiet as a mouse
> no updates
> July 27th - “We reached the point where our research is worth scaling”
The most minimalist AI lab to grace the frontier

24.2K

Alex Shtoff@AlexShtf·3d

@tunguz He scales research :)

572

Bojan Tunguz@tunguz·3d

What did Ilya scale?

Ilya Sutskever@ilyasut

Time to scale that SSI:

330

28.9K

Alex Shtoff@AlexShtf·3d

@willccbb Either that, or they want more research (e.g., more experiments, faster feedback for experiments).

228

will brown@willccbb·3d

the age of research is over. now it’s the age of scaling

SSI Inc.@ssi

We are announcing a long-term strategic partnership with NVIDIA. NVIDIA is making a substantial investment in SSI that will let us 10x our compute in the next 12 months. We reached the point where our research is worth scaling and with this partnership we will be able to. We are honored by NVIDIA’s conviction.

1.6K

101.9K

Alex Shtoff@AlexShtf·3d

@CauraAI I know what they mean. I asked because they contradict your claim about a model being first or third.

Caura@CauraAI·3d

@AlexShtf They're there to show the gaps that actually separate models from the ones that are noise — where intervals overlap, we'd treat the ranking as a tie rather than a result.

1.4K

Caura@CauraAI·5d

Five frontier models, 100 questions, judged blind — by each other. Opus 5 wins at 8.87. Fable 5 takes third without being beaten: its own guardrail handed in 4 blanks — two were high-school biology. The numbers, the blanks, the self-bias ↓

1.1K

2.5M

Alex Shtoff@AlexShtf·4d

It appears agentic systems are going backward. First we had function calling. Now we have loops. Will we soon have `goto`?

143

Alex Shtoff@AlexShtf·4d

@MattLutzPhi If math was easy, we wouldn't need so much compute for AI.

183

Matthew Lutz@MattLutzPhi·4d

Beginning to suspect that math just isn't very hard.

420

32.1K

Alex Shtoff@AlexShtf·4d

@Mononofu @JensenHuang Are you aware of the VAST amount of open-source software and open-weights models that came from NVIDIA?

140

Julian Schrittwieser@Mononofu·5d

I’m so excited that @JensenHuang is a believer in open source now, looking forward to the CUDA and GPU driver open source release!

Jensen Huang@JensenHuang

For my first post, I’m sharing a letter @NVIDIA signed on why open models matter. AI will transform every industry, power every company, and be built by every country. Open models strengthen safety and cybersecurity, accelerate innovation and diffusion, and enable sovereignty. The world needs both frontier closed models and frontier open models. images.nvidia.com/pdf/Open-Weigh…

1.7K

352

5.4K

6.5M

Alex Shtoff@AlexShtf·4d

So, why doesn't X on Android work properly with folded phones and rotated screens? What a mess.

172

Alex Shtoff@AlexShtf·5d

@NadavFinebooch @Hak_Tsu @KseniaSvetlova I don't get it. Did America take over Putin's brain and make him invade?

120

Nadav Finebuch@NadavFinebooch·5d

@Hak_Tsu @KseniaSvetlova Contrary to the narrative we were told, and which I believed, America started the war in Ukraine, not Putin. By the way, are you aware of the fact that the corrupt actor, Zelensky, is basically a little Yair Netanyahu, running wild with men, snorting while he robs the country he destroyed?

239

Ksenia Svetlova كسنيا سفطلوفا@KseniaSvetlova·5d

Valentina (Valya) Savitsky, the wife of Igor Solovey, a dear Ukrainian friend, was killed last night by a Russian ballistic missile strike. I met her about two years ago when they visited Israel. We talked about our wars, and about the need to cooperate (her husband, Igor, coordinated the government's activity against Russian disinformation). The war in Ukraine has been going on for more than four years, and it exacts a heavy price from Ukrainians, day after day. This time it is someone I was privileged to know. May her memory be blessed.

13K

Alex Shtoff@AlexShtf·5d

@DamienTeney @andrewgwils I believe these come mostly from intra-community papers.

Damien Teney@DamienTeney·5d

@andrewgwils I suspect there are also a lot of crap submissions where the authors are unknowingly reinventing the wheel and/or failed to connect their ideas with the existing literature.

369

Andrew Gordon Wilson@andrewgwils·5d

And these papers that do not belong to a community, which get the worst treatment in review, are the ones we most need.

Lénaïc Chizat@LenaicChizat

Always stunning to see how much the quality of NeurIPS reviews depends on the subcommunity a paper is sent to. But the worst fate is for papers that do not clearly belong to any community: they end up torn apart in a no-man’s-land of confident misunderstanding and bitterness.

12.1K

Alex Shtoff@AlexShtf·5d

@unclebobmartin @ori_pomerantz Who writes the unit tests?

Uncle Bob Martin@unclebobmartin·Jul 23

I’m significantly older than you. I started coding in the late 60s. My current strategy is to not read any of the code written by my agents. That’s the only way I can take advantage of their productivity. What I do instead is to surround the agents with extreme constraints. Unit tests, gherkin tests, QA procedures, quality metrics, mutation testing, test coverage, and a plethora of others. In the end, I have very high confidence in the code they produce because they’ve had to run the gauntlet of all of my constraints and tests.

564

1.9K

18.2K

4.9M

Ori Pomerantz@ori_pomerantz·Jul 22

I am trying to use Claude to help me write something, but I just don't feel comfortable letting it edit my files. Does anybody else feel the same? If I am responsible for code, I NEED to understand it, psychologically if for no other reason. Started programming in 1983. Old?

234

636

186K

Alex Shtoff@AlexShtf·5d

The main problem is that AI is essentially software, and software is hackable. Of course, there's also the issue of non-standardness. For example, a paper whose objective is presenting a widely applicable mathematical object, rather than the standard problem --> solution --> evidence (empirical/theoretical) structure, would probably be simply rejected. I got a few such non-standard papers as a reviewer, and I don't see how AI would perform well on them.

254

Peter Richtarik@peter_richtarik·5d

Every single review I have handled at NeurIPS as an AC is **much worse** than a well-executed AI review. By "much worse" I do not mean 20% or 50%. I mean a factor of about 10-100. At least
- 10x more mathematical issues are caught,
- 10x more missing key citations are caught,
- 10x more novelty claims are invalidated with evidence,
- 10x more issues with the experiments are observed,
- 10x more typos and grammatical issues are fixed,
- 10x more inconsistencies are found,
and so on.

In many cases I have handled over the last 10 years, the human reviews are so bad in comparison that the improvement factor is closer to 100. (The best human reviews I received over the years are worse than an AI review I can generate today, but still entirely good enough to make the correct decision. On the other hand, an AI review is often an almost comprehensive summary of all key issues --- something a human almost never has time to deliver --- and such feedback is immensely useful to the authors)

At this point, we would be much better off to just
1) give one very thorough ChatGPT review to all submitted papers (I am mainly talking about my fields - optimization, AI, machine learning), automatically, and
2) keep asking for a revision or two (to keep the duration of the process within some bounds) until the number of issues decreases to an extent when the AI reviewer is satisfied, in a given time-frame (e.g., 1 month).
3) At that point, a human AC can make a decision (to keep an eye on this all should anything go wrong).

The role of the AC would be merely to observe and manage the process, and make a final decision, based on the trajectory of the revisions.

I can't believe I am saying this -- AI reviews were a nonsense idea even a year ago. The current AI reviews are super-human.

308

131.1K

Alex Shtoff@AlexShtf·6d

@thsottiaux NO