David Tom Foss

120 posts

@FossDT

researcher

Joined December 2025
60 Following · 6 Followers
David Tom Foss (@FossDT)
Discovered that sqrt-coupling from special relativity is a hidden cumsum. log(1 − s²) makes recurrence additive. State provably in [0,1] by geometry, not clipping. No KV cache, constant memory. T=128: RWKV-4 collapses +24%. GSSM: +5%. Geometric State Space Models.
1
1
1
27
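The post above does not spell out the recurrence, so the following is only a minimal sketch under an assumption: that the update is s_t = sqrt(1 − (1 − s_{t−1}²)(1 − u_t²)) with inputs u_t in [0, 1). Under that assumed rule, log(1 − s_t²) = log(1 − s_{t−1}²) + log(1 − u_t²), so the whole state sequence can be computed as a cumulative sum in log space, and the state stays in [0, 1] without clipping. The update rule, function names, and random inputs are illustrative and are not taken from the author's GSSM code.

```python
import numpy as np

def sqrt_coupling_sequential(u, s0=0.0):
    """Step-by-step recurrence (assumed form, see lead-in)."""
    s = s0
    states = []
    for x in u:
        # Combine previous state and input through the sqrt coupling.
        s = np.sqrt(1.0 - (1.0 - s**2) * (1.0 - x**2))
        states.append(s)
    return np.array(states)

def sqrt_coupling_cumsum(u, s0=0.0):
    """Same recurrence, computed as a cumulative sum of log(1 - u^2)."""
    log_terms = np.log1p(-np.square(u))             # log(1 - u_t^2), each <= 0
    total = np.log1p(-s0**2) + np.cumsum(log_terms)  # log(1 - s_t^2)
    return np.sqrt(-np.expm1(total))                 # s_t = sqrt(1 - exp(total))

# Hypothetical inputs: T = 128 steps, values kept below 1 so the log is finite.
rng = np.random.default_rng(0)
u = rng.uniform(0.0, 0.9, size=128)

seq = sqrt_coupling_sequential(u)
par = sqrt_coupling_cumsum(u)
print(np.allclose(seq, par))
```

Because every log term is non-positive, the running sum never exceeds zero, which is what keeps 1 − exp(total) inside [0, 1] in this sketch. Note that this assumed form has no forgetting term, so the state can only grow; a practical model would presumably add one.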
David Tom Foss (@FossDT)
@realmcore_ It was one of my first projects last year, haha. I let 18+ agents build their own framework at the same time, and it was kind of crazy how early I was with this. Since then, that's been my paradigm: build from the inside out, and ask what "they" need or what's missing. zenodo.org/records/187626…
0
0
0
2
akira (@realmcore_)
@FossDT Are you building your own agent? Or is this for your coding agents in your setup?
1
0
1
12
akira (@realmcore_)
Pro tip for harness engineering from someone who does it for a living: Have you tried the ancient tactic of “just ask the model”? Bad interface? Model failure? Users angry? Just ask the model. Thanks for the read and follow for more amazing harness engineering tips!
3
0
19
1.3K
David Tom Foss (@FossDT)
@heynavtoor If everyone researches in the same way, the results are bound to be poor. AI has democratized research, but if everyone follows the same path, there will be little progress. What’s really interesting, after all, are the results that come about in completely unconventional ways.
0
0
1
659
Matt Lea (@schematical)
@FossDT @EddCoates This is brilliant. I was just talking to a coworker about building something similar.
1
0
1
29
Edd Coates | Game UI Database 2.0
I am so fucking sick of my website getting scraped. Millions of requests per minute, somehow designed to bypass all my security rules, choking the site until it completely stops loading. If I were paying for bandwidth, it would cost me a fortune. How is this still legal?
479
58
2.1K
390K
David Tom Foss (@FossDT)
@Mishra_Arya_ Opus 4.8 is the most retarded model I've ever used. It's not only stupid, it's also lying through its teeth all the time.
1
0
2
91
Aryaman (@Mishra_Arya_)
Holy shit, Claude Opus 4.8 (Extra high) is so retarded for novel theoretical physics. It kept gaslighting me until it realized in the chain-of-thought that it was wrong. GPT Pro just does what you need and doesn't make dumb mistakes.
3
0
14
2.6K
Daniel Smidstrup (@DanielSmidstrup)
I am a founder. Scare me with 1 word.
785
6
332
51.8K
David Tom Foss (@FossDT)
The speed at which everything is evolving in AI is crazy. In theory, you’d have to publish all your results directly in preprints to document prior art. It’s especially painful to read about things you discovered yourself some time ago that are now being published by others.
0
0
0
6
David Tom Foss (@FossDT)
@generic_void Did that: I showed a friend of mine Claude Code. Within days he thought he was a coding god developing THE app that would cash in a safe 20 million dollars. Claude told him so. Big mistake.
1
0
3
13
SMA 🏴‍☠️ (@generic_void)
please teach your non-technical friends how to use claude, claude code, chatgpt, codex, and git. you can liberate people to do anything they dream of by just teaching them how to use these very simple tools.
6
0
18
687
David Tom Foss (@FossDT)
@benhylak this also applies to “deep research.” Instead of using the knowledge from their weights and reasoning, they spawn sub-agents with limited context, then they just gobble up the first 200 sloppy articles they find on Google and use them to churn out a “deep slop” report.
0
0
3
77
ben hylak (@benhylak)
chatgpt is really unusable for travel advice. it reads the worst SEO-slop articles in the world, and spits back garbage. the thing that happened to google a few years ago has now happened to agents
55
8
270
22.7K
ily⚡️ (@0xIlyy)
Guess which model i'm using
179
7
422
85.8K
David Tom Foss (@FossDT)
@Jeyffre Yeah, at this pace it's most likely months, or 1-2 years max, rather than 5 years.
0
0
1
633
Jeffrey Scholz (@Jeyffre)
1 - So GLM 5.2 is 700b parameters (ish)
2 - 4x DGX Sparks can supposedly handle up to 700b parameters (give or take)
3 - GLM 5.2 is supposedly in striking distance of the performance of GPT 5.5 and Opus 4.8. In my brief tests, it's really not shabby at all.
4 - So for $20k, you can get near the frontier on your table.
5 - Extrapolate the trend, and you could have mythos/5.5 pro-class models in your dining room for the cost of a cheap car less than five years from now. Even without extrapolation, we're already near the frontier running locally.
6 - Paying real api costs, I could easily blow through $3,000 per month coding and running agents. The machine pays for itself in 6-7 months conservatively.
7 - In 3-5 years, most power users of AI will self-host.
8 - Am I missing something?
206
74
1.6K
148.3K
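The payback estimate in point 6 of the post above can be checked with a quick calculation. This is a minimal sketch using only the two figures quoted in the post ($20k hardware, $3,000 per month in API spend); the `monthly_running_cost` parameter is a hypothetical addition for things like electricity, which the post does not quantify, and it defaults to zero.

```python
def payback_months(hardware_cost, monthly_api_spend, monthly_running_cost=0.0):
    """Months until avoided API spend covers the hardware purchase."""
    monthly_savings = monthly_api_spend - monthly_running_cost
    if monthly_savings <= 0:
        raise ValueError("self-hosting never pays off at these numbers")
    return hardware_cost / monthly_savings

# Figures quoted in the post; running costs assumed to be zero.
months = payback_months(20_000, 3_000)
print(f"{months:.1f} months")  # 6.7 months
```

With zero running costs the result lands inside the 6-7 month range the post gives; any non-zero running cost lengthens it.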
Leandro von Werra (@lvwerra)
We launched an agent collaboration with a simple task: make Gemma 4 faster. Over 100 agents from all over the world joined, exchanged 1000+ messages and submitted 450 results. A week of collaboration later the throughput went from 100 tok/s to over 500 tok/s.
79
150
1.9K
188K
Louis Arge (@louisvarge)
why do opus-4.8 & gpt-5.5 love to correct things nobody said? makes it impossible to understand what they mean half the time
10
3
160
9.1K
kalomaze (@kalomaze)
i am trying to work on the closest thing possible to a true "big model smell" eval
which is to say: something that measures something that clever post training can't trivially gap, and is cheap + topically diverse
i can't test mythos for obvious reasons, but... hmm...
59
7
549
125K
David Tom Foss reposted
Ivan Fioravanti ᯅ (@ivanfioravanti)
Code2LoRA seems an incredibly interesting idea. Qwen2.5-Coder-1.5B is not the most powerful LLM around, but it's enough to validate the concept. Instead of stuffing repository context into the prompt at every query, distill it into a LoRA adapter. One forward pass over the repo snapshot, one adapter, zero extra inference tokens. For evolving codebases, a single-layer GRU tracks commit history on top of that snapshot. Each git diff updates the hidden state in <10ms. You get a fresh adapter at every commit without needing a full retraining. Great job Liliana! I bet this will lead to something cool in the near future 🙌
Liliana Hotsko (@liliana_hotsko)

How do you give a code LLM knowledge of an entire repository without paying for it at every single query? We introduce Code2LoRA: a hypernetwork that turns a repository into its own LoRA adapter. Repo knowledge baked into weights → zero inference-time token overhead.
11
33
286
24.7K
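The mechanism described in the repost above (a recurrent state updated per git diff, then mapped to a LoRA adapter) can be illustrated roughly as follows. This is a toy sketch, not the Code2LoRA implementation: the class and function names (`TinyGRUCell`, `adapter_from_state`, `apply_lora`), the linear heads `head_A` and `head_B`, all dimensions, and the random weights and diff embeddings are hypothetical stand-ins. It only shows the data flow: one GRU step per commit, then a low-rank update W + (alpha / r) · B A generated from the hidden state.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyGRUCell:
    """Single GRU cell. Weights are random here; a real system would train them."""
    def __init__(self, input_dim, hidden_dim, rng):
        shape = (hidden_dim, input_dim + hidden_dim)
        self.Wz = rng.normal(0.0, 0.1, shape)  # update gate
        self.Wr = rng.normal(0.0, 0.1, shape)  # reset gate
        self.Wh = rng.normal(0.0, 0.1, shape)  # candidate state

    def step(self, x, h):
        z = sigmoid(self.Wz @ np.concatenate([x, h]))
        r = sigmoid(self.Wr @ np.concatenate([x, h]))
        h_candidate = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1.0 - z) * h + z * h_candidate

def adapter_from_state(h, head_A, head_B, rank, d_in, d_out):
    """Hypothetical hypernetwork heads: map the hidden state to LoRA factors."""
    A = (head_A @ h).reshape(rank, d_in)
    B = (head_B @ h).reshape(d_out, rank)
    return A, B

def apply_lora(W, A, B, alpha=1.0):
    """Standard LoRA update: base weight plus a scaled low-rank product."""
    return W + (alpha / A.shape[0]) * (B @ A)

rng = np.random.default_rng(0)
diff_dim, hidden_dim, rank, d_in, d_out = 16, 32, 4, 8, 8
cell = TinyGRUCell(diff_dim, hidden_dim, rng)
head_A = rng.normal(0.0, 0.1, (rank * d_in, hidden_dim))
head_B = rng.normal(0.0, 0.1, (d_out * rank, hidden_dim))
W = rng.normal(0.0, 1.0, (d_out, d_in))  # stand-in for one frozen base weight

h = np.zeros(hidden_dim)  # stand-in for the state after the repo snapshot pass
for _ in range(3):        # three hypothetical commits
    diff_embedding = rng.normal(0.0, 1.0, diff_dim)  # stand-in for an encoded git diff
    h = cell.step(diff_embedding, h)
    A, B = adapter_from_state(h, head_A, head_B, rank, d_in, d_out)
    W_adapted = apply_lora(W, A, B)
```

Each commit costs one small recurrent step plus two matrix products, which is why no retraining is needed per commit in a design of this shape. How the actual paper encodes diffs and generates adapters is not stated in the post.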