Hugo

286 posts

Hugo
@Jtseig

NYC · Joined September 2021
1.7K Following · 160 Followers
Hugo retweeted
Hugging Models @HuggingModels
NVIDIA just dropped PersonaPlex-7B 🤯 A full-duplex voice model that listens and talks at the same time. No pauses. No turn-taking. Real conversation. 100% open source. Free. Voice AI just leveled up. huggingface.co/nvidia/persona…
Replies: 206 · Reposts: 1K · Likes: 7.6K · Views: 1.4M
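A minimal sketch of what "listens and talks at the same time" means mechanically, assuming hypothetical stand-ins for the model's streaming interfaces (the post doesn't describe PersonaPlex's actual API): capture and generation run concurrently, so the model never waits for a turn.

import queue
import threading

# Hypothetical stand-ins; PersonaPlex's real interface is not described in the post.
def read_mic_chunk(i):
    """Pretend to capture ~80 ms of microphone audio."""
    return f"user-audio-{i}".encode()

def model_step(incoming):
    """One model step: ingest any incoming audio, possibly emit outgoing speech."""
    return b"model-speech" if incoming else b""

def play(samples):
    if samples:
        print("speaker <-", samples)

mic = queue.Queue()
done = threading.Event()

def listen():
    # Capture thread: user audio streams to the model continuously,
    # never gated on whose "turn" it is.
    for i in range(20):
        mic.put(read_mic_chunk(i))
    done.set()

def talk():
    # Generation thread: the model keeps producing speech while it listens.
    while not (done.is_set() and mic.empty()):
        try:
            incoming = mic.get(timeout=0.05)
        except queue.Empty:
            incoming = None  # nothing new arrived; keep generating anyway
        play(model_step(incoming))

t1, t2 = threading.Thread(target=listen), threading.Thread(target=talk)
t1.start(); t2.start(); t1.join(); t2.join()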
Hugo retweeted
Riley Coyote @RileyRalmuto
holy sh*t. this is hands down the coolest website i have ever found in my life. it's a live feed of the freaking Hubble Telescope AND James Webb Space Telescope. and the resolution is honestly so incredible i didn't think it was real. unbelievable. spacetelescopelive.org
Replies: 436 · Reposts: 6.2K · Likes: 42.7K · Views: 2M
Hugo retweeted
alex zhang @a1zhang
Much like the switch in 2025 from language models to reasoning models, we think 2026 will be all about the switch to Recursive Language Models (RLMs). It turns out that models can be far more powerful if you allow them to treat *their own prompts* as objects in an external environment, which they can inspect and manipulate by writing code that invokes LLMs! Our full paper on RLMs is now available, with much more expansive experiments than our initial blogpost from October 2025! arxiv.org/pdf/2512.24601
Replies: 252 · Reposts: 1.1K · Likes: 7.4K · Views: 2M
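A minimal sketch of the RLM idea as described above, assuming a hypothetical call_llm(prompt) -> str helper (the paper's actual interface may differ): the prompt is held as an ordinary object in the environment, and the model answers by writing code that maps sub-LLM calls over it and reduces the results.

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return f"<answer derived from: {prompt[:40]}...>"

class PromptEnv:
    """The model's own (possibly huge) prompt, held as a manipulable object."""
    def __init__(self, text: str):
        self.text = text

    def chunks(self, size: int = 2000):
        return [self.text[i:i + size] for i in range(0, len(self.text), size)]

def rlm_answer(question: str, env: PromptEnv) -> str:
    # Rather than stuffing everything into one context window, map a sub-LLM
    # over chunks of the prompt, then reduce the partial answers.
    partials = [
        call_llm(f"Context:\n{chunk}\n\nAnswer from this context only: {question}")
        for chunk in env.chunks()
    ]
    return call_llm("Combine these partial answers:\n" + "\n".join(partials))

print(rlm_answer("What changed in 2026?", PromptEnv("some very long prompt " * 500)))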
Hugo retweeted
Vishal Misra @vishalmisra
New work: Do transformers actually do Bayesian inference? We built “Bayesian wind tunnels” where the true posterior is known exactly. Result: transformers track Bayes with 10⁻³-bit precision.
And we now know why. I: arxiv.org/abs/2512.22471 II: arxiv.org/abs/2512.22473
🧵
Replies: 55 · Reposts: 243 · Likes: 1.8K · Views: 422K
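The post doesn't spell out the setup, but the simplest version of a "wind tunnel" (my illustration, not necessarily the papers') is a conjugate model where the exact posterior is available in closed form, so a transformer's prediction can be scored against it in bits:

import math

# Beta-Bernoulli: with a Beta(a, b) prior on p and h heads / t tails observed,
# the exact posterior predictive for the next flip is (a + h) / (a + b + h + t).
def exact_posterior_predictive(a, b, h, t):
    return (a + h) / (a + b + h + t)

def kl_bits(p_true, p_model):
    """KL divergence (in bits) between two Bernoulli predictive distributions."""
    return (p_true * math.log2(p_true / p_model)
            + (1 - p_true) * math.log2((1 - p_true) / (1 - p_model)))

p_star = exact_posterior_predictive(a=1, b=1, h=7, t=3)  # exactly 8/12
p_hat = 0.668                                            # illustrative model output
print(f"exact: {p_star:.4f}  model: {p_hat:.4f}  gap: {kl_bits(p_star, p_hat):.1e} bits")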
Hugo retweeted
Jackson Atkins @JacksonAtkinsX
My brain broke when I read this paper. A tiny 7-million-parameter model just beat DeepSeek-R1, Gemini 2.5 Pro, and o3-mini at reasoning on both ARC-AGI-1 and ARC-AGI-2. It's called Tiny Recursive Model (TRM), from Samsung. How can a model 10,000x smaller be smarter? Here's how it works:

1. Draft an initial answer: Unlike an LLM that writes word-by-word, TRM first generates a quick, complete "draft" of the solution. Think of this as its first rough guess.
2. Create a "scratchpad": It then creates a separate space for its internal thoughts, a latent reasoning "scratchpad." This is where the real magic happens.
3. Intensely self-critique: The model enters an intense inner loop. It compares its draft answer to the original problem and refines its reasoning on the scratchpad over and over (6 times in a row), asking itself, "Does my logic hold up? Where are the errors?"
4. Revise the answer: After this focused "thinking," it uses the improved logic from its scratchpad to produce a new, much better draft of the final answer.
5. Repeat until confident: The entire draft-think-revise cycle is repeated up to 16 times. Each cycle pushes the model closer to a correct, logically sound solution.

Why this matters:
- Business leaders: This is what algorithmic advantage looks like. While competitors pay massive inference costs for brute-force scale, a smarter, more efficient model can deliver superior performance for a tiny fraction of the cost.
- Researchers: This is a major validation for neuro-symbolic ideas. The model's ability to recursively "think" before "acting" shows that architecture, not just scale, can be a primary driver of reasoning ability.
- Practitioners: SOTA reasoning is no longer gated behind billion-dollar GPU clusters. This paper provides a highly efficient, parameter-light blueprint for building specialized reasoners that can run anywhere.

This isn't just scaling down; it's a completely different, more deliberate way of solving problems.
Replies: 344 · Reposts: 2K · Likes: 11.8K · Views: 2.2M
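A schematic of the draft/scratchpad/revise recursion described in the post, in PyTorch; the dimensions and single-layer cores are placeholders, not the paper's actual architecture:

import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    def __init__(self, dim=64, inner_steps=6, outer_cycles=16):
        super().__init__()
        self.inner_steps, self.outer_cycles = inner_steps, outer_cycles
        self.think = nn.Linear(dim * 3, dim)   # refines the latent scratchpad z
        self.revise = nn.Linear(dim * 2, dim)  # rewrites the draft answer y

    def forward(self, x):                       # x: embedded problem
        y = torch.zeros_like(x)                 # step 1: initial draft answer
        z = torch.zeros_like(x)                 # step 2: latent scratchpad
        for _ in range(self.outer_cycles):      # step 5: repeat draft-think-revise
            for _ in range(self.inner_steps):   # step 3: inner self-critique loop
                z = torch.tanh(self.think(torch.cat([x, y, z], dim=-1)))
            y = self.revise(torch.cat([y, z], dim=-1))  # step 4: revise the draft
        return y

print(TinyRecursiveSketch()(torch.randn(1, 64)).shape)  # torch.Size([1, 64])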
Hugo retweeted
Patrick Collison @patrickc
Good post on evolving stablecoin market structure. I would extend it further: yes, I think that stablecoin issuers are going to have to share yield with others, but this is just one instance. Everyone is going to have to share yield.

Today, the average interest on US savings deposits is 0.40% (FDIC data), and $4T of US bank deposits earn 0% interest.* Things aren't better in the EU: 0.25% average interest on non-corporate deposits; corporate deposits get just 0.51%.**

In my view, this is going to change: depositors are going to (and should!) earn something closer to a market return on their capital.

(Some lobbies are currently pushing, post-GENIUS, to further restrict any kind of rewards associated with stablecoin deposits. The business imperative here is clear -- cheap deposits are great -- but being so consumer-hostile feels to me like a losing position.)

* See FRED's memorably-titled QBPBSTLKDPDOFFDPNIDP time series.
** MIR.M.U2.B.L21.A.R.A.2250.EUR.N and MIR.M.U2.B.L21.A.R.A.2240.EUR.N from the ECB.
Quoting nic carter (@nic_carter): x.com/i/article/1973…
Replies: 84 · Reposts: 125 · Likes: 1.2K · Views: 595.1K
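Back-of-envelope on the scale involved; the $4T figure comes from the post, while the ~4% market rate is an assumption for illustration only:

zero_yield_deposits = 4e12   # $4T of US bank deposits earning 0% (from the post)
assumed_market_rate = 0.04   # illustrative short-term market rate, not from the post
foregone = zero_yield_deposits * assumed_market_rate
print(f"Interest foregone on 0%-yield deposits alone: ${foregone / 1e9:.0f}B per year")
# -> roughly $160B/yr that "sharing yield" would redistribute toward depositors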
Hugo retweeted
François Chollet @fchollet
AGI will not be an algorithmic encoding of an individual mind, but of the process of Science itself. The light of reason made manifest.
Replies: 65 · Reposts: 47 · Likes: 647 · Views: 49.8K
Laura Wendel @Lauramaywendel
Vibe coding is broken. You get pretty designs for quick dopamine hits, but nothing that's actually useful – until now. Introducing Endeva Apps, the most powerful idea-to-app platform – now in early preview. Comment below and I will send you an invite code!
Replies: 605 · Reposts: 58 · Likes: 866 · Views: 162.4K
Hugo @Jtseig
API: Collect info from a website/platform
AI Agent: Collect info from everything else
Replies: 0 · Reposts: 0 · Likes: 4 · Views: 38
Hugo retweeted
Quant Science @quantscience_
Financial Statement Analysis with Large Language Models (LLMs). A 54-page PDF.
Replies: 14 · Reposts: 291 · Likes: 1.7K · Views: 157.7K
Hugo retweeted
near @nearcyan
intentionally making a typo in the 13th word of my prompt to push the llm's CoT 3 nats to the left in latent space, one-shotting my entire feature
Replies: 35 · Reposts: 79 · Likes: 2K · Views: 79.1K
Beff (e/acc) @beffjezos
Thermodynamic post-atheism. Let Physics be thy God.
Replies: 109 · Reposts: 20 · Likes: 407 · Views: 44K
Hugo retweeted
François Chollet @fchollet
Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks. It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task in compute) and 87.5% in high-compute mode (thousands of $ per task). It's very expensive, but it's not just brute force -- these capabilities are new territory and they demand serious scientific attention.
Replies: 203 · Reposts: 1.6K · Likes: 8.7K · Views: 2.2M
Hugo retweeted
Bilawal Sidhu @bilawalsidhu
20+ labs just worked together to open-source Genesis - a physics engine with a VLM agent that turns text prompts into interactive 4D worlds. Think instant physics-accurate environments, camera paths, and character animations - all from natural language 🤯
Replies: 38 · Reposts: 205 · Likes: 1.5K · Views: 177.3K
Hugo retweeted
Andrej Karpathy @karpathy
People have an inflated sense of what it means to "ask an AI" about something. These AIs are language models trained basically by imitation on data from human labelers. Instead of the mysticism of "asking an AI", think of it more as "asking the average data labeler" on the internet.

A few caveats apply: in many domains (e.g. code, math, creative writing) the companies hire skilled data labelers (so think of it as asking them instead), and this is not 100% true when reinforcement learning is involved, though I have an earlier rant on how RLHF is just barely RL, and "actual RL" is still too early and/or constrained to domains that offer easy reward functions (math etc.).

But roughly speaking (and today), you're not asking some magical AI. You're asking a human data labeler, whose average essence was lossily distilled into the statistical token tumblers that are LLMs.

This can still be super useful, of course. Post triggered by someone suggesting we ask an AI how to run the government etc. TL;DR: you're not asking an AI, you're asking some mashup spirit of its average data labeler.
Replies: 546 · Reposts: 1.8K · Likes: 13.3K · Views: 1.8M