G. @ The Neuron

5.3K posts

@TheNeuronScribe

I am dumb but I am learning

Joined July 2024
4.4K Following · 107 Followers
G. @ The Neuron reposted
DailyPapers @HuggingPapers ·
Vision2Web: Evaluating coding agents on 193 real-world tasks across static, interactive, and full-stack development, with automated verification via GUI agents and VLM judges.
[image attached]
G. @ The Neuron reposted
Nick Dobos @NickADobos ·
Codex / Claude Code pro tip: NEVER RESUME A CONVERSATION AFTER HITTING THE LIMIT. Always start a new chat. If your last chat was using 500k of the 1M window, you will nuke 50% of your usage with a single "hello" message. Caching is weird. If you need that context, tell the AI to go read & summarize the previous thread.
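The back-of-envelope math behind this tip, under the simplifying assumption that resuming a thread re-processes its entire prior history (real cache billing varies; the numbers here are illustrative):

```python
# Why a one-word "hello" in a resumed thread can burn half the window:
# the whole prior conversation is sent along with it. Illustrative only;
# actual caching and billing behavior differ by provider.

WINDOW = 1_000_000      # total context window (tokens)
old_thread = 500_000    # tokens already in the previous conversation
hello = 5               # tokens in the new message

resume_cost = old_thread + hello   # full history re-processed
fresh_cost = hello                 # a brand-new chat only pays for "hello"

print(f"resume: {resume_cost / WINDOW:.0%} of the window")
print(f"fresh:  {fresh_cost / WINDOW:.4%} of the window")
```

A fresh chat that reads a written summary of the old thread pays only for the summary's tokens, which is the whole point of the tip.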
G. @ The Neuron reposted
Harper Carroll @HarperSCarroll ·
More Details on Anthropic's Leaked Code: PART 1

Anthropic accidentally exposed Claude Code's agentic source code, and here's what we can learn from it.

what happened
Due to a human misstep during publishing, a file inside Anthropic's shared code update contained internal source code:
∙ ~512,000 lines of code
∙ ~1,900 TypeScript files
Not a hack, just a packaging error, resulting in publicizing the IP of one of the greatest AI technologies ever made.

how this technology actually works
What we already knew & what the leaked code reveals about how Claude Code works.

1. tools: how Claude takes actions
Claude doesn't just generate text; it uses built-in abilities called "tools" to take action. See attached tools tables for examples.

2. it's a cycle
AI agents aren't magic. Claude repeats a loop:
1. Receives your prompt
2. Analyzes the prompt to determine the best tool(s)
3. Runs those tools
4. Injects the tool results into the conversation history (the "context window"), so the model can see and reference them
5. Repeats until no more tools are needed

3. in simplest terms
An AI agent is: a large language model + tools in a loop that keeps going until it determines that the task is done. The "intelligence" is Claude. The "agent" part is the loop.

what wasn't leaked
The large language models themselves (there are multiple Claude models) weren't leaked, just the tools. The weights, architectures, training data & training pipelines of these neural networks are still secret. & good thing: those cost hundreds of millions of dollars of compute to create. (Pop quiz: what are open vs. closed-source models? Comment below!)

engineering craft
The loop itself isn't the whole story; in fact, that's been public. What the leaked code reveals is just how much additional, complex scaffolding goes into making that loop reliable at scale, like:
∙ system prompt engineering
∙ context compaction to stay within token limits
∙ how tools are designed & sandboxed
∙ permission modeling
& much more. We'll cover more in the next post.

was this helpful?
Did you learn anything? Have any questions? What should I cover next? Let me know in the comments!
[image attached]
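The "LLM + tools in a loop" pattern described in the tweet can be sketched in a few lines. Every name below (fake_model, TOOLS, run_agent) is invented for illustration; this is not Anthropic's actual code, just the shape of the loop:

```python
# Minimal sketch of an agent loop: the model proposes a tool call, the
# runtime executes it, the result is injected back into the context, and
# the loop repeats until the model stops asking for tools.

TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
    "bash": lambda cmd: f"<output of `{cmd}`>",
}

def fake_model(context):
    """Stand-in for the LLM: requests one tool, then finishes."""
    if not any(m["role"] == "tool" for m in context):
        return {"tool": "read_file", "args": {"path": "README.md"}}
    return {"text": "Done: summarized README.md"}

def run_agent(prompt, model=fake_model, max_steps=10):
    context = [{"role": "user", "content": prompt}]        # context window
    for _ in range(max_steps):
        reply = model(context)
        if "tool" not in reply:                            # no tools left: done
            return reply["text"], context
        result = TOOLS[reply["tool"]](**reply["args"])     # run the tool
        context.append({"role": "tool", "content": result})  # inject result
    raise RuntimeError("agent did not terminate")

answer, history = run_agent("Summarize the README")
```

The intelligence lives in `model`; everything else is the loop, and the real engineering effort (sandboxing, compaction, permissions) wraps around exactly these few lines.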
G. @ The Neuron reposted
rohin @rohinlohe ·
25 years ago, Google helped create the first economic model for the web — ads. Today, that model is changing and we (@Cloudflare) are excited to enable every developer and site owner to shape how the world transacts. Honored to be a steward, alongside iconic businesses such as Coinbase, Stripe, Visa, Mastercard, Google, Microsoft, and many more. A special thank you to @programmer and @kleffew94 for their vision, and to Coinbase for their openness to making this an open protocol. Get started today & send your feedback my way: developers.cloudflare.com/agents/agentic…
Coinbase 🛡️ @coinbase

x.com/i/article/2039…

G. @ The Neuron reposted
Felix Rieseberg @felixrieseberg ·
Computer Use is now available on Windows! This gives Claude on Windows the ability to control your keyboard and mouse. It's really effective at letting Claude handle legacy apps.
G. @ The Neuron reposted
Lydia Hallie ✨ @lydiahallie ·
Digging into reports, most of the fastest burn came down to a few token-heavy patterns. Some tips:
• Sonnet 4.6 is the better default on Pro. Opus burns roughly twice as fast. Switch at session start.
• Lower the effort level or turn off extended thinking when you don't need deep reasoning. Switch at session start.
• Start fresh instead of resuming large sessions that have been idle ~1h.
• Cap your context window; long sessions cost more: CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000
We're rolling out more efficiency improvements, so make sure you're on the latest version. If a small session is still eating a huge chunk of your limit in a way that seems unreasonable, run /feedback and we'll investigate.
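A toy sketch of what a cap like CLAUDE_CODE_AUTO_COMPACT_WINDOW does: once the running token count crosses the cap, older messages get folded into a summary. The halving heuristic and summary stub below are invented for illustration and are not Claude Code's actual compaction logic:

```python
# Toy auto-compaction: keep recent messages under half the cap, replace
# everything older with a single summary placeholder. Illustrative only.
import os

CAP = int(os.environ.get("CLAUDE_CODE_AUTO_COMPACT_WINDOW", "200000"))

def maybe_compact(messages, tokens_of=len):
    """messages: list of strings; tokens_of: crude token counter."""
    if sum(tokens_of(m) for m in messages) <= CAP:
        return messages                     # under the cap: untouched
    recent, dropped = [], []
    budget = CAP // 2                       # reserve half the cap for recents
    for m in reversed(messages):            # walk newest -> oldest
        if budget - tokens_of(m) >= 0:
            budget -= tokens_of(m)
            recent.append(m)
        else:
            dropped.append(m)
    summary = f"<summary of {len(dropped)} earlier messages>"
    return [summary] + list(reversed(recent))
```

The practical effect matches the tip: a capped session never re-sends an unbounded history, so each turn costs a bounded number of tokens.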
G. @ The Neuron reposted
Yuchen Jin @Yuchenj_UW ·
We’ll soon be able to do this in Claude Code: “Claude, cure my cancer. Make no mistakes.”
[image attached]
G. @ The Neuron reposted
stevibe @stevibe ·
Gemma4 just dropped. How does it handle tool calls? I ran ToolCall-15 across the full Gemma4 family. Gemma4 31b = Qwen3.5 27b: both perfect 15/15. But here's what's wild: Qwen3.5 9b already clears 13/15, while Gemma4 needs 26b to match that.
Christoph Nakazawa @cnakazawa ·
I think I'm at a breaking point with LLM text. ChatGPT's language has become the worst. I have full-on AI fatigue.

The honest truth
Why this fixes it (short answer)
Clean fix
The safest bet
Final honest take
Best-case scenario (totally possible)
My straight recommendation
Bottom line (no sugarcoating)
If you want, tell me […] and I'll tell you what I'd personally do in your exact situation (not generic advice)
Instead of asking "[…]", think: 👉 "How do I maximize […]?"
My honest recommendation (based on what you said)
Let me be real with you upfront
Here's the pro move
That's actually a really good question—let's sharpen it so it actually makes sense.
Still real. Not peak performance
That's not just […]. That's […]

I wrote the first 3 myself, but then I went to a chat and just kept copying more examples. People don't write like this. Are we doomed to have to read the same poor sentence structure and wording for the rest of our lives? It's even worse when I have to read other people's LLM slop. Thank you, I can prompt an LLM myself. Do I have to pay a person to operate the LLM for me and write back slowly in human language?
G. @ The Neuron reposted
claire vo 🖤 @clairevo ·
Yep. See this over and over. You need tools, sure. But you really need:
- culture change
- technical readiness
- a new operating model
And it's hard to do if you haven't figured it out in the senior ranks. April 18-19 I'm teaching a small cohort of execs how: maven.com/clairevo/ai-na…
Brianne Kimmel @briannekimmel

A dangerous pattern for companies today is assuming signing up for a bunch of AI tools is a strategy. Every company needs to map out exactly what problems need to be solved and determine what products exist today & where custom agents need to be built.

G. @ The Neuron reposted
Weizhuo (Ken) Wang @KenWangWeizhuo ·
A person walks around campus for 5 hours with cameras. That's it. That's the training data. The result? A humanoid robot that traverses unseen buildings, crowds, and glass walls, with zero robot data and zero finetuning. EgoNav is here. egonav.weizhuowang.com

None of these behaviors were pre-programmed:
• Waiting for a door to open before entering
• Steering around glass walls invisible to depth sensors
• Yielding to pedestrians and resuming
• Re-routing when furniture is rearranged

All emerged from 5 hours of a human walking around. The prior is real. (1/6) #Humanoid #Robotics #DiffusionModel #EgoNav
Pietro Schirano @skirano ·
If you thought the Chinese models were good, just wait a couple of months, now that all the distillation poisoning has been removed from Claude Code.
G. @ The Neuron reposted
atomic.chat @atomic_chat_hq ·
Running Hermes agent locally with Gemma4
Device: MacBook Air
CPU: M4
RAM: 16GB
Open source. Free. Private.
With TurboQuant cache in the @Atomic_Chat_HQ app