
Joseph Gardi
219 posts


@DimitrisPapail also, Claude Code has a button that sends any terminal task to the background

@DimitrisPapail just ask it to use tmux. Without any further instruction the models already know to repeatedly do sleep 30 && tmux capture-pane
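A minimal sketch of the polling loop described above, assuming the long-running job prints a recognizable marker when it finishes. The helper names (`capture_pane`, `wait_for_marker`), the marker string, and the poll limit are hypothetical; only `tmux capture-pane -p -t <target>` is the real tmux command.

```python
import subprocess
import time
from typing import Callable, Optional


def capture_pane(target: str) -> str:
    """Return the visible contents of a tmux pane as text (requires tmux)."""
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", target],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wait_for_marker(
    target: str,
    marker: str,
    interval: float = 30.0,
    max_polls: int = 120,
    capture: Callable[[str], str] = capture_pane,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll the pane every `interval` seconds until `marker` appears.

    Returns the pane text containing the marker, or None if the marker
    never shows up within `max_polls` polls. `capture` and `sleep` are
    injectable so the loop can be exercised without a real tmux session.
    """
    for _ in range(max_polls):
        output = capture(target)
        if marker in output:
            return output
        sleep(interval)
    return None
```

With a real session this might be called as `wait_for_marker("train", "DONE")`, where both the session name and the marker are placeholders.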

Single-threaded agents waste test-time compute.
I’ve seen this repeatedly in my own work: Claude Code and Codex kick off a GPU run or a long terminal command, then sit idle waiting for a response. So the model is blocked until the environment responds, then reasons about the result and next steps.
Why?
That idle time is wasted test-time compute. The model isn’t thinking while it could!
The agent should be pipelining: while waiting on experiment results, it could be planning the next set of experiments, exploring alternative hypotheses, or running a supporting subtask in the background.
There is no reason for serial execution when the bottleneck is “the environment”.
This is not multitasking or “agent swarms” but a single agent using its own idle cycles to increase effective test-time compute per second of wall-clock time.
During that time the model could instead be simulating likely outcomes of the running experiment and pre-planning its next action. Something like speculative execution, but for multi-turn reasoning…
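A rough sketch of the pipelining idea, assuming the agent loop is written with asyncio. Everything here is hypothetical: `run_experiment` stands in for the slow environment call, `plan_next_steps` stands in for the model's speculative planning, and the returned result string is a placeholder, not real data.

```python
import asyncio


async def run_experiment(delay: float) -> str:
    """Stand-in for a slow environment call (GPU run, long command)."""
    await asyncio.sleep(delay)
    return "loss=0.42"  # placeholder result, not a real measurement


async def plan_next_steps() -> dict[str, str]:
    """Stand-in for reasoning done while the environment is busy.

    Prepares one plan per anticipated outcome, so the agent can act
    as soon as the real result arrives.
    """
    plans = {}
    for outcome in ("improved", "regressed"):
        await asyncio.sleep(0)  # yield so the experiment task can progress
        plans[outcome] = f"plan if the run {outcome}"
    return plans


async def pipelined_step(delay: float = 0.05):
    # Start the experiment without blocking on it.
    experiment = asyncio.create_task(run_experiment(delay))
    # Use the otherwise idle time to pre-plan for the likely outcomes.
    plans = await plan_next_steps()
    # True if planning finished while the experiment was still running.
    overlapped = not experiment.done()
    result = await experiment
    return result, plans, overlapped
```

Running `asyncio.run(pipelined_step())` returns the experiment result together with the plans prepared during the wait. In a real agent, the planning coroutine would be a model call and mispredicted plans would simply be discarded, which is the analogy to speculative execution.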

On January 23, I’ll be free soloing Taipei 101 in Taiwan. It’s been a long time goal of mine and it’ll be the most ambitious urban climb that I’ve attempted. It’s a nearly 1,700 ft tower! What’s not to like?! And I’ll be doing it LIVE on @netflix. Tune in Friday, January 23 at 8PM ET / 5PM PT


@GergelyOrosz only 2 acceptable answers. pragmatic programmer and 23 design patterns

@jiratickets sounds like someone who will be forgotten in history

@CalebPeffer but if everyone applies this logic then we would only have developer tools and no real-world use cases. Just software for the sake of software. Meanwhile, there are wide-open opportunities if you look outside the Silicon Valley bubble

@AlexHormozi but what if you know your startup is just a gimmick?

@GanatraSoham @composio how do you get the ID of an existing Google Doc?

Set up MCP on Cursor with Google Docs in less than 2 mins!!
I used Cursor to create PRDs in Google Docs
Here's how you can do it too:
- Go to the @composio MCP directory
- Search for Google Docs and grab your SSE URL
- Paste the URL and set up MCP in Cursor
- Use Cursor Agent to authenticate and create a PRD
Check out the 100+ tools available at mcp.composio.dev

Tether paid $2 in fees to move $1.5B.
This is roughly a 0.00000013% fee ($2 / $1.5B ≈ 1.3 × 10⁻⁹). Possibly the lowest relative transaction fee ever paid on any transaction in the history of human finance.
The Bitcoin blockchain is massively underpriced.
Paolo Ardoino 🤖@paoloardoino
Tether Group is moving 14000 BTC to address bc1q8qpfmpf6hcu3tgfvp8dgtf534rws8uhsl9vtk6p2f3r2gnqdz5sqxmty6q as part of its investment in Twenty One Capital (XXI) mempool.space/address/bc1q8q…

A lot of people are ignoring that Go is becoming a commonly used language for prompting pipelines. Python in prototypes and Go in production is another common combo.
Viktor Eriksson@cviktore
Me and the team at @lovable just spent two months rewriting 42,000 lines of code from Python to Go. Technical deep dive of why we did it +what this means: // 1

@bo_wangbo @spyced by the way, I noticed some strange results with late chunking. Providing more context hurt accuracy, and setting the task to retrieval hurt accuracy. Can share details if you'd like

@sama Tbh annie has persuaded me you should be treated with caution, but it has little to do with her object level claims and everything with my belief in substantial heritability of personality and mental disorders.

@rohanpaul_ai This paper is making a mockery of the benchmarks. It's basically just saying that the datasets are so small that you can copy the whole dataset into ChatGPT. Makes all the previous papers on those benchmarks look ridiculous

Precomputed key-value caches make knowledge retrieval 40x faster than traditional RAG.
Cache-augmented generation replaces traditional retrieval-augmented generation by preloading documents and precomputing key-value caches, making knowledge tasks faster and more accurate.
-----
🤔 Original Problem:
Traditional RAG systems suffer from retrieval latency, errors in document selection, and complex system architecture that requires careful tuning and maintenance.
-----
🔧 Solution in this Paper:
→ The paper introduces Cache-Augmented Generation (CAG), which preloads all relevant documents into the LLM's context before inference.
→ CAG precomputes key-value caches from documents, storing them for future use rather than retrieving during runtime.
→ The system operates in three phases: external knowledge preloading, inference with cached context, and efficient cache reset.
-----
💡 Key Insights:
→ Eliminating retrieval during inference dramatically reduces response time and system complexity.
→ Preloading context enables holistic understanding across all documents.
→ CAG works best when document collections fit within LLM context windows.
-----
📊 Results:
→ CAG achieves the highest BERTScore (0.7759) on HotPotQA, outperforming both sparse and dense RAG systems.
→ Generation time drops from 94.34s to 2.32s on large datasets.
→ Consistent performance improvement across both SQuAD and HotPotQA benchmarks.


@VictorTaelin But not everything can be proven. Some things are empirical