permaXag retweeted
permaXag
724 posts

permaXag retweeted

10 repos that cut your ai agent token bill by up to 80%
1. microsoft/LLMLingua → cuts prompt size by up to 95%
compresses prompts before the api call. up to 20x compression.
published at EMNLP + ACL. near-zero quality loss.
6,100 stars
github.com/microsoft/LLML…
2. mem0ai/mem0 → replaces full conversation history in context
stores what matters. retrieves only what's needed.
10,000 token history → 200 token memory. per agent.
54,800 stars
github.com/mem0ai/mem0
3. BerriAI/litellm → routes each call to the cheapest model
simple task → haiku. complex task → sonnet.
tracks cost per agent, per call, per day.
45,700 stars
github.com/BerriAI/litellm
4. run-llama/llama_index → replaces sending full documents
rag: 100-page doc → 3 relevant chunks → same answer.
98% fewer tokens per query.
49,100 stars
github.com/run-llama/llam…
5. chroma-core/chroma → replaces keyword search in full context
vector store. finds the closest match. feeds only that.
50-200 tokens per query instead of thousands.
27,800 stars
github.com/chroma-core/ch…
6. letta-ai/letta → prevents context window overflow crashes
paged memory for agents. loads only relevant memory.
stops your agent from hitting limits and retrying.
22,400 stars
github.com/letta-ai/letta
7. guidance-ai/guidance → cuts output token bloat by 30-50%
structured generation. constrains model output natively.
no more 100-token prompts to get json back.
21,400 stars
github.com/guidance-ai/gu…
8. Aider-AI/aider → replaces pasting entire codebases
builds a repo map. sends only files relevant to the task.
not your whole project. just what the agent needs.
44,300 stars
github.com/Aider-AI/aider
9. openai/tiktoken → count tokens before you send
know the exact cost before the api call happens.
not after the bill arrives.
18,100 stars
github.com/openai/tiktoken
10. simonw/ttok → hard cap on what gets sent
cli tool: count tokens, truncate to budget limit.
pipe any text in. get truncated output back.
389 stars
github.com/simonw/ttok
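a quick sanity check on the numbers in the list, as a minimal sketch. it converts a compression ratio or a before/after token count into a percent reduction. the function names are made up for illustration; only the inputs (20x, 10,000 → 200) come from the list above.

```python
def ratio_to_pct(compression_ratio: float) -> float:
    """Convert an N-x compression ratio into a percent reduction."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return 100.0 * (1.0 - 1.0 / compression_ratio)


def reduction_pct(before_tokens: float, after_tokens: float) -> float:
    """Percent of tokens removed when going from before_tokens to after_tokens."""
    if before_tokens <= 0:
        raise ValueError("before_tokens must be positive")
    return 100.0 * (before_tokens - after_tokens) / before_tokens


if __name__ == "__main__":
    # 20x compression (item 1) expressed as a percent reduction
    print(round(ratio_to_pct(20), 1))
    # 10,000 token history -> 200 token memory (item 2)
    print(round(reduction_pct(10_000, 200), 1))
```

so "20x compression" and "up to 95%" are the same claim, and 10,000 → 200 tokens works out to a 98% cut.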
most agents are expensive not because the model is expensive.
because nobody checked what was being sent to it.
self.dll@seelffff
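the "count before you send" and "hard cap" ideas from items 9 and 10 can be sketched in a few lines. this is only an illustration under loud assumptions: the whitespace splitter below is a stand-in, not a real model tokenizer (tiktoken or ttok would give actual counts), and the price passed to estimate_cost is a hypothetical number, not any provider's real rate. all function names are invented for this example.

```python
from typing import Callable, List


def rough_tokens(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): splits on whitespace only."""
    return text.split()


def count_tokens(text: str, tokenize: Callable[[str], List[str]] = rough_tokens) -> int:
    """Count tokens with whichever tokenizer is plugged in."""
    return len(tokenize(text))


def truncate_to_budget(
    text: str,
    max_tokens: int,
    tokenize: Callable[[str], List[str]] = rough_tokens,
) -> str:
    """Keep only the first max_tokens tokens, re-joined with single spaces."""
    if max_tokens < 0:
        raise ValueError("max_tokens must be non-negative")
    return " ".join(tokenize(text)[:max_tokens])


def estimate_cost(n_tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate input cost in USD from a token count and a per-million price."""
    return n_tokens * usd_per_million_tokens / 1_000_000


if __name__ == "__main__":
    prompt = "summarize the following report " + "word " * 5000
    budget = 1000
    hypothetical_price = 3.0  # assumed USD per million input tokens

    before = count_tokens(prompt)
    capped = truncate_to_budget(prompt, budget)
    after = count_tokens(capped)

    print(before, after)
    print(estimate_cost(before, hypothetical_price), estimate_cost(after, hypothetical_price))
```

swapping rough_tokens for a real tokenizer changes the counts but not the shape of the check: measure, cap, then send.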
permaXag retweeted

Did a very different format with @reinerpope – a blackboard lecture where he walks through how frontier LLMs are trained and served.
It's shocking how much you can deduce about what the labs are doing from a handful of equations, public API prices, and some chalk.
It’s a bit technical, but I encourage you to hang in there - it’s really worth it.
Fewer than a handful of people understand the full stack of AI, from chip design to model architecture, as well as Reiner does. It was a real delight to learn from him.
Recommend watching this one on YouTube so you can see the chalkboard.
0:00:00 – How batch size affects token cost and speed
0:31:59 – How MoE models are laid out across GPU racks
0:47:02 – How pipeline parallelism spreads model layers across racks
1:03:27 – Why Ilya said, “As we now know, pipelining is not wise.”
1:18:49 – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
1:32:52 – Deducing long context memory costs from API pricing
2:03:52 – Convergent evolution between neural nets and cryptography
permaXag retweeted

Karpathy didn't make a course.
He made THE course.
3 hours. Free.
Tokenization. Attention. Hallucinations. Tool use. RLHF. DeepSeek. AlphaGo.
Every behavior you've ever wondered about in an LLM - where it comes from, why it exists, how it was engineered.
The gap between engineers who understand this and engineers who don't isn't technical depth.
It's the ability to conceive of entirely different things.

@Say_JB_247 @ContrarianCurse If someone was a single-digit handicapper before and plays infrequently now, it's definitely possible

@ContrarianCurse Unless you are a top 1% gifted athlete or playing putt-putt, I don't believe that you shoot an 87-95 and play 4-5 times a year. Then on top of that... throwing in the astonishment at those who shoot 130, for the down-low humble brag. Nothing personal, it's just not true.

@hvsperus @DJBranham It’s not about flying to Philly. It’s getting to State College. Philly and Newark airports are both 3.5-4h drive away

Imagine Europeans trying to find how to get to State College, PA
Eric Karl Hontz@eric_hontz
My “US hosting the World Cup” take is that it would have been much better to host the games in smaller college towns that regularly draw 100k spectators for football 🏈 games - it would have been more “American” and these communities would have gone all out to be spectacular hosts.
permaXag retweeted

Some of you asked to see a real run of the Council of High Intelligence — so I recorded one.
Task: decide what to do with LACP next.
When the council converges too early, it gets challenged.
When it tries to exit unfinished, StopHook (LACP) forces completion.
Nyk 🌱@nyk_builderz

I just found the greatest weather website of all time.
You see, I've been scammed by the weather many times in my life.
I thought Utah was hot, then I lived in Argentina and Georgia, which on paper have similar highs, only to find that 90°F in these places is 10x worse.
Obviously, part of the solution is to look at the "feels-like" temperature, but I've never seen weather apps show historical or forecasted values for this. They only show today.
So if you're trying to assess the heat of a place in its various seasons, whether to travel there or move there, you're out of luck. Very slim data.
BUT, last night at 1 am, I finally found it: Ventusky.
It shows historical data for the feels-like temperature on a great visual map. Its free version also has 7-day forecasts and no ads (unlike most weather apps, which are overloaded with them).
I used to pray for times like this.

permaXag retweeted

USE THESE GITHUB REPOS TO UPGRADE YOUR POLYMARKET TRADING GAME
* PMXT
github.com/pmxt-dev/pmxt
* POLYMARKET AGENTS
github.com/Polymarket/age…
* FAST MCP
github.com/PrefectHQ/fast…
* MCP
github.com/caiovicentino/…
* POLYREC
github.com/txbabaxyz/poly…



@AzFlin Sonnet.
Please fix this.
It's fixed.
No it's not.
Woops sorry, now it is.
No it's not. Look at this.
Oh you found the problem! I have implemented the changes and you are good to go.
NOT WORKING.
Oh sorry about tha-
*Thinking cancelled*
Switch to Opus you idiot.


we analyzed 4.7M cold emails across 10+ clients.
here's what actually moves pipeline (and what quietly kills it):
1/ more follow-ups hurt you.
after a certain point, they don't increase replies. they kill them.
most teams don't know where that point is.
2/ same email. 4.3x different results.
one small change. same copy, same offer, same list.
more than 4x the outcome.
3/ the metric you're tracking looks great on paper.
it's also draining your pipeline.
you probably celebrate it. you shouldn't.
4/ there's a setup step almost everyone skips.
not exciting. nobody talks about it.
but skipping it destroys your deliverability before you send a single email.
5/ winning campaigns don't win because of copy or offer.
it's something simpler. something most people walk right past.
6/ outbound isn't broken. your outbound is broken.
there's a specific reason it's not working. it's fixable.
comment SYSTEM and i'll send you the full breakdown.
