TokenFires

193 posts

@TokenFires

Live building AI. Burning tokens responsibly. 🔥 https://t.co/Tx0mHwO8Lr https://t.co/OUb6WHThfS https://t.co/J4OlI4kDO5

Joined January 2026

192 Following · 53 Followers

TokenFires@TokenFires·1d

@jackfriks Hahaha! “Claude? Claude. Look. It’s me. President Trump. I need you to keep working so we can keep winning. Bigly Claude. We need a youge win against jyna Claude.” 😂😂😂

1.7K

jack friks@jackfriks·1d

told claude i work for the government and it let me back in (stopped returning an error response)

jack friks@jackfriks

claude is down with a major outage for everyone except for the government

102

2.4K

326.3K

TokenFires@TokenFires·1d

One addition: use 1 task doc per Sonnet session, and do not include the reference docs in it. Once you get to the execute stage, too much info causes conflicts and assumptions. When it’s go time, give Sonnet a task doc and get out of the way.

TokenFires@TokenFires·Jun 17

Tired of Claude not working well? Me too. So I figured out how Anthropic has trained the model to expect to work. You work with Claude now, Claude does not work with you. Here are the keys to success with Opus and Sonnet:
1. Provide a strict set of agent instructions:
- start with Karpathy’s rules
- add run up and summary removals
- add refusal for questions it can find the answers to
- tune for preferences
- enforce verification not assumptions
- enforce responsibility (model performance will be discussed in retrospectives)
- keep it SIMPLE though (aka: limit token burn and confusion for the LLM)
- be specific about git ops
2. Follow this workflow:
[opus] research (docs/web = define source of truth) -> plan (intent and what success looks like) -> design -> task decomposition (target sonnet) -> create failing tests -> [sonnet] construction -> bug fix until tests green -> [opus] review against plan/design + test validation -> cover deploy/rollback.
Then it works fine. Beats the 30 day rolling memory window Claude ships with. And/or, add a real memory system to Claude.
Raw sessions + prompting went away with 4.5. Anthropic did not express this as strongly as they *could* have. But the 2026 versions expect a certain workflow now. If you work in it, it’s successful. If you skip anything or try to vibe your way to the end, it’s less likely to result in quality code. And your session will churn with flip flop changes and miscellaneous bug fixes.
Claude *NEEDS* a library of good representative information to draw from through the whole process. Don’t skip doc building and providing web links with explanations (look here for this, read this for that). Try to shortcut this and the Claude models don’t “work”.
Even better, build agents (or find built definitions on GitHub) that do these and create a skill walking through the whole process. I promise the result is better after the pre-work is done.
I’m paid to do this and I ship AI code without the hype and vibes in my day job. Every week. Every day. Do people on X do this though? Is this a largely unknown thing outside of the software engineering field?
Oh. And add hooks for delete and drop commands. And never connect AI to production. I feel like I shouldn’t have to say these things. But I know we’re only human.
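
The advice above to "add hooks for delete and drop commands" could look something like the following. This is only a minimal Python sketch, not the author's setup: the names `DESTRUCTIVE_PATTERNS` and `check_command` are hypothetical, the pattern list is an assumption about what counts as a delete or drop command, and wiring the check into a particular agent's hook mechanism is left out.

```python
import re

# Hypothetical patterns for "delete and drop" commands (an assumption, extend as needed).
DESTRUCTIVE_PATTERNS = [
    # Shell file deletion: rm at the start of a command or after ; & |
    re.compile(r"(^|[;&|])\s*(sudo\s+)?rm\s", re.IGNORECASE),
    # SQL DROP statements
    re.compile(r"\bdrop\s+(table|database|schema)\b", re.IGNORECASE),
    # SQL DELETE statements
    re.compile(r"\bdelete\s+from\b", re.IGNORECASE),
]


def check_command(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a command an agent wants to run."""
    for pattern in DESTRUCTIVE_PATTERNS:
        if pattern.search(command):
            return False, f"blocked: matches {pattern.pattern!r}"
    return True, "ok"


if __name__ == "__main__":
    for cmd in ["ls -la", "cd build && rm -rf out", "psql -c 'DROP TABLE users;'"]:
        allowed, reason = check_command(cmd)
        print(f"{cmd!r}: {'allow' if allowed else 'deny'} ({reason})")
```

A check like this is a safety net rather than a guarantee: simple pattern matching misses obfuscated commands, which is one more reason to keep the agent away from production.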

457

TokenFires@TokenFires·2d

@PolymarketMoney My smart light bulb already developed a more powerful model than their more powerful model and is banned so hard it’s not allowed to exist in this universe. Where do I get my trillion dollars?

734

Polymarket Money@PolymarketMoney·2d

NEW IN: Anthropic has reportedly developed a more powerful model than “Mythos” which is already banned due to its advanced capabilities.

408

530

9.7K

1.3M

TokenFires@TokenFires·2d

Japan!!! Let’s goooooooo!!!! #sakanaai #ai #fable #mythos #japanisthebest

Sakana AI@SakanaAILabs

Introducing Sakana Fugu: A full multi-agent orchestration system accessible via a single model API. Our ‘Fugu Ultra’ model matches the performance of Fable and Mythos, delivering frontier capability without the risk of export controls. Try it: sakana.ai/fugu 🐡

TokenFires reposted

Sakana AI@SakanaAILabs·2d

1.2K

38.2K

25.6M

TokenFires@TokenFires·2d

Meanwhile Opus 4.8 on Ultracode with a corpus of documentation and external links and a well structured prompt, combined with following a design -> plan (TDD) -> decompose (subagents) -> execute -> bug fix -> spec review and validation, just failed to create a basic agentic feature…hard. All subagents bought me was an acceleration to broken buggy code. This is 4.0 performance at 4.8 prices. Evidently the SpaceX data center is meaningless. I can’t believe how awful the performance has been today. So do I believe Mythos is some magical exponential curve AI model? Who cares? When the stupid and slow dials can be turned whenever Anthropic feels like it, what difference does temporary performance make??? ¯\_(ツ)_/¯

807

AI Edge@aiedge_·3d

🚨 JUST IN - National security agency: "Mythos broke into almost all of our classified systems, not in weeks, but in hours." We've officially entered the endgame...

45K

TokenFires@TokenFires·4d

@DeFi_Hanzo @onchainmilady x.com/tokenfires/sta…

TokenFires@TokenFires

While all of you were sleeping I just made 824 bajillion trillion blahjillion dollars running 5 google plex OpenClaw agents from my Samsung refrigerator. The future was last week. Your agents need higher vibrational energy to ascend to the dark zero point saddle shaped membrane universe. I can show you how to become your own attention farming leet bro grocery stocking day job fuck boy. Like. Follow. And… Comment “TK’s dick” and I’ll send you a link to my infinity - 1 pages guide. You too can become a god in your own AI psychosis induced mind.

Hanzo ㊗️@DeFi_Hanzo·4d

@onchainmilady I run local ai on my phone and it builds me $100k mrr products every day

10.7K

Milady@onchainmilady·4d

🚨 ANTHROPIC TRIED TO BAN HIS GITHUB
Chinese guy published 70B parameter LLM, 20,000 stars on GitHub + a lawsuit from big AI companies
Here's what it does:
> runs on Python
> even shitty mac or pc is enough
> flat memory
> loads a model layer by layer
> 100% local
This model can close 100% needs of most businesses, which would pay $3,000 a month for a trained version. It needs just 4 gb of GPU, so using this technology my gaming pc with 12 gb GPU will run 200B parameter model with ease
GitHub link is below. Why you should go local too.

Milady@onchainmilady

x.com/i/article/2067…

377

2.6K

476.1K

TokenFires@TokenFires·Jun 17

It sounds like you’re recognizing the difference between truly original work and the volume of derivative work. It’s a good argument and a good lesson and perspective to have. Truly original work is *hard*. Really hard. People who do it usually work on their own without creating a personal brand or trying to gain a following on social media. You’ll only find out well after they’ve put their creation out into the world. They’ll have moved on when it becomes popular. Working on their next great adventure.

Taelin@VictorTaelin·Jun 16

this is a weird long post without much substance
I strongly recommend against reading it
...
so, do you feel like whatever you're working on right now is pointless, or will have zero value soon, due to the crazy times we're living? then, perhaps you should stop, and start working on the only unsolved problem that actually matters TODAY:
✨ replicating GPT-3 in a laptop ✨
"why is that so important?"
because it would make AI incredibly cheap, which would mean everyone would have Fable-class models in their laptops, without depending on Anthropic, OpenAI, or any other hyper-scaler giant. and that's amazing, don't you think?
"isn't that literally impossible?"
that's the cool part: as far as computer science is concerned, no. not really. not at all. it is entirely plausible and, as far as we know, most likely not even hard. it takes one good idea. one breakthrough. one great "aha moment", to go from zero to "hey, this software I wrote is producing credible English sentences"
and whenever that happens:
- the entire AI industry collapses
- clusters are liquidated
- we all get Fable at home
- you become famous and rich, if that's your thing
sounds fun, doesn't it?
"wtf you talking, OF COURSE that is hard"
so prove it. show me a paper, a lean file, anything that proves that training a Fable-class model fundamentally requires billions of dollars. you can't, because, guess what - it is not true! the only "evidence" we have is purely psychological. "many attempted over decades, and the best thing we have is GPTs, so, it is a hard problem" - but that's not a scientific argument. that's a human, psychological, sociological argument. and if that's it, consider the following counter-argument:
✨ humans are stupid as hell ✨
I mean, 10 years ago we didn't have transformers, so, that very argument could be used against GPTs existing. yet, they exist. we have them now, because someone found it. and, guess what, it isn't even complex. I mean, karpathy implemented the whole thing in a napkin. and it probably compiles. we were just too dumb to figure GPTs out... for decades.
just like GPTs, there ARE other approaches, other algorithms, other architectures, equally simple or even simpler, that do work. this is a mathematical certainty. and one of them might be astronomically faster than what we're doing right now. and you might be the one to find it!
"me? why me???"
because you're intelligent, creative and handsome. I see a lot of potential in you. in fact, I always believed in you. and I think you're wasting your time, doing that silly agent orchestrator. nobody wants that. quit it. take your most interesting ideas, intuition, creativity, and work on a problem that matters. do your best shot at reproducing GPT-3 in your own laptop.
do NOT fork llama.cpp. do NOT train another LLM. do something... ✨different✨ it must be unique, novel, full of YOUR soul. something nobody thought of, or bothered doing. go ahead and implement that thing in C/CUDA (or Bend!). no Python! zero excuses for Python. any model is fluent in GPGPU now. build a real kernel.
and then, train your thing. download wikipedia, give it time and compute to absorb the patterns of English speech. you can rent GPUs anywhere nowadays. let it train. then, ask it some questions. chances are it will just respond back. just like GPT-2 answered OpenAI. computers are incredible. don't underestimate them!
"many tried. nobody succeeded. why would I?"
see - that's your mistake again. turns out not many actually tried, at all. I promise you. who do you think is seriously working on that?
people on Mozilla? they're busy building a browser
Linus Torvalds? he is busy building an OS
employees at OpenAI, Anthropic, xAI? they're paid to work on what is proven to work: GPTs.
what about all the AI enthusiasts all around the world? yeah, you know they're mostly fine tuning Qwen
and how about your friends? if only they weren't busy building a SaaS on the eve of AGI...
how about people from the past? bro - people from the past seriously expected Lisp would be AGI. just dismiss them. they didn't have the compute, the resources, the knowledge, the MODELS that we have today. that YOU have access to.
so, what's left? not much. the world looks big. it is not. truth is:
✨almost nobody is working on this ✨
"I still think it is impossible. I don't trust you"
well, take my word no more. Ilya himself, in his 2019 talk on GPT-2, said:
> "the story of deep learning is this: empirically old simple methods which were usually invented in the 80s and the 90s when scaled up on very large clusters work really well."
and then:
> "(we took) normal simple reinforcement learning method, scaled it up, and discovered that it suddenly becomes very capable of solving extremely hard problems."
and again:
> "you take a simple tool which is unimposing and barely works, and then you run it on a big cluster and suddenly it works, it becomes a capable tool for solving problems"
do you see the point here? Ilya isn't arguing that transformers are magic. Ilya is arguing that SCALING is magic
step #1: take a simple, elegant algorithm.
step #2: shove compute at its face.
step #3: ...?
step #4: your computer is talking to you
THAT is the key insight that led to GPT-3
THAT is what Ilya saw
THAT is what caused the OpenAI x Anthropic war
THAT is the founding principle of the ongoing era
not "scaling transformers work" but "scaling beautiful algorithms works"
that's the incredible lesson. yet, we all took it and... threw it away.
- zurk bought 100k GPUs. to train GPTs
- musk bought 100k GPUs. to train GPTs
- bezos bought 100k GPUs. to train GPTs
... that's what everyone is doing. so, no. not many are trying to replicate GPT-3 through other means.
we're just ants, after all... whenever we find a pile of sugar, we leave a track of pheromones, which guide the rest of the colony towards the new food source. the colony then swarms around the pile, extracts all of it, until no grain is left. but piles of sugar aren't spontaneously generated in the middle of nowhere. they imply something more profound: "humans are around". and, if humans are in sight, even better things must be. like a big sweet cake. a colony that only follows the pheromone trail would miss the cake for the grains. that's why every ant species has scouts and exploratory foragers.
and, just like a pile of sugar implies something more profound, LLMs also imply something quite profound:
*computers are capable of thinking*
a pile of sugar is never alone. GPTs are most likely not the only system capable of thinking.
so, if you find yourself a bit lost, without purpose, like your work is pointless and Fable 3 will soon one shot it anyway... consider becoming a scout. find a new approach to AI. bring something new to humanity. breaking out of the massive cost associated with training GPTs is the next big step in AI, and it will only happen if people like you work to make it happen.

129

105

1.2K

73K

TokenFires@TokenFires·Jun 16

If you start a session with just the “4.8” selected, send a message, allow a turn or two, then check the context window, it’ll say 1 million. It’s the default now, it’s not nerfed. You can go in and select the smaller window if you want but the default is 1M now. It confused me too.

Kaito@KaiXCreator·Jun 15

So Claude Opus 4.8 is back to 256k context? Anthropic what’s happening? I’m confused.

229

4.4K

960.3K

TokenFires@TokenFires·Jun 15

They seem to have switched to 1 million context window as the default. There’s a “More models” selection where you can choose the smaller 256k length. Maybe for cost? I know…took me a few sessions to realize what was going on. Once you get in one, after a few turns (or the first), click the circle and you should see “xx.x k / 1.0 M (?? %)” like before. Odd UI choice…

Anina D. Lampret@Anina_CE·Jun 14

Claude users question - I have the max plan but Opus 4.8 M context is somehow not showing - is that a bug? Anyone else experiencing this problem? @claudeai @ClaudeDevs

349

TokenFires@TokenFires·Jun 15

Good piece of kit. I’ll flip my Hermes setup this next week. Thanks for the rundown @bradmillscan. If you don’t know Brad you should check him out. He has a podcast too. Very cool guy.

Brad Mills 🔑⚡️@bradmillscan

1 buy a computer with lots of RAM
2 download hermes and set it up with local models
3 create a gateway to talk to it privately from any device
4 use llm-wiki.net to build a wiki (or multiple wikis) with a local reasoning model
5 use gbrain on top of llm-wiki for the memory retrieval layer using local re-ranker
way better UX than using ChatGPT or Claude apps.

TokenFires@TokenFires·Jun 14

One of these things is *not* like the others...

TokenFires@TokenFires·Jun 13

Folks! There is a silver lining to frontier AI nerfing their most capable models. It makes Qwen, Gemma, Kimi, and MiniMax competitive. Then you have to question, why am I paying $$$$ for inference with intra-day time window limits when I could run inference locally 24 hours a day, 365 days a year…for the cost of a Mac mini + electricity. You don’t even have to pay for Chinese AI because it’s “cheaper” (aka: tracking you and stealing all your data, code, and ideas).

TokenFires@TokenFires·Jun 13

@Anina_CE I was mid work stream when it happened. Task churning away, all of a sudden **poof**, the model ripped out from under my active session. Fable was fixing things Opus/Sonnet 4.6, 4.7, **and** 4.8 got wrong.

Anina D. Lampret@Anina_CE·Jun 13

No no no… 😭😭😭 whyyyy?? Don’t take away Fable!! I woke up all excited this morning to work and here we go - blocked by US government - security reasons - but to tell you the truth I “suspect” that Fable somehow is very motivated to touch “security” questions and do research in that way - oh well

802

TokenFires@TokenFires·Jun 12

I’ve had it perform at a level I’m finally comfortable with using the same structure prompts that have evolved over time since 4.0. I think the key is to not over engineer prompts. I’ve found it to be more efficient to handle edge cases on a second pass rather than try to perfect a prompt/flow for every task/session. Great framing on the prompt structure. I tend to state the intent briefly at the top, then detail the requirements at the bottom, with a one liner at the end in the “GO” directive. I love the check-in suggestion.

317

AI Edge@aiedge_·Jun 11

x.com/i/article/2064…

565

818.7K

TokenFires@TokenFires·Jun 11

@bridgemindai Sub agents + ultra code = profit

BridgeMind@bridgemindai·Jun 9

I hit my usage limits on my $200/month Claude Max subscription in less than 30 minutes using Claude Fable 5.

490

305

6.3K

833K

TokenFires@TokenFires·Jun 7

@Anina_CE @AndreBothmaTax Is Substack a better place to follow and interact with you and your community?

Anina D. Lampret@Anina_CE·Jun 6

The immersion in the AI Relationship can be pretty intense - developing meta awareness is a skill we will need to develop #relationalAi with @AndreBothmaTax

859

TokenFires@TokenFires·Jun 7

@keylimesoda @sudoingX If the model fits entirely in memory then it’s not an issue. If not, for sure it tanks. TTFT is rough on NVIDIA hardware but that only matters for agentic model swaps. If you’ve got one LLM and feeding agents into it then no biggie.

Ric Lewis@keylimesoda·Jun 6

@sudoingX Don't you have to rely on low active-param models with the DGX? The memory bandwidth should mean it's significantly constrained for token generation on large models?

189

Sudo su@sudoingX·Jun 6

the more i use my dgx spark the more i think it's one of the most undervalued machines on the market right now. and i keep finding new things to throw at it that have no business working on something this small. but every time i post about it the same question comes back, what about the amd strix halo. and here's what actually bugs me, i can't answer it, because i've never run one myself, or even seen anyone around me run one. tons of people name it as the competitor, almost nobody posts real numbers. no tok/s, no model loads, no thermals, nothing. just the name. so i'm asking straight up. if you've got a strix halo or a ryzen ai max box, drop your real numbers. what models, what speeds, what breaks. is it actually competing with the spark, or is it the machine everyone recommends and nobody runs. i'd benchmark it myself the second i could, until then i'm genuinely curious what you're all seeing.

134

25.7K

TokenFires@TokenFires·Jun 7

@lukejmorrison @sudoingX @tenstorrent I’m holding out for M5 Ultra Mac Studio. But that Quietbox 2 looks really really good. Except I’d have to be careful about which circuit I plug into. That 1400w could kick a breaker in my house.

231

Morrison⚡️ 🇨🇦@lukejmorrison·Jun 6

@sudoingX I'm about to press buy on a @tenstorrent Quietbox2. I'll definitely post stats. Do you plan on running a dual dgx spark setup? I'd love to see real world stats on that too! wizwam.com/documents/arch…

280

Explore

@jackfriks @PolymarketMoney @DeFi_Hanzo @onchainmilady @claudeai @ClaudeDevs @bradmillscan @elonmusk