macintog
@macintogdev

115 posts

Making the stuff I wished existed.

USA · Joined March 2026
124 Following · 979 Followers

Pinned Tweet
macintog
macintog@macintogdev·
Hello. I started using Apple computers in 1982. I worked on Mac OS X for 15 years. I've used Linux for 29 years. I game on Windows (10 natch). LLMs are the most interesting technology since the internet, and will be as impactful. I'll be posting my thoughts and work here.
24
8
162
17.8K
macintog retweeted
Alex Prompter
Alex Prompter@alex_prompter·
Both OpenAI and Anthropic just released official prompting guides. Both say the same thing: your old prompts don't work anymore. But for opposite reasons.

Claude Opus 4.7 stopped guessing what you meant. It does exactly what you type. Nothing more, nothing less. Vague instructions that worked on 4.6? They now produce narrow, literal, sometimes worse results. Not because the model got dumber. Because it stopped compensating for sloppy thinking.

GPT-5.5 went the other direction. OpenAI's guide literally says: "Don't carry over instructions from older prompt stacks." Legacy prompts over-specify the process because older models needed hand-holding. GPT-5.5 doesn't. That extra detail now creates noise and produces mechanical output.

Claude got more literal. GPT got more autonomous. Both now punish the same thing: prompts written without clear thinking behind them.

One developer on Reddit captured it perfectly after analyzing hundreds of community posts. The complaints tracked almost perfectly with prompt specificity. Precise prompts got better results on 4.7. Vague prompts got worse. The model didn't regress. The prompts did.

OpenAI's new framework is "outcome-first prompting." Describe what good looks like. Define success criteria. Set constraints. Then get out of the way. The model picks the path. Anthropic's framework is the inverse: be surgically specific about what you want, because the model won't fill in your blanks anymore.

Two different architectures. Two different philosophies. One identical conclusion: the person writing the prompt is now the bottleneck, not the model.

Boris Cherny, the engineer who built Claude Code, posted on launch day that even he needed a few days to adjust. That post got 936 likes. Meanwhile, Anthropic increased rate limits for all subscribers because the new tokenizer uses up to 35% more tokens on the same input. The model is more expensive to run lazily. Cheaper to run precisely.

The models are converging in capability. The gap between good and bad output is no longer about which model you pick. It's about the 2 minutes of structured thinking you do before you type anything. That thinking system is the skill. The prompt is just what it produces.
Alex Prompter tweet media
18
24
148
14.8K
macintog retweeted
Theo - t3.gg
Theo - t3.gg@theo·
Tanner begged NPM to take down a squatted "tanstack" package that was being held ransom against him. 48 days later, it was compromised and shipped malware. There is no excuse. NPM needs to make significant changes.
Tanner Linsley@tannerlinsley

.@SH20RAJ, we could really use the `tanstack` npm package name. We've proactively reached out via email many times in the past with no response but are now getting complaints from unsuspecting users and agents mistaking it for the official TanStack CLI. Please respond 😊

24
23
631
32.6K
macintog
macintog@macintogdev·
@heisei_ramen Imo everyone should start with @lmstudio. It remains useful for vending your favorite models to harnesses like hermes, which is genuinely useful once you're ready.
0
0
2
14
Squiggles
Squiggles@heisei_ramen·
@macintogdev My mom has read too much hackernews recently and is once again considering spinning a local LLM up on the mini she uses as a NAS/workstation rn. Would you recommend Hermes? Like, is it genuinely useful?
2
0
3
20
macintog
macintog@macintogdev·
Prompt engineering is more important than ever. So important, in fact, that it can't be left to people.
macintog tweet media
Adam.GPT@TheRealAdamG

developers.openai.com/api/docs/guide…

**NEW: GPT-5.5 Prompting Guide**

"GPT-5.5 works best when prompts define the outcome and leave room for the model to choose an efficient solution path. Compared with earlier models, you can often use shorter, more outcome-oriented prompts: describe what good looks like, what constraints matter, what evidence is available, and what the final answer should contain.

Avoid carrying over every instruction from an older prompt stack. Legacy prompts often over-specify the process because earlier models needed more help staying on track. With GPT-5.5, that can add noise, narrow the model's search space, or lead to overly mechanical answers.

For more detail on GPT-5.5 behavior changes, start with the Using GPT-5.5 guide. This guide focuses on prompt changes that follow from those behavior changes. The patterns here are starting points. Adapt them to your product surface, tools, evals, and user experience goals."

1
0
12
426
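The quoted guide's "outcome-first" advice is easy to illustrate with a before/after pair. This is a hypothetical example of my own, not text from OpenAI's guide: the legacy prompt scripts the process step by step, while the outcome-first prompt describes what a good answer looks like and what constraints apply, leaving the path to the model.

```python
# Hypothetical contrast between a legacy process-scripted prompt and an
# outcome-first prompt, per the quoted guide's advice. Wording is mine.
LEGACY_PROMPT = (
    "Step 1: read the file. Step 2: list every function. "
    "Step 3: for each function, check for unused variables. "
    "Step 4: print a report in exactly this format..."
)

OUTCOME_FIRST_PROMPT = (
    "Find unused variables in this file. A good answer names each "
    "variable, its function, and its line number, and proposes the "
    "smallest safe removal. Constraint: do not change public APIs."
)
```

The second prompt states success criteria and one constraint, then gets out of the way; the first micromanages a process the model no longer needs spelled out.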
macintog
macintog@macintogdev·
@alexellisuk I strategically cuss at codex when necessary, so I can go back later, ask it to review memory for the worst pain points, and come up with solutions to prevent those scenarios. They're also doing sentiment analysis and it can be a powerful steering signal.
0
0
12
175
Alex Ellis
Alex Ellis@alexellisuk·
Everyone: Claude got dumb
Me: Nah, I've never had that.

This week I've lost count of the "What the F are you doing?! I did not ask you to edit any code"

You think you'll never swear at an LLM like a degenerate. Just wait until you see it wrecking a codebase on auto mode.
17
2
41
4.9K
macintog
macintog@macintogdev·
@b3kdcorbin It's certainly worth playing with. They've done a lot of thoughtful work that comes together well.
0
0
2
18
macintog
macintog@macintogdev·
If the frontier LLM vendors want to leverage their capex a bit better, how about offering a -lazy or -opportunistic thinking level that could be used by harnesses like hermes or openclaw to run pending jobs when infrastructure is underutilized. Plenty of work can be done whenever
1
1
14
277
macintog
macintog@macintogdev·
Do not give @mercury any of your information. After harvesting my most sensitive personal information, they instantly said "oops you can't use us," refuse to give even the vaguest reason why, and clearly don't care at all how badly they are doing everything.
macintog tweet media
3
13
162
4.9K
Mercury
Mercury@mercury·
@macintogdev Hi there, thank you for reaching out! We understand that this was a challenging experience. Please send us a DM with more details, as well as the email associated with your Mercury application, and we’d be happy to review further.
1
0
1
75
macintog retweeted
Chris
Chris@chatgpt21·
I literally just watched GPT-5.5 via codex beat an Amazon customer service associate in real time. 💀

I asked it to get me a refund, and I watched it navigate the settings, cancel the subscription, then it went a step further into the help page. I thought it was going to request a phone call (which would prompt me to take over). Instead, it opened: "Chat with an associate now."

That's when I sat up on my couch, because I knew it was going to get real.

The agent said: "Your subscription is active." And GPT-5.5 immediately explained that it only shows as active because cancellation leaves access through the billing period, but that I wanted it stopped now and refunded.

And my jaw just hung open, it was the first time I watched sand handle a customer service agent for me in real time. Once the agent confirmed the refund, it just ended the chat. No mercy, no thank you LMAO.

First time I've watched a human customer service agent get outmaneuvered by AI in real time. And it made me $15! Almost paid for itself in 5 minutes.
168
207
4.5K
898.7K
macintog
macintog@macintogdev·
@BowTiedStack n.b. this is for plumbing it to codex direct, but you can point your harness at it to learn and adapt.
1
0
2
25
macintog
macintog@macintogdev·
thanks to QMD integration, my codex setup has perfect recall of all user & assistant messages going back a couple months. zero hits for any of the below gpt-5.5 creature leaks. local memory is /the/ unlock for the next level of LLM use
macintog tweet media
arb8020@arb8020

gpt-5.5 prompt for codex seems to have a duplicated line trying to get it to not talk about creatures?

"Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query. [...] Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query"

gh link: github.com/openai/codex/b…

1
0
14
539
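The "perfect recall" claim above is the easy half to sketch: log every user and assistant message locally, then search the transcript on demand. QMD's actual design isn't public here, so this is just a minimal illustration of the pattern, using SQLite and a plain substring match.

```python
# Minimal sketch of the local-memory pattern macintog describes:
# append every message to a local SQLite log, then search it later.
# This is an illustration of the idea, not QMD's real implementation.
import sqlite3

db = sqlite3.connect(":memory:")  # a real setup would use a file on disk
db.execute("CREATE TABLE messages (ts REAL, role TEXT, body TEXT)")

def remember(ts, role, body):
    # Append one user/assistant message to the transcript.
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", (ts, role, body))

def recall(term):
    # Substring search over the whole transcript, oldest first.
    rows = db.execute(
        "SELECT role, body FROM messages WHERE body LIKE ? ORDER BY ts",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

With every message retained, a query like `recall("goblin")` returning nothing is exactly the "zero hits" check described in the tweet.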
BowTied Fullstack - Link in bio or NGMI
BowTied Fullstack - Link in bio or NGMI@BowTiedStack·
@macintogdev Local memory in Hermes was the first big unlock for me; it improved subsequent conversation results slowly but significantly over time. Curious how involved your QMD integration is, seems like it'd be similarly powerful.
1
0
1
45
macintog
macintog@macintogdev·
@BowTiedStack first university paper I turned in came back with the TA note "this has something like a main point," and my ears are still ringing.
0
0
3
179
macintog
macintog@macintogdev·
tfw you iterate for days, and then the latest pass from gpt pro (which has been architecting all along) calls the current state "salvageable"
1
0
12
248
Lee Penkman
Lee Penkman@LeeLeepenkman·
yea, just querying directly into a structured-outputs prompt that's like high/low/med etc. will work. but even that is overkill... so what i do in openpaths.io (also open source) is use static embedding models: e.g. i have a few pre-computed texts that i know are easy, low-reasoning things like "resolve small merge conflicts" and "commit and push", and i have some strings for questions that i know need extra-high reasoning, like "build a trading bot", "build a 3D editing app", "make an Aurora shader". then, for the posted text, if it maps more closely to those, it will be high reasoning. so i can do all this in one millisecond there, the autothink routing.
1
0
2
18
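The routing Lee describes can be sketched in a few lines: precompute vectors for known easy and hard tasks, then give an incoming prompt the reasoning level of its nearest anchor. A toy bag-of-words cosine similarity stands in for the static embedding model here; the anchor strings are the ones from his tweet, everything else is my illustration.

```python
# Sketch of nearest-anchor reasoning routing, per Lee's description.
# A toy bag-of-words embedding stands in for a real static embedding
# model; swap in any model that maps text to a vector.
from collections import Counter
import math

def embed(text):
    # Toy stand-in for a static embedding model: word-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Pre-computed anchor texts with known reasoning levels (from the tweet).
ANCHORS = {
    "low": ["resolve small merge conflicts", "commit and push"],
    "high": ["build a trading bot", "build a 3D editing app",
             "make an Aurora shader"],
}
ANCHOR_VECS = [(level, embed(t))
               for level, texts in ANCHORS.items() for t in texts]

def route(prompt):
    # Return the reasoning level of the most similar anchor.
    return max(ANCHOR_VECS, key=lambda lv: cosine(lv[1], embed(prompt)))[0]
```

Because the anchor vectors are computed once up front, routing a new prompt is just a handful of dot products, which is where the "one millisecond" figure comes from.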
Lee Penkman
Lee Penkman@LeeLeepenkman·
gpt 5.5 can easily decide the appropriate level of thinking for a task if you just ask it for the level of thinking lol. can save u lots of tokens... i find this kind of introspective ability kind of fascinating. It has like a very good knowledge of its own limitations. Kind of hard to train this kind of thing in.
1
0
3
152
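The other approach in this thread, asking the model itself to rate the effort a task needs, only requires a classification prompt and a tolerant parser for the one-word reply. The prompt wording and parser below are my own sketch of that idea, not any vendor's API.

```python
# Sketch of "ask the model for its own thinking level": send the task
# wrapped in ROUTER_PROMPT, then parse the reply down to low/medium/high.
# Prompt wording and defaults are assumptions for illustration.
ROUTER_PROMPT = (
    "Before solving anything, answer with exactly one word (low, medium, "
    "or high) for the level of reasoning effort this task needs:\n\n{task}"
)

def parse_effort(reply, default="medium"):
    # Take the first recognized effort word in the model's reply,
    # tolerating punctuation and surrounding chatter.
    for word in reply.lower().split():
        token = word.strip(".,:;!\"'")
        if token in ("low", "medium", "high"):
            return token
    return default
```

The cheap classification call picks the effort level, and only then does the task run at that level, which is where the token savings come from.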