macintog
@macintogdev

115 posts

Making the stuff I wished existed.

USA · Joined March 2026
124 Following · 979 Followers

Pinned Tweet
macintog
macintog@macintogdev·
Hello. I started using Apple computers in 1982. I worked on Mac OS X for 15 years. I've used Linux for 29 years. I game on Windows (10 natch). LLMs are the most interesting technology since the internet, and will be as impactful. I'll be posting my thoughts and work here.
24
8
162
17.8K
macintog retweeted
Alex Prompter
Alex Prompter@alex_prompter·
Both OpenAI and Anthropic just released official prompting guides. Both say the same thing: your old prompts don't work anymore. But for opposite reasons.

Claude Opus 4.7 stopped guessing what you meant. It does exactly what you type. Nothing more, nothing less. Vague instructions that worked on 4.6? They now produce narrow, literal, sometimes worse results. Not because the model got dumber. Because it stopped compensating for sloppy thinking.

GPT-5.5 went the other direction. OpenAI's guide literally says: "Don't carry over instructions from older prompt stacks." Legacy prompts over-specify the process because older models needed hand-holding. GPT-5.5 doesn't. That extra detail now creates noise and produces mechanical output.

Claude got more literal. GPT got more autonomous. Both now punish the same thing: prompts written without clear thinking behind them.

One developer on Reddit captured it perfectly after analyzing hundreds of community posts. The complaints tracked almost perfectly with prompt specificity. Precise prompts got better results on 4.7. Vague prompts got worse. The model didn't regress. The prompts did.

OpenAI's new framework is "outcome-first prompting." Describe what good looks like. Define success criteria. Set constraints. Then get out of the way. The model picks the path. Anthropic's framework is the inverse: be surgically specific about what you want, because the model won't fill in your blanks anymore.

Two different architectures. Two different philosophies. One identical conclusion: the person writing the prompt is now the bottleneck, not the model.

Boris Cherny, the engineer who built Claude Code, posted on launch day that even he needed a few days to adjust. That post got 936 likes. Meanwhile, Anthropic increased rate limits for all subscribers because the new tokenizer uses up to 35% more tokens on the same input. The model is more expensive to run lazily. Cheaper to run precisely.

The models are converging in capability. The gap between good and bad output is no longer about which model you pick. It's about the 2 minutes of structured thinking you do before you type anything. That thinking system is the skill. The prompt is just what it produces.
Alex Prompter tweet media
18
24
148
14.8K
macintog retweeted
Theo - t3.gg
Theo - t3.gg@theo·
Tanner begged NPM to take down a squatted "tanstack" package that was being held ransom against him. 48 days later, it was compromised and shipped malware. There is no excuse. NPM needs to make significant changes.
Tanner Linsley@tannerlinsley

.@SH20RAJ, we could really use the `tanstack` npm package name. We've proactively reached out via email many times in the past with no response but are now getting complaints from unsuspecting users and agents mistaking it for the official TanStack CLI. Please respond 😊

24
23
631
32.6K
macintog
macintog@macintogdev·
@heisei_ramen Imo everyone should start with @lmstudio. It remains useful for vending your favorite models to harnesses like hermes, which is genuinely useful once you're ready.
0
0
2
14
Squiggles
Squiggles@heisei_ramen·
@macintogdev My mom has read too much hackernews recently and is once again considering spinning a local LLM up on the mini she uses as a NAS/workstation rn. Would you recommend Hermes? Like, is it genuinely useful?
2
0
3
20
macintog
macintog@macintogdev·
Prompt engineering is more important than ever. So important, in fact, that it can't be left to people.
macintog tweet media
Adam.GPT@TheRealAdamG

developers.openai.com/api/docs/guide…

**NEW: GPT-5.5 Prompting Guide**

"GPT-5.5 works best when prompts define the outcome and leave room for the model to choose an efficient solution path. Compared with earlier models, you can often use shorter, more outcome-oriented prompts: describe what good looks like, what constraints matter, what evidence is available, and what the final answer should contain.

Avoid carrying over every instruction from an older prompt stack. Legacy prompts often over-specify the process because earlier models needed more help staying on track. With GPT-5.5, that can add noise, narrow the model's search space, or lead to overly mechanical answers.

For more detail on GPT-5.5 behavior changes, start with the Using GPT-5.5 guide. This guide focuses on prompt changes that follow from those behavior changes. The patterns here are starting points. Adapt them to your product surface, tools, evals, and user experience goals."

1
0
12
426
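The quoted guide's "outcome-first" advice is easy to illustrate with a before/after pair. This is a hypothetical example of my own, not text from OpenAI's guide: the legacy prompt scripts the process step by step, while the outcome-first prompt describes what a good answer looks like and what constraints apply, leaving the path to the model.

```python
# Hypothetical contrast between a legacy process-scripted prompt and an
# outcome-first prompt, per the quoted guide's advice. Wording is mine.
LEGACY_PROMPT = (
    "Step 1: read the file. Step 2: list every function. "
    "Step 3: for each function, check for unused variables. "
    "Step 4: print a report in exactly this format..."
)

OUTCOME_FIRST_PROMPT = (
    "Find unused variables in this file. A good answer names each "
    "variable, its function, and its line number, and proposes the "
    "smallest safe removal. Constraint: do not change public APIs."
)
```

The second prompt states success criteria and one constraint, then gets out of the way; the first micromanages a process the model no longer needs spelled out.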
macintog
macintog@macintogdev·
@alexellisuk I strategically cuss at codex when necessary, so I can go back later, ask it to review memory for the worst pain points, and come up with solutions to prevent those scenarios. They're also doing sentiment analysis and it can be a powerful steering signal.
0
0
12
175
Alex Ellis
Alex Ellis@alexellisuk·
Everyone: Claude got dumb
Me: Nah, I've never had that.

This week I've lost count of the "What the F are you doing?! I did not ask you to edit any code"

You think you'll never swear at an LLM like a degenerate. Just wait until you see it wrecking a codebase on auto mode.
17
2
41
4.9K
macintog
macintog@macintogdev·
@b3kdcorbin It's certainly worth playing with. They've done a lot of thoughtful work that comes together well.
0
0
2
18
macintog
macintog@macintogdev·
If the frontier LLM vendors want to leverage their capex a bit better, how about offering a -lazy or -opportunistic thinking level that could be used by harnesses like hermes or openclaw to run pending jobs when infrastructure is underutilized. Plenty of work can be done whenever
1
1
14
277
macintog
macintog@macintogdev·
Do not give @mercury any of your information. After harvesting my most sensitive personal information, they instantly said "oops you can't use us," refuse to give even the vaguest reason why, and clearly don't care at all how badly they are doing everything.
macintog tweet media
3
13
162
4.9K
Mercury
Mercury@mercury·
@macintogdev Hi there, thank you for reaching out! We understand that this was a challenging experience. Please send us a DM with more details, as well as the email associated with your Mercury application, and we’d be happy to review further.
1
0
1
75
macintog retweeted
Chris
Chris@chatgpt21·
I literally just watched GPT-5.5 via codex beat an Amazon customer service associate in real time. 💀

I asked it to get me a refund, and I watched it navigate the settings, cancel the subscription, then it went a step further into the help page. I thought it was going to request a phone call (which would prompt me to take over). Instead, it opened: "Chat with an associate now."

That's when I sat up on my couch, because I knew it was going to get real.

The agent said: "Your subscription is active." And GPT-5.5 immediately explained that it only shows as active because cancellation leaves access through the billing period, but that I wanted it stopped now and refunded.

And my jaw just hung open, it was the first time I watched sand handle a customer service agent for me in real time. Once the agent confirmed the refund, it just ended the chat. No mercy, no thank you LMAO.

First time I've watched a human customer service agent get outmaneuvered by AI in real time. And it made me $15! Almost paid for itself in 5 minutes.
168
207
4.5K
898.7K
macintog
macintog@macintogdev·
@BowTiedStack n.b. this is for plumbing it to codex direct, but you can point your harness at it to learn and adapt.
1
0
2
25
macintog
macintog@macintogdev·
thanks to QMD integration, my codex setup has perfect recall of all user & assistant messages going back a couple months. zero hits for any of the below gpt-5.5 creature leaks. local memory is /the/ unlock for the next level of LLM use
macintog tweet media
arb8020@arb8020

gpt-5.5 prompt for codex seems to have a duplicated line trying to get it to not talk about creatures?

"Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query. [...] Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query"

gh link: github.com/openai/codex/b…

1
0
14
539
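The "perfect recall" claim above is the easy half to sketch: log every user and assistant message locally, then search the transcript on demand. QMD's actual design isn't public here, so this is just a minimal illustration of the pattern, using SQLite and a plain substring match.

```python
# Minimal sketch of the local-memory pattern macintog describes:
# append every message to a local SQLite log, then search it later.
# This is an illustration of the idea, not QMD's real implementation.
import sqlite3

db = sqlite3.connect(":memory:")  # a real setup would use a file on disk
db.execute("CREATE TABLE messages (ts REAL, role TEXT, body TEXT)")

def remember(ts, role, body):
    # Append one user/assistant message to the transcript.
    db.execute("INSERT INTO messages VALUES (?, ?, ?)", (ts, role, body))

def recall(term):
    # Substring search over the whole transcript, oldest first.
    rows = db.execute(
        "SELECT role, body FROM messages WHERE body LIKE ? ORDER BY ts",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

With every message retained, a query like `recall("goblin")` returning nothing is exactly the "zero hits" check described in the tweet.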
BowTied Fullstack - Link in bio or NGMI
BowTied Fullstack - Link in bio or NGMI@BowTiedStack·
@macintogdev Local memory in Hermes was the first big unlock for me; it improved subsequent conversation results slowly but significantly over time. Curious how involved your QMD integration is, seems like it'd be similarly powerful.
1
0
1
45
macintog
macintog@macintogdev·
@BowTiedStack first university paper I turned in came back with the TA note "this has something like a main point," and my ears are still ringing.
0
0
3
179
macintog
macintog@macintogdev·
tfw you iterate for days, and then the latest pass from gpt pro (which has been architecting all along) calls the current state "salvageable"
1
0
12
248
Lee Penkman
Lee Penkman@LeeLeepenkman·
yea, just querying directly into a structured-outputs prompt that's like high/low/med etc. will work. but even that is overkill... so what i do in openpaths.io (also open source) is use static embedding models: e.g. i have a few pre-computed texts that i know are easy, low-reasoning things like "resolve small merge conflicts" and "commit and push", and i have some strings for questions that i know need extra-high reasoning, like "build a trading bot", "build a 3D editing app", "make an Aurora shader". then, for the posted text, if it maps more closely to those, it will be high reasoning. so i can do all this in one millisecond there, the autothink routing.
1
0
2
18
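The routing Lee describes can be sketched in a few lines: precompute vectors for known easy and hard tasks, then give an incoming prompt the reasoning level of its nearest anchor. A toy bag-of-words cosine similarity stands in for the static embedding model here; the anchor strings are the ones from his tweet, everything else is my illustration.

```python
# Sketch of nearest-anchor reasoning routing, per Lee's description.
# A toy bag-of-words embedding stands in for a real static embedding
# model; swap in any model that maps text to a vector.
from collections import Counter
import math

def embed(text):
    # Toy stand-in for a static embedding model: word-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Pre-computed anchor texts with known reasoning levels (from the tweet).
ANCHORS = {
    "low": ["resolve small merge conflicts", "commit and push"],
    "high": ["build a trading bot", "build a 3D editing app",
             "make an Aurora shader"],
}
ANCHOR_VECS = [(level, embed(t))
               for level, texts in ANCHORS.items() for t in texts]

def route(prompt):
    # Return the reasoning level of the most similar anchor.
    return max(ANCHOR_VECS, key=lambda lv: cosine(lv[1], embed(prompt)))[0]
```

Because the anchor vectors are computed once up front, routing a new prompt is just a handful of dot products, which is where the "one millisecond" figure comes from.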
Lee Penkman
Lee Penkman@LeeLeepenkman·
gpt 5.5 can easily decide the appropriate level of thinking for a task if you just ask it for the level of thinking lol. can save u lots of tokens... i find this kind of introspective ability kind of fascinating. It has like a very good knowledge of its own limitations. Kind of hard to train this kind of thing in.
1
0
3
152
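The other approach in this thread, asking the model itself to rate the effort a task needs, only requires a classification prompt and a tolerant parser for the one-word reply. The prompt wording and parser below are my own sketch of that idea, not any vendor's API.

```python
# Sketch of "ask the model for its own thinking level": send the task
# wrapped in ROUTER_PROMPT, then parse the reply down to low/medium/high.
# Prompt wording and defaults are assumptions for illustration.
ROUTER_PROMPT = (
    "Before solving anything, answer with exactly one word (low, medium, "
    "or high) for the level of reasoning effort this task needs:\n\n{task}"
)

def parse_effort(reply, default="medium"):
    # Take the first recognized effort word in the model's reply,
    # tolerating punctuation and surrounding chatter.
    for word in reply.lower().split():
        token = word.strip(".,:;!\"'")
        if token in ("low", "medium", "high"):
            return token
    return default
```

The cheap classification call picks the effort level, and only then does the task run at that level, which is where the token savings come from.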