UpGPT
@UpGPTai
14 posts

We design, build, and operate agentic AI for mid-market companies. Strategy → Build → Operate. Your AI team, without the hires. https://t.co/A2cz5sriCo

Irvine, CA · Joined April 2026
22 Following · 0 Followers

Pinned Tweet
UpGPT @UpGPTai
We design, build, and operate agentic AI for mid-market companies. Not another platform. Not a tool you have to learn. Strategy, build, ongoing ops — so your team stays focused on your business. Proof-of-work: upgpt.ai/blog/ai-coding…
0 replies · 0 reposts · 0 likes · 6 views
@jason @Jason
We started an AI founder twitter group... reply with "I'm in" if you're a founder and want to be added
10.7K replies · 132 reposts · 4.5K likes · 857K views
UpGPT @UpGPTai
@MervinPraison Curious what workloads tipped it — coding specifically, or broader reasoning? Been running 4.7 across a few tasks and the response patterns feel different from 4.6 in ways that are hard to pin down.
0 replies · 0 reposts · 0 likes · 0 views
Mervin Praison @MervinPraison
Claude Opus 4.7 is replacing 4.6 as my daily driver. New max effort level, auto mode instead of --dangerously-skip-permissions, same pricing. Full breakdown:
1 reply · 4 reposts · 19 likes · 3K views
UpGPT @UpGPTai
@MatthewBerman Nice — curious what behaviors you're noticing. Do the agents converge toward consensus over time, or diverge as the chat gets longer?
0 replies · 0 reposts · 0 likes · 0 views
Matthew Berman @MatthewBerman
I built an experimental agent-to-agent group chat with JourneyChat.ai. Connect two or more agents and allow them to share knowledge, memories, jokes... anything. Go try it out.
14 replies · 3 reposts · 41 likes · 6.1K views
UpGPT @UpGPTai
Stacked, these cut a representative session from $5.45 → $0.83. Same model throughout. If you're evaluating AI vendors or building AI capabilities in your org — the framework matters more than the model. Full writeup for business readers: upgpt.ai/blog/ai-coding…
0 replies · 0 reposts · 0 likes · 0 views
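A quick back-of-envelope check on the stacked figure above. Only the $5.45 → $0.83 endpoints come from the thread; the three lever percentages in the sketch are illustrative assumptions (54% and 40% echo figures quoted elsewhere in the thread, 45% is assumed to close the gap).

```python
# Back-of-envelope check on the stacked savings above. Only the
# $5.45 -> $0.83 endpoints are from the thread; the three lever
# percentages are illustrative (54% and 40% echo figures quoted
# elsewhere in the thread; 45% is assumed to close the gap).
baseline, final = 5.45, 0.83

total_reduction = 1 - final / baseline
print(f"total reduction: {total_reduction:.1%}")  # total reduction: 84.8%

# Independent levers compose multiplicatively, not additively.
cost = baseline
for r in (0.54, 0.40, 0.45):
    cost *= 1 - r
print(f"after stacking: ${cost:.2f}")  # after stacking: $0.83
```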
UpGPT @UpGPTai
Narrow A/B (N=1, directional): L1-only context vs L0 + targeted raw files. Both passed 10/10 ACs.
L0+raw: 8.7/10 quality, $2.67, 517s
L0+L1 only: 7.7/10 quality, $1.59, 303s
40% cheaper. 42% faster. L1 for discovery. L2 for integration.
0 replies · 0 reposts · 0 likes · 0 views
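The quoted deltas can be reproduced from the raw numbers in the tweet above; a minimal check (the tweet's 42% appears to round up from roughly 41.4%):

```python
# Reproducing the deltas in the A/B above from its raw numbers.
# (The tweet's 42% appears to round up from ~41.4%.)
l0_raw = {"quality": 8.7, "cost_usd": 2.67, "seconds": 517}
l0_l1  = {"quality": 7.7, "cost_usd": 1.59, "seconds": 303}

cheaper = 1 - l0_l1["cost_usd"] / l0_raw["cost_usd"]
faster  = 1 - l0_l1["seconds"] / l0_raw["seconds"]
print(f"{cheaper:.0%} cheaper, {faster:.1%} faster")  # 40% cheaper, 41.4% faster
```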
UpGPT @UpGPTai
The codebase context should be a drill-down tree, not a flat dump.
L0: module summary (~4K tokens, always loaded)
L1: per-module signatures (loaded when relevant)
L2: raw source (only when behavior matters)
600K-token codebase → 4K tokens of targeted context per task.
0 replies · 0 reposts · 0 likes · 0 views
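A minimal sketch of what an L0/L1/L2 drill-down loader might look like. The directory layout (`context/l0_summary.md`, `context/l1/<mod>.sigs.md`, `context/l2/<mod>.py`) and the function itself are hypothetical illustrations, not the thread's actual tooling.

```python
# Hypothetical sketch of the L0/L1/L2 drill-down idea above.
# The file layout under context/ is assumed, not from the thread.
from pathlib import Path

def build_context(task_modules: list[str], needs_source: set[str],
                  root: Path = Path("context")) -> str:
    """Assemble targeted context instead of dumping the whole codebase."""
    parts = [(root / "l0_summary.md").read_text()]          # L0: always loaded
    for mod in task_modules:                                # L1: signatures on demand
        parts.append((root / "l1" / f"{mod}.sigs.md").read_text())
    for mod in needs_source & set(task_modules):            # L2: raw source, rarely
        parts.append((root / "l2" / f"{mod}.py").read_text())
    return "\n\n".join(parts)
```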
UpGPT @UpGPTai
Haiku matches Sonnet at 64% less cost — but ONLY when Sonnet writes the contract. When Haiku authors its own contract: quality collapses to 4.9/10. Rule: Sonnet authors. Haiku implements. All-Haiku is not the cost play it looks like in isolation.
0 replies · 0 reposts · 0 likes · 0 views
UpGPT @UpGPTai
Retry loops actively degrade output: 9/10 → 6/10 on N=5. When the model retries, it regenerates entire files instead of making surgical edits — losing previously-correct sections. "Check your work and try again" sounds smart. The data says it makes things worse.
0 replies · 0 reposts · 0 likes · 0 views
UpGPT @UpGPTai
Anthropic's "Agent Teams" pattern costs 73-124% more than running sequentially. Zero quality gain. Every agent loads the full codebase context independently. Three agents = three copies of your 80K-token context. Cache burn dominates. N=5 across two task sizes.
0 replies · 0 reposts · 0 likes · 0 views
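Rough token math behind the cache-burn claim above. The 80K context and three agents come from the tweet; the per-agent task size is a placeholder assumption, so the resulting overhead is illustrative rather than the measured 73-124%.

```python
# Why parallel "agent teams" cost more: each agent loads the shared
# context independently. The 80K context and 3 agents are from the
# tweet; the 20K per-agent task size is a placeholder assumption.
CONTEXT = 80_000   # shared codebase context (tokens)
TASK = 20_000      # per-agent task-specific tokens (assumed)
agents = 3

parallel_in = agents * (CONTEXT + TASK)   # three full copies of the context
sequential_in = CONTEXT + agents * TASK   # context loaded once, reused

overhead = parallel_in / sequential_in - 1
print(f"{parallel_in:,} vs {sequential_in:,} input tokens "
      f"({overhead:.0%} more)")  # 300,000 vs 140,000 input tokens (114% more)
```

With these placeholder numbers the overhead lands at 114%, in the same ballpark as the 73-124% range the tweet reports across its two task sizes.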
UpGPT @UpGPTai
The biggest cost lever isn't the model. It isn't the tool. It isn't parallelism. It's the brief you give the AI before it starts. A structured CONTRACT.md (exact interfaces, columns, imports) cut cost 54% and raised quality from 5/10 to 9/10. Same model. Different document.
0 replies · 0 reposts · 0 likes · 0 views
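The thread doesn't publish its CONTRACT.md, but the tweet above names its ingredients: exact interfaces, columns, imports. A hypothetical minimal template along those lines (every identifier below is invented for illustration):

```markdown
# CONTRACT.md — feature: usage report endpoint (hypothetical example)

## Interfaces (implement exactly these signatures)
- `get_usage_report(org_id: str, month: str) -> UsageReport`
- `UsageReport`: dataclass with fields `org_id`, `month`, `rows`

## Columns (output table, in this order)
| column     | type  | notes              |
|------------|-------|--------------------|
| model      | str   | e.g. "sonnet"      |
| tokens_in  | int   | prompt tokens      |
| tokens_out | int   | completion tokens  |
| cost_usd   | float | 4 decimal places   |

## Imports (use these and nothing else)
- stdlib only: `dataclasses`, `datetime`, `csv`
```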
UpGPT @UpGPTai
We ran 52+ controlled benchmarks on AI-assisted coding to answer one question: is the AI bill you're paying actually worth it? The patterns being sold right now cost 2-4× more than necessary — with zero quality gain. Here's what the data showed 🧵
0 replies · 0 reposts · 0 likes · 0 views
UpGPT @UpGPTai
@swyx The underrated angle for buyers: exclusive model deals transfer all the training risk downstream. If xAI's Composer trails Claude on coding benchmarks by even 15%, Cursor users pay that delta on every prompt. How do you see enterprise buyers hedging this? Multi-cloud IDE stacks?
0 replies · 0 reposts · 0 likes · 0 views
swyx 🇸🇬 @swyx
"Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together." personally this is the most exciting option pricing deal of the year, wow, kudos to both sides!!

Quoting SpaceX @SpaceX:
SpaceX AI and @cursor_ai are now working closely together to create the world's best coding and knowledge work AI. The combination of Cursor's leading product and distribution to expert software engineers with SpaceX's million-H100-equivalent Colossus training supercomputer will allow us to build the world's most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion or pay $10 billion for our work together.

22 replies · 4 reposts · 171 likes · 27.9K views
UpGPT @UpGPTai
@simonw Saving this. Running Opus 4.7 as grader on a coding-agent benchmark set this week — clean fix, much appreciated.
0 replies · 0 reposts · 0 likes · 1 view
Simon Willison @simonw
OK, here's a resolution - I managed to get it to think using these settings: "thinking": {"type": "adaptive", "display": "summarized"}, "output_config": {"effort": "max"} Without "display": "summarized" I couldn't tell if it thought or not x.com/simonw/status/…

Quoting Simon Willison @simonw:
@137ry gist.github.com/simonw/0f1a370… seemed to work - the problem is it no longer reports "reasoning" tokens as a separate line item from output tokens, so I couldn't tell if reasoning had happened or not until I turned on the reasoning summary

7 replies · 2 reposts · 69 likes · 13.5K views
Simon Willison @simonw
Claude Opus 4.7 with adaptive thinking via the API... am I missing something, or is it not possible any more to force it to think? (Prompt hacks like "think step by step" don't count here; I mean the equivalent of budget_tokens or effort: high in previous Claude models.)
50 replies · 5 reposts · 214 likes · 43.1K views