Quinn Slack
@sqs
11.8K posts
CEO & Member of Technical Staff @ampcode, founded @sourcegraph
San Francisco · Joined February 2007
3.9K Following · 15.8K Followers
Don Montegarza @aiDidThisToo
@ryancarson @thorstenball Hi, I wonder what happened to the @AmpCode Build Crew Live regular podcast on YouTube? It's my usual go-to for keeping up with advances in the coding model. It seems dormant. Any chance of bringing it back?
Quinn Slack @sqs
@jetpackjoe_ @AmpCode Mostly deep lately. At this point we need to rename the modes, e.g. smart → interactive. Smart is mostly for interactive tasks for me now.
Quinn Slack @sqs
Smart mode in Amp now lets you use Opus 4.6 with 300k input tokens of context, up from 168k. Also, tokens above 200k are now charged at the same (not higher) per-token rate. Why not 1M tokens in smart? Quality/cost aren't right. If truly needed, use the (hidden) large mode.
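The flat-rate change described above can be sketched with a toy comparison of tiered vs. flat per-token pricing. The rates below are made up for illustration; they are not Amp's or Anthropic's actual prices.

```python
# Toy comparison of tiered vs. flat per-token pricing for a long prompt.
# The rates used here are hypothetical, purely for illustration.

def prompt_cost(tokens, base_rate, long_rate=None, threshold=200_000):
    """Cost of a prompt; tokens above `threshold` may cost `long_rate` each."""
    if long_rate is None or tokens <= threshold:
        return tokens * base_rate           # flat pricing
    return threshold * base_rate + (tokens - threshold) * long_rate

# A 300k-token prompt at a hypothetical $5/M base rate and $10/M long rate:
flat   = prompt_cost(300_000, 5e-6)         # every token at the base rate
tiered = prompt_cost(300_000, 5e-6, 10e-6)  # tokens past 200k cost double
```

With these made-up rates, the flat scheme charges $1.50 for the prompt where the tiered scheme would have charged $2.00.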
Quinn Slack @sqs
@pcstyle53 It's the fundamental architecture of the model. I do think that providers could further lower their cached-token prices. Supposedly Claude Max plans don't even count cached tokens against your quotas. If true, that indicates there is more room to discount.
Quinn Slack @sqs
@pcstyle53 What do you mean? That is how all agents work today. The model gets the entire history each time. You can truncate tool results as you go, but that has a big impact on quality and caching.
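The resend-the-whole-history pattern described above can be sketched as follows. This is a toy illustration; `call_model`, `run_agent`, and the message shapes are stand-ins, not Amp's actual implementation.

```python
# Minimal agent loop: every model call receives the ENTIRE history so far.

def call_model(messages):
    # Stand-in for a real LLM API call; a real implementation would POST
    # `messages` to a provider and return its reply.
    return {"role": "assistant", "content": f"reply to {len(messages)} messages"}

def run_agent(user_turns):
    history = []
    messages_resent = 0
    for turn in user_turns:
        history.append({"role": "user", "content": turn})
        messages_resent += len(history)   # the whole thread goes out again
        history.append(call_model(history))
    return history, messages_resent
```

For three user turns the model is called three times but is sent 1 + 3 + 5 = 9 messages in total, which is why truncating old tool results (at a quality and caching cost) is the main lever for shrinking what gets resent.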
pcstyle @pcstyle53
@sqs each exchange**
Quinn Slack @sqs
@pcstyle53 Yeah. The big driver is that long threads are exponential in cost though.
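A toy calculation (assuming a constant 1,000 tokens added per turn and ignoring caching) illustrates why long threads get so expensive: because each turn resends everything so far, the total tokens sent grows much faster than the thread itself.

```python
# Each turn appends `turn_tokens` to the thread and resends everything so far.

def cumulative_tokens(turns, turn_tokens=1_000):
    total = 0
    context = 0
    for _ in range(turns):
        context += turn_tokens   # the thread grows...
        total += context         # ...and the whole thing is resent each turn
    return total
```

Under these assumptions, 10 turns send 55,000 tokens in total, but 100 turns send 5,050,000 — roughly 92× the tokens for 10× the turns.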
pcstyle @pcstyle53
@sqs Oooh, large uses Opus too? That's why it burns through credits this fast :)
Quinn Slack @sqs
Yes, I've been seeing the circular thinking from Opus 4.6 more often now, and we're trying to figure out what causes it. It seems much more common among our team than among users, and you're the first to mention it outside our team. Working on it! IME it isn't a function of context length; I don't really know what triggers it yet.
Palani — oss/acc @Palanikannan_M
It depends. But multiple threads even for a single task is not an uncommon thing, and I definitely handoff/steer at 50-65 percent, since after about 120k tokens I often see this behaviour where it starts going in circles. I really am not a fan of the part where in smart mode it starts going in circles ("but wait... but wait... oh I think I got it... oh no wait..."). A good part about Amp is threads, so it can easily refer to stuff without much trouble, so shorter threads didn't bother me. But I would love to know the new recommendation from your internal testing, since all your advice has been bangers in my workflow!
Quinn Slack @sqs
@Palanikannan_M Are you using a single thread for multiple (logical) tasks? Or same task? And you are seeing this in smart mode on a recent-ish (last 45 days) version of Amp, meaning it's on Opus 4.6?
Palani — oss/acc @Palanikannan_M
@sqs Ohhh, I see! I even limited it to less than 168k, keeping things at roughly 65% context usage per thread; after that point I felt the model starts to not remember a lot of stuff from the initial context until I chime in and ask it to.
Quinn Slack @sqs
@Palanikannan_M You should feel good going up to 300k in a thread if the task demands it. And you should feel good doing slightly bigger tasks in a single thread now. No need to limit yourself to 168k anymore.
Palani — oss/acc @Palanikannan_M
@sqs But is roughly half of 168k tokens still recommended for best performance in each thread? What do you folks recommend now from your internal usage?
max @smax253
@AmpCode Saw y'all raised the smart agent context window size. Thanks, friends.
SD @sPredictorX1807
@sqs @p0 Is p0 directly embedded in Amp, or do you use it as an MCP?
Quinn Slack @sqs
Just got a new SteelSeries mouse. Their app wasn't working and was a 250MB Electron behemoth. So Amp wrote a program for me, using the librarian and @p0 web search and some Swift code, to turn off the distracting LED light. No "software" needed.
Quinn Slack @sqs
@pcstyle53 @p0 You'll stay in Insiders, but the 2x isn't extensible. Maybe we will do another bonus in the future.
pcstyle @pcstyle53
@sqs @p0 Count me in! Also, is there a place where I can apply to extend my Insiders duration? :D
Quinn Slack @sqs
Most LLM spend is usage-based. Subscriptions are mostly gross-margin positive, but the labs price-discriminate, so they're not offered to the richest part of the market (enterprises). If anything, subs train people to use wild techniques on their hobby projects; then they bring those to work and their enterprise pays full price for them at scale.
Peter Choi @pitachoi
@sqs This is why usage-based pricing creates such weird dynamics.
Quinn Slack @sqs
An uncomfortable truth about building agents/models: by default, your most lucrative, most-smitten customers will be those using intricate out-of-band techniques that are exorbitantly expensive and probably net negative (but that they love). It's a very weird incentive. You can't and don't want to indulge this.

There's nothing wrong with experimentation, but if you saw what every agent company sees, you'd know this goes way beyond experimentation.

Amp tries really hard to prevent this: limiting long context, showing prices, not recommending swarms or loops prematurely, strongly advising against big MCPs, killing features that have high usage but that aren't worth it anymore, and just generally staying away from any hype train we don't have a good gut feeling about. Pi and OpenCode are also particularly good and outspoken here.

But if you have growth targets to hit, investors to pitch, and salespeople to keep happy, or if you didn't start this way from day 1, I can see it being tricky. At Amp, we're profitable, don't have salespeople, and have no sales/growth targets to hit, so we have it relatively easy. I often wonder what this tension is like inside other companies building agents.

(And for the record: if you've shown me your Amp workflow and I haven't told you this directly, this post is not about you. :)

Quoting Thorsten Ball @thorstenball:
Lately, whenever I open this app and see the latest tricks, and hacks, and notes, and workflows, and spec here and skill there, I can't help but think: all of this will be washed away by the models. Every Markdown file that's precious to you right now will be gone.
aislop @lazysloth
@sqs @p0 I would have just returned it. I hate peripherals with mandatory software that the maker put little resources into. Hope there isn't a firmware update too.
pcstyle @pcstyle53
@sqs @p0 Wait, internal mode? ;O Quinn, don't you need beta testers??
stephen 🌿 @stevelizcano
@thorstenball One thing I wish we had in Amp was the ability to turn MCPs off and on. Right now it's just install or remove.
Thorsten Ball @thorstenball
Back when we had the 1M context window in Amp, we noticed our numbers go up MASSIVELY. There was a small number of users who did *everything* (think: multiple projects!) in the same thread over multiple days (cache is gone). Crazy costs. We reached out and told them about it.

Quoting Quinn Slack @sqs (the "uncomfortable truth" post above).
Quinn Slack @sqs
@JebsSteve0x1 @S1lv3rd3m0n Enterprise workspaces pay us a markup, plus we're doing some other interesting stuff with some pretty bold companies (which also helps us learn). Ads are lucrative, but not as lucrative as the other stuff.
Stove Jebs @JebsSteve0x1
@sqs @S1lv3rd3m0n If you don't mind me asking: ads don't seem profitable, nor does selling tokens at cost. How are you guys profitable? I'm always worried about this since I like Amp very much.