erik@try.works banner
erik@try.works

@trydotworks

Building @Pocketmodelnet

Joined March 2026
117 Following · 3 Followers
erik@try.works
erik@try.works@trydotworks·
@blocmates @grok how much of the work in microfish is original and how much is just built on existing capabilities of OASIS
English
1
0
0
3
Chetaslua
Chetaslua@chetaslua·
Holy shit, this is an insane result 🤯 I used Opus 4.6 as reviewer/planner and 4 MiniMax M2.7 worker agents, and this is the result: voxel art of the Eiffel Tower <loop set to 5, meaning Opus will tell @MiniMax_AI to make it better 5 times>
English
19
36
546
62.1K
erik@try.works
erik@try.works@trydotworks·
@xiongchun007 I once tried to subscribe to the Qwen 3 Coder Plus API on Alicloud but was so confused and left. Got a Kimi subscription instead.
English
0
0
1
7
程序员老熊
程序员老熊@xiongchun007·
Successfully subscribed to DeepSeek. Let me explain why I planned to subscribe to Qwen but ended up ordering DeepSeek. The reason is simple: Alibaba's damn Bailian console is a jumbled mess of models, a jumbled mess of billing schemes, a jumbled mess of layouts, a jumbled mess of menu settings, a jumbled mess of pages, a jumbled mess of images; in short, everything about it is a mess. So I opened DeepSeek and took a look. Damn, this is exactly what I wanted: simple, clean, clear at a glance. That's the one. Sold!
程序员老熊 tweet media
Chinese
152
10
364
92.3K
erik@try.works
erik@try.works@trydotworks·
@JJEnglert @tenex_labs @AnthropicAI Looks like AI theatre. You should have an implementation plan when using sub-agents, so both images are the same. The main agent needs to integrate the work in the end anyway, so the difference is minor.
English
0
0
1
3
JJ Englert
JJ Englert@JJEnglert·
Our engineers at @tenex_labs are slamming this new Claude Code feature. Here's how it works:

@AnthropicAI just shipped Agent Teams — and it's a big deal even if you're not writing code yourself. Here's the simple version:

Instead of one AI working on your problem alone, you can now spin up a whole team of AI agents that work together. One acts as the lead. The others are teammates. They each focus on a different piece of the work, talk to each other directly, and coordinate through a shared task list.

Think of it like hiring a project manager who breaks the work into pieces, assigns it to specialists, and makes sure nothing falls through the cracks. Except all of those people are Claude, and they spin up in seconds.

Why this matters even if you're not a developer:

If you've ever used Claude Code to research a problem, write a report, or automate a workflow, you've been working with one brain at a time. That brain has a limit on how much it can hold in its head before things start slipping.

Agent Teams removes that bottleneck. Each teammate gets its own full memory. One can be deep in your financials while another is reviewing your competitor landscape while a third is drafting recommendations. They don't confuse each other's work because they literally can't see each other's context.

And the best part: they talk to each other. The lead doesn't have to relay everything. Teammates share findings, challenge each other's conclusions, and build on each other's work directly.

"Wait, didn't Claude Code already have subagents?"

Yes. And this is where most people get confused. Here's the difference:

Subagents are like sending an intern to go research something and come back with an answer. They do the work, hand you a summary, and they're done. They never talk to each other. You manage everything. Simple, cheap, good for focused tasks where you just need a result.

Agent Teams are like assembling a working group. The teammates coordinate with each other, not just with you. They claim tasks from a shared list, message each other when they find something relevant, and the lead synthesizes everything at the end. More expensive, but the output is fundamentally different.

When to use which:
- Need a quick answer or a focused task done? Subagent. Fast, cheap, gets the job done.
- Need multiple people looking at different angles of the same problem? Agent Team. The coordination is the point.
- One person can do it without talking to anyone else? Subagent.
- The work benefits from debate, cross-checking, or parallel exploration? Agent Team.

What our team is using it for right now:
1. Parallel code reviews: 3 teammates reviewing the same PR simultaneously. One on security, one on performance, one on test coverage. A single reviewer gravitates toward one issue type. Three specialists catch everything.
2. Competing hypotheses: 5 agents investigating the same bug, each with a different theory, actively trying to disprove each other. The theory that survives is almost always the root cause.
3. Cross-layer features: frontend, backend, and tests each owned by a different teammate. No one steps on anyone else's work.

Quick start if you want to try it:
Requires Claude Code v2.1.32 or later. Add one environment variable to your settings.json: "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
Tell Claude what kind of team you want in plain English. "Create a team with 3 teammates to review this project from different angles."
Claude proposes the team, you confirm, and it handles the rest: spawning teammates, assigning tasks, coordinating work.
Start with 3 teammates. Keep tasks independent. Don't let two teammates touch the same files.
Still experimental. But this is the first multi-agent architecture I've seen actually hold up on real dev work. Link to docs in comments below.
JJ Englert tweet media
English
14
7
122
11.2K
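The quick-start flag quoted in the thread above would sit in Claude Code's settings.json. A minimal sketch, assuming the `env` block Claude Code settings support; the flag name is taken verbatim from the tweet and is marked experimental there:

```json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}
```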
Ian Mabie
Ian Mabie@callmemabie·
@_catwu @RobertJBye How does the team think about balancing product velocity with the ability for consumers to figure out how to integrate what’s new into their daily lives? Feels like that becomes the limiting constraint if the team’s taste is on point and building isn’t the blocker?
English
2
0
1
4.7K
cat
cat@_catwu·
The PM playbook was built on an assumption that the technology underneath your product is roughly stable. With the current pace of model progress, this is no longer true. Here's how we've evolved the PM role:
English
52
147
1.7K
251.2K
erik@try.works
erik@try.works@trydotworks·
@cursor_ai could have played this as "we know how to turn open models into SOTA in 2 months". Instead it now looks like @Kimi_Moonshot is the winner: "our models are performant enough that 2 months of post-training achieves SOTA". All due to the fact that Cursor tried to obscure the model.
English
0
0
0
1
erik@try.works
erik@try.works@trydotworks·
Wait until people realize that model performance is constrained by human capacity to post-train and fine-tune per use case.
English
0
0
0
4
erik@try.works
erik@try.works@trydotworks·
@ivanburazin @steipete What would the use case for openclaw in Daytona look like? I did some quick calculations and it seems keeping it running 24/7 would cost $200 per month.
English
0
0
0
13
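The $200/month figure in the tweet above can be sanity-checked with back-of-envelope arithmetic; the hourly rate below is a hypothetical placeholder, not Daytona's actual price list:

```python
# Back-of-envelope cost of keeping a sandbox running 24/7.
# hourly_rate is an assumed placeholder, not a real Daytona price.
hourly_rate = 0.27            # assumed $/hour for a small always-on sandbox
hours_per_month = 24 * 30     # ~720 hours in a month
monthly_cost = hourly_rate * hours_per_month
print(f"${monthly_cost:.0f}/month")  # roughly $194/month at this rate
```

At an assumed rate near $0.27/hour, an always-on instance lands in the ~$200/month range the tweet describes.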
Sanjay
Sanjay@sanjaycodee·
@kimmonismus the disillusionment narrative feels overplayed. having been there, most of the time these "failures" are just the brutal reality of integrating a moonshot team into a shipping product org.
English
1
0
0
159
Chubby♨️
Chubby♨️@kimmonismus·
Mustafa Suleyman and his team were hired by Microsoft for nearly $700 million to further develop Copilot for the future of AI. After two years, disillusionment set in, and Satya Nadella became increasingly dissatisfied. Alongside Meta, Microsoft remains arguably the biggest laggard among companies, despite its multi-billion dollar investments.
Chubby♨️ tweet media
Pedro Domingos@pmddomingos

The inevitable has happened: Copilot no longer reports to Mustafa Suleyman. theinformation.com/briefings/micr…

English
50
31
644
103.9K
erik@try.works
erik@try.works@trydotworks·
@theo You absolutely should present that as supporting evidence and see what happens. Must.
English
0
0
0
2
Theo - t3.gg
Theo - t3.gg@theo·
According to Opus 4.6, T3 Code is compliant with the Anthropic TOS. This should hold up in court right?
Theo - t3.gg tweet media
English
88
13
1.4K
96.9K
erik@try.works
erik@try.works@trydotworks·
@elithrar Same. Kimi is my backup when Codex is out of quota. It handles small and medium tasks fine. For large ones I have to run double audits to fix all the implementation gaps. K3 should be a big step up, hopefully about GPT 5.2-ish.
English
0
0
0
2
Gail Weiner
Gail Weiner@gailcweiner·
@ns123abc Hasn’t been a good run for her at OpenAI has it ? 😏
English
2
0
11
2K
Mati 💻
Mati 💻@buildwithmati·
Been using Composer 2 for the whole day and I'm surprised that I didn't miss Opus 4.6 at all. Incredibly cheaper and feels way faster too. Will keep testing it for some days across different projects/stacks, but it looks really promising. Do I have a new favorite model? 🤔 Have you tried it out?
English
4
1
23
2K
erik@try.works
erik@try.works@trydotworks·
@trishaepan There's no deal. They're just spinning this in a positive way in the short term. The next model will have different licensing, and I promise you the BD team is spinning up a new project with licensing, post-training support, etc.
English
0
0
0
11
trisha pan
trisha pan@trishaepan·
No wonder the 3 Moonshot employees (Moonshot made Kimi) deleted their posts after accusing Cursor of using Kimi without attribution. Now the question is: did the deal/authorization happen before or after the leak??
Kimi.ai@Kimi_Moonshot

Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support. Note: Cursor accesses Kimi-k2.5 via @FireworksAI_HQ's hosted RL and inference platform as part of an authorized commercial partnership.

English
5
0
10
2.6K
Haitham Bou Ammar
Haitham Bou Ammar@hbouammar·
Why is your 405B model losing to an 8B model? 📉 "Context Rot" is a logic problem, not a scale problem. Based on the amazing work of RLMs, we built λ-RLM: replacing messy AI-generated code with a typed λ-calculus runtime.
The results:
✅ +21.9 accuracy gain
✅ 4.1x faster latency
✅ 8B models beating 405B
We used the Y-combinator to "tie the knot" of recursion, giving LLMs formal guarantees on termination and cost. Stop scaling. Start structuring.
Paper Link: github.com/lambda-calculu…
Code Link: github.com/lambda-calculu…
Enjoy!!! #AI #MachineLearning
Haitham Bou Ammar tweet media
English
5
23
162
14.1K
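The "tie the knot" phrase in the post above refers to the classic fixed-point combinator from λ-calculus. A minimal illustrative sketch in Python, using the eager-evaluation Z-combinator variant; this is just the textbook idea, not the paper's actual typed runtime:

```python
# Z combinator: the eager-evaluation variant of the Y combinator.
# It produces a fixed point of f, letting an anonymous function
# recurse without ever naming itself ("tying the knot" of recursion).
def Z(f):
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial with no self-reference inside the lambda itself:
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))  # 120
```

The inner `lambda v: x(x)(v)` eta-expansion is what keeps Python's strict evaluation from looping forever, which is why the Z variant is used instead of the pure Y combinator.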
Lee Robinson
Lee Robinson@leerob·
I'm a big believer in open source, especially as AI improves. It was a miss to not mention the Kimi base in our blog from the start. We'll fix that for the next model 🙏 Their team clarified our usage was licensed in the tweet below. x.com/Kimi_Moonshot/…
Kimi.ai@Kimi_Moonshot

Congrats to the @cursor_ai team on the launch of Composer 2! We are proud to see Kimi-k2.5 provide the foundation. Seeing our model integrated effectively through Cursor's continued pretraining & high-compute RL training is the open model ecosystem we love to support. Note: Cursor accesses Kimi-k2.5 via @FireworksAI_HQ's hosted RL and inference platform as part of an authorized commercial partnership.

English
195
106
2.1K
328.6K
Reden
Reden@Reden799·
@Presidentlin
>china steals everything mercilessly
>china distills claude for their own training
>cursor takes and uses an open source model (unenforceable license)
>NOOOOO NOOOOO YOU CAN'T DO WHAT WE WERE DOING TO YOU! 🤡
(assuming it's even a real tweet)
English
2
0
2
1.3K
Lincoln 🇿🇦
Lincoln 🇿🇦@Presidentlin·
Chinese open source is cancelled. Sorry everyone. Cursor didn't want to pay or give creds. They stole the IP. Anyway, Q1 and Q2 models are still coming down the pipe. Q3 and Q4 management are deciding on the best course of action. My sources in China tell me it doesn't look good. We have to go back to the MS Phi models.
English
26
20
336
65.7K
Tanner Linsley
Tanner Linsley@tannerlinsley·
Ghostty was fun, but time for something else. I still love opencode, too but with CC plans dead on it… I’m feeling lost. Full GUI? T3 Code? Opencode GUI? Warp? Back to cursor? Try CC again? Raw Codex? My 🧠 hurts and I just need to keep shipping.
English
387
7
1.1K
242.4K