anonymous086505

45 posts

@anonymous086505

Joined May 2020

322 Following 47 Followers

@valtism @theo That is the $1,000 question, isn't it? Just so you know, I've given my transcript to Theo, so if he considers it worthy, he'll talk about it soon.

Dan Wood 🫳🏼@valtism·2d

@anonymous086505 @theo Which clanker?

anonymous086505@anonymous086505·2d

My clanker 🤖 solved @theo's puzzle in 5 min. 😱 You guys didn't stand a chance.

anonymous086505@anonymous086505·2d

@theo ;)

Theo - t3.gg@theo·2d

It is solved. You fuckers are fast. Congrats to @anonymous086505

Theo - t3.gg@theo·2d

For no reason in particular, I made my first crypto challenge. I will pay $1,000 to whoever solves it first. Winner is whoever gets the answer into my DMs first.

anonymous086505@anonymous086505·2d

@theo Solved

anonymous086505@anonymous086505·2d

@outsource_ @OpenAI You get an error: ■ {"type":"error","status":400,"error":{"type":"invalid_request_error","message":"The 'gpt-5.5' model is not supported when using Codex with a ChatGPT account."}}

Eric ⚡️ Building...@outsource_·2d

I knew @OpenAI had more up their sleeve. Official announcement in a few hours. Run this in codex cli 👇🏻 codex --model gpt-5.5 Get access to GPT 5.5 NOW!!!!!

anonymous086505@anonymous086505·2d

@TheAmolAvasare You might have the worst PR department ever.

Amol Avasare@TheAmolAvasare·2d

Getting lots of questions on why the landing page / docs were updated if only 2% of new signups were affected. This was understandably confusing for the 98% of folks not part of the experiment, and we've reverted both the landing page and docs changes.

Amol Avasare@TheAmolAvasare

For clarity, we're running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected.

anonymous086505@anonymous086505·3d

@ArtificialAnlys @Kimi_Moonshot @openrouter please update your moonshotai/kimi-k2.6 page with AA info

Artificial Analysis@ArtificialAnlys·3d

Moonshot’s Kimi K2.6 is the new leading open weights model. Kimi K2.6 lands at #4 on the Artificial Analysis Intelligence Index (54), behind only Anthropic, Google, and OpenAI (all 57).

Key takeaways:

➤ Increase in performance on agentic tasks: @Kimi_Moonshot's Kimi K2.6 achieves an Elo of 1520 on our GDPval-AA evaluation, which is a marked improvement over Kimi K2.5’s Elo of 1309. GDPval-AA is our leading metric for general agentic performance, measuring the performance on knowledge work tasks such as preparing presentations and analysis. Models are given code execution and web browsing tools in an agentic loop via our open source reference agentic harness called Stirrup. This continues Kimi K2.6’s strength in tool use, maintaining a 96% score on τ²-Bench Telecom, placing it among other frontier models in this category.

➤ Low hallucination rate: Kimi K2.6 scores 6 on the AA-Omniscience Index, our knowledge evaluation measuring both accuracy and hallucination rate. This score is primarily driven by a comparatively low hallucination rate of 39% (reduced from Kimi K2.5’s 65%), indicating a greater capability to abstain rather than fabricate knowledge when the model is uncertain. Kimi K2.6’s low hallucination rate places it similarly to other models such as Claude Opus 4.7 (36%) and MiniMax-M2.7 (34%).

➤ High token usage: Kimi K2.6 demonstrates high token usage, but is in line with other frontier models in the same intelligence tier. To run the full Artificial Analysis Intelligence Index, Kimi K2.6 used ~160M reasoning tokens. This is slightly lower than Claude Sonnet 4.6 (~190M reasoning tokens) but much higher than GPT 5.4 (~110M reasoning tokens).

➤ Open weights: Kimi K2.6 is a Mixture-of-Experts (MoE) model with 1T total parameters and 32B active, same as the previous two generations of models, Kimi K2 Thinking and Kimi K2.5. Kimi K2.6 again pushes the open weights frontier in intelligence.

➤ Third Party Access: Kimi K2.6 is accessible through Moonshot’s First Party API as well as third party API providers Novita, Baseten, Fireworks, and Parasail.

➤ Multimodality: Kimi K2.6 supports Image and Video input and text output natively. The model’s max context length remains 256k.

Further analysis in the threads below.

anonymous086505@anonymous086505·6d

@jullerino I don't see reasoning levels for OpenCode. I want to see a low, medium, high, xhigh toggle. Using T3 Code Nightly 0.0.21-nightly.20260417.58 (9df3c640210f).

Julius@jullerino·6d

@anonymous086505 we show both `variant` and `agent` options

Julius@jullerino·6d

OpenCode and Cursor (early access) support is now in latest Nightly builds. Try it out, send any feedback. Hoping to promote to latest soon!

anonymous086505@anonymous086505·6d

@MatthewBerman @theo @jullerino Get the latest nightly build first, then go to Settings to enable Cursor and OpenCode, and you'll see it.

Matthew Berman@MatthewBerman·6d

@theo @jullerino coming soon

anonymous086505@anonymous086505·Apr 18

@MilksandMatcha @cerebras I need it just to move away from the big boys. We need strong competition.

Sarah Chieng@MilksandMatcha·Apr 17

Giving away 5 Windsurf Max ($200/month) plans. Each person will get 3 months of free Windsurf Max (highest tier). Try out SWE 1.6, Cognition's latest, fastest, and most intelligent model, powered by @cerebras. Winners will be selected from comments in 48 hours; comment below why you want it.

Cognition@cognition

We’re releasing SWE-1.6, our best model in both intelligence & model UX. SWE-1.6 matches our Preview model on SWE-Bench Pro while dramatically improving on various behavioral axes. It’s available today in Windsurf in two modes: free tier (200 tok/s) and fast tier (950 tok/s).

anonymous086505@anonymous086505·Feb 15

@amorriscode Can you guys please update documentation to fix issue 25804

Anthony Morris ツ@amorriscode·Feb 14

SSH support is now available for Claude Code on desktop. Connect to your remote machines and let Claude cook, TMUX optional.

anonymous086505@anonymous086505·Feb 5

@testingcatalog @M1Astra Thanks. Claimed.

TestingCatalog News 🗞@testingcatalog·Feb 5

Claude subscribers can claim $50 worth of credits for TESTING Claude Opus 4.6! Claim it 👀 h/t @M1Astra

TestingCatalog News 🗞@testingcatalog

BREAKING 🚨: CLAUDE OPUS 4.6 IS ROLLING OUT ON THE WEB, APPS AND DESKTOP! TESTING TIME 🔥

anonymous086505@anonymous086505·Jan 29

@ryanvogel The mobile experience for teslanav on iOS is broken, since buttons at the bottom of the screen overlap each other. Tested with Safari and Brave.

anonymous086505@anonymous086505·Jan 27

@steipete @_JohnHammond If the dashboard shouldn’t be exposed publicly, then consider a streaming-privacy mode, where it blurs things on screen to prevent accidental exposure.

Peter Steinberger 🦞@steipete·Jan 26

@_JohnHammond Very fair review, thanks John! I tweaked some of our docs based on your review. github.com/clawdbot/clawd…

John Hammond@_JohnHammond·Jan 26

🦞🤖CLAWDBOT SECURITY??🦞🤖 x.com/i/broadcasts/1…

anonymous086505@anonymous086505·Jan 27

@bcherny Nice feature. But currently missing from official docs.

Boris Cherny@bcherny·Jan 25

Hooks can now run in the background without blocking Claude Code's execution. Just add async: true to your hook config. Great for logging, notifications, or any side-effect that shouldn't slow things down.
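
For readers who want to see what such a hook config might look like, here is a minimal sketch in Python that builds a settings fragment and prints it as JSON. Only the `async: true` flag comes from the post above. The surrounding layout (`hooks`, an event name, `matcher`, and a `command` entry) is an assumption about the settings file, and the helper name `async_hook_settings`, the `Write|Edit` matcher, and the `./log-tool-use.sh` script are hypothetical placeholders.

```python
import json


def async_hook_settings(event: str, matcher: str, command: str) -> dict:
    """Build a settings fragment with one command hook marked async.

    Assumption: hooks are grouped by event name, then by matcher, and the
    `async` flag sits on the individual command entry. Check the official
    docs before relying on this layout.
    """
    return {
        "hooks": {
            event: [
                {
                    "matcher": matcher,
                    "hooks": [
                        # "async": True is the flag described in the post;
                        # the other keys are assumed for illustration.
                        {"type": "command", "command": command, "async": True}
                    ],
                }
            ]
        }
    }


# Hypothetical example: run a logging script after file edits without blocking.
settings = async_hook_settings("PostToolUse", "Write|Edit", "./log-tool-use.sh")
print(json.dumps(settings, indent=2))
```

The printed JSON could then be merged by hand into a settings file, if the assumed layout matches the version in use.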

anonymous086505@anonymous086505·Jan 2

@trq212 Consider supporting additionalContext in PreToolUse hooks. See issues 15345 and 15664

anonymous086505@anonymous086505·Jan 2

@bcherny Consider supporting additionalContext in PreToolUse hooks. See issues 15345 and 15664

anonymous086505@anonymous086505·Dec 20

@serafimcloud Want to try

serafim@serafimcloud·Dec 20

As a visual person, I hate working in the CLI. But after using Claude Code a few times, it’s impossible to go back. The problem isn’t Claude. It’s the interface around it. So we built the best Claude Code client we could imagine. Runs fully in the browser. Parallel branches and projects. Live previews for every change. A calm, clean UI. Free to use with your own Claude keys. Reply if you want early access 👀

anonymous086505@anonymous086505·Dec 10

@kat_kampf @GoogleAIStudio I use it daily. I need it!

kat kampf@kat_kampf·Dec 10

We started internal testing some big updates to the @GoogleAIStudio experience today! Coming to you early next year but reply below if you’d like early access in the coming weeks 👀
