Incorretos
80.1K posts

@IncorretosBlog
Sharing things I found funny, including translated content (HN and politically incorrect humor, memes, cartoons). I didn't force you to follow; follow at your own risk.

Even with ropes, these jumps looked pretty dangerous, the kind of thing that should have been banned a long time ago. Even more so after what happened: according to the news, two people tried to flee and hide.



Man, this shit only happens in Brazil


Alibaba Qwen3.7 slowly fading into irrelevance at the frontier due to proprietary stance. In its place we have Minimax M3 and... *checks notes* Rio 3.5 397b, made by the municipal IT company of Rio de Janeiro's city government. huggingface.co/prefeitura-rio…



ChatGPT in 2050 looking for the guy who asked it to rate a photo of his dick:





Exciting news: Claude Fable 5 ranks #1 on the new Agent Arena leaderboard! Fable 5 leads by the widest margin ever over Opus-4.8 and GPT-5.5 on two key signals: confirmed task success rate and praise vs. complaint, despite weaker steerability. If Fable can do something, it will do it very well. If it can't or doesn't want to do something, it may be hard to steer the model towards the goal. In Agent Arena, we measure models on millions of real-world, long-horizon agentic tasks. Models get web search, filesystem, and terminal tools to complete complex workflows: writing code, creating slide decks, researching the web, building apps, and analyzing documents. We use the causal tracing methodology to measure a model's net improvement, which indicates how much it improves outcomes relative to the average model. Huge congrats to @AnthropicAI for the incredible milestone! Below we break down how Claude Fable 5 (based on Mythos) scored across 5 signals, drawn from tasks submitted by a global community of users.