AlphaSignal AI

677 posts

@AlphaSignalAI

The latest news from the top 100 companies in AI. Over 300,000 devs read our newsletter.

Signup → Joined February 2010

329 Following 15.1K Followers

AlphaSignal AI@AlphaSignalAI·9h

Paper: arxiv.org/abs/2605.23904 Repo: github.com/microsoft/Skil… Website: microsoft.github.io/SkillOpt/

191

AlphaSignal AI@AlphaSignalAI·9h

x.com/i/article/2058…

1.6K

AlphaSignal AI@AlphaSignalAI·11h

Repo: github.com/Lum1104/Unders…

213

AlphaSignal AI@AlphaSignalAI·11h

x.com/i/article/2058…

1.4K

AlphaSignal AI@AlphaSignalAI·12h

Paper: arxiv.org/abs/2604.08510

173

AlphaSignal AI@AlphaSignalAI·12h

Researchers cracked the hidden order behind how AI learns.
Loss curves tell you a model is improving. They don't say which skills form, or in what order.
A new paper proposes the Implicit Curriculum Hypothesis. Pretraining follows a hidden, predictable order across families.
Researchers built 91 tasks covering string operations, morphology, translation, logic, and math. They tracked 9 open-weight models from 410M to 13B parameters.
The sequence was strikingly consistent across runs:
> Simple copying emerges first
> Then morphology and translation appear
> Basic arithmetic follows after that
> Complex reasoning shows up last
Composite skills almost always emerge after their components. Spearman correlation hit .81 across 45 model pairs.
The structure also lives inside the network. Tasks with similar internal representations follow similar learning curves. They predicted held-out trajectories with R² up to .84, without running evaluations.
So how early could you spot a frontier run drifting?
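
A rough sketch of how a pairwise Spearman figure like the one quoted in this post could be computed, assuming each model is summarized by the training step at which each task "emerges". The model names, task names, and step counts below are made up for illustration and are not from the paper.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation is Pearson correlation applied to ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical emergence steps per task (illustrative values only).
emergence = {
    "model_a": {"copy": 1000, "morphology": 4000, "translation": 5000,
                "arithmetic": 9000, "reasoning": 20000},
    "model_b": {"copy": 2000, "morphology": 6000, "translation": 5500,
                "arithmetic": 12000, "reasoning": 30000},
    "model_c": {"copy": 1500, "morphology": 3000, "translation": 7000,
                "arithmetic": 8000, "reasoning": 25000},
}

tasks = sorted(emergence["model_a"])
pair_rho = {
    (a, b): spearman([emergence[a][t] for t in tasks],
                     [emergence[b][t] for t in tasks])
    for a, b in combinations(sorted(emergence), 2)
}
mean_rho = sum(pair_rho.values()) / len(pair_rho)
```

A value near 1 for a pair means the two models acquire the tasks in nearly the same order, regardless of how many steps each one needs.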

743

AlphaSignal AI@AlphaSignalAI·15h

Paper: arxiv.org/abs/2605.15156

237

AlphaSignal AI@AlphaSignalAI·15h

Researchers just gave LLMs a separate brain for memory.
Language models go stale the moment training ends. Updating them risks breaking what they already know.
A new paper proposes MeMo. It pairs any LLM with a separate trained memory model. The base model stays frozen. Knowledge gets internalized into a small dedicated model instead.
The pipeline runs in three steps:
> Extract facts from documents
> Train memory on those facts
> Query it through sub-questions
When fresh data arrives, new memories merge in without retraining from scratch. This cuts compute by 33%. Retrieval cost stays constant regardless of corpus size.
The frozen LLM treats memory as an external oracle. Across three benchmarks, it beats BM25, dense retrieval, and graph RAG. It plugs into closed proprietary models since everything runs through natural language.
So what happens when memory stops being a context window hack?

1.3K

AlphaSignal AI@AlphaSignalAI·1d

Repo: github.com/algorithmicsup…

405

AlphaSignal AI@AlphaSignalAI·1d

You can now boost any LLM's accuracy 2-10x without training it.
Most teams improve model accuracy by fine-tuning or swapping to a bigger model. Both cost time and money.
OptiLLM takes a different route. It is an open-source proxy that sits between your app and any OpenAI-compatible API. Instead of training, it spends extra compute at inference time to think harder before answering.
The repo bundles 20+ reasoning techniques you can switch on with one parameter. A few of the methods inside:
> Multi-agent cross-verification
> Monte Carlo tree search
> Chain-of-thought with reflection
> Best-of-N sampling
> Z3 theorem prover routing
The numbers are the headline. On AIME 2025, Gemini 2.5 Flash Lite jumps from 43.3% to 73.3% accuracy. Llama 3.3 70B gains 18.6 points on Math-L5. GPT-4o-mini matches GPT-4 on Arena-Hard-Auto.
No retraining. Just route your calls through the proxy.
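
To make "spending extra compute at inference time" concrete, here is a minimal sketch of best-of-N sampling, one of the techniques listed in this post. It is not OptiLLM's code: `best_of_n`, `make_stub_generator`, and `stub_score` are hypothetical names, and the generator and scorer are deterministic stand-ins for a real model call and a real verifier or reward model.

```python
from itertools import cycle
from typing import Callable, Tuple


def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 4) -> Tuple[str, float]:
    """Sample n candidate answers and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [generate(prompt) for _ in range(n)]
    scored = [(score(prompt, candidate), candidate) for candidate in candidates]
    best_score, best_answer = max(scored, key=lambda pair: pair[0])
    return best_answer, best_score


def make_stub_generator() -> Callable[[str], str]:
    """Stand-in for a model call: cycles through fixed candidate answers."""
    answers = cycle(["40", "43", "42", "41"])
    return lambda prompt: next(answers)


def stub_score(prompt: str, answer: str) -> float:
    """Stand-in for a verifier: the closer to 42, the higher the score."""
    return -abs(int(answer) - 42)


answer, quality = best_of_n("What is 6 * 7?", make_stub_generator(), stub_score, n=4)
```

With a larger `n` the scorer sees more candidates, which is the trade this family of methods makes: more inference calls in exchange for a better chance that one candidate is right.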

2.4K

AlphaSignal AI@AlphaSignalAI·1d

Repo: github.com/github/spec-kit

328

AlphaSignal AI@AlphaSignalAI·1d

GitHub just fixed the biggest problem with vibe coding.
Most agents fail the same way. You give a vague prompt and hope they don't break the project.
Spec Kit works differently. It forces the AI to write a structured specification BEFORE touching any code. The agent reads what you want, asks about missing details, lays out the project, then starts building.
You drive it through six slash commands:
1. /constitution sets the rules
2. /specify describes the goal
3. /clarify surfaces open questions
4. /plan picks the stack
5. /tasks lists ordered steps
6. /implement runs the build
Every step writes a Markdown file the next one reads. It works with Claude Code, Cursor, Copilot, Codex, Gemini CLI, and 25 more agents.
The open-source repo crossed 95K stars and 8K forks in days.
What would you build first with it?
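
The "every step writes a Markdown file the next one reads" idea can be illustrated with a toy pipeline. This is only a sketch of the chaining pattern, not how Spec Kit is implemented: the file names, the `run_stage` helper, and the placeholder text are all assumptions, and a real stage would call a coding agent instead of writing a stub.

```python
import tempfile
from pathlib import Path
from typing import List, Optional

# Stage names taken from the six slash commands in the post.
STAGES = ["constitution", "specify", "clarify", "plan", "tasks", "implement"]


def run_stage(name: str, workdir: Path, previous: Optional[Path]) -> Path:
    """Read the previous stage's file (if any) and write this stage's file."""
    context = previous.read_text() if previous else ""
    output = workdir / f"{name}.md"  # hypothetical file naming
    output.write_text(f"{context}# {name}\n(placeholder output for /{name})\n")
    return output


def run_pipeline(workdir: Path) -> List[Path]:
    """Run all stages in order, feeding each file into the next stage."""
    previous: Optional[Path] = None
    outputs: List[Path] = []
    for name in STAGES:
        previous = run_stage(name, workdir, previous)
        outputs.append(previous)
    return outputs


with tempfile.TemporaryDirectory() as tmp:
    files = run_pipeline(Path(tmp))
    file_names = [f.name for f in files]
    final_text = files[-1].read_text()
```

Because each file carries the accumulated context forward, the last stage sees every earlier decision, which is the property the post credits for keeping agents on track.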

880

AlphaSignal AI@AlphaSignalAI·2d

Paper: arxiv.org/html/2605.0142…

768

AlphaSignal AI@AlphaSignalAI·2d

Google just figured out why AI lies with confidence.
Large language models still make confident mistakes on simple factual questions. A new paper from Google Research explains why this keeps happening.
Models cannot reliably tell what they know from what they are guessing. The internal score separating right answers from wrong ones sits around 0.70 to 0.85.
Forcing strict accuracy backfires. Cutting errors from 25% to 5% means staying silent on over half of correct answers.
The team proposes faithful uncertainty. The model's words should match its actual internal confidence. Instead of refusing to answer, it hedges honestly. "I think" becomes a real signal, not filler. This same awareness tells agents when to reach for search tools.
The paper flags open problems worth tackling:
> Static training versus shifting knowledge
> Alignment erasing confidence signals
> Misleading calibration metrics dominating evaluation
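
The trade-off described in this post can be reproduced on a toy example. The sketch below assumes the "internal score" behaves like an AUROC over per-answer confidence values, which is a reading of the post rather than something it states; the confidence values and correctness labels are invented for illustration.

```python
def auroc(scores, correct):
    """Probability that a random correct answer outscores a random wrong one."""
    right = [s for s, c in zip(scores, correct) if c]
    wrong = [s for s, c in zip(scores, correct) if not c]
    wins = sum(1.0 if r > w else 0.5 if r == w else 0.0
               for r in right for w in wrong)
    return wins / (len(right) * len(wrong))


def selective_stats(scores, correct, threshold):
    """Answer only when confidence >= threshold.

    Returns (error rate among answered, share of correct answers kept).
    """
    answered = [c for s, c in zip(scores, correct) if s >= threshold]
    errors = sum(1 for c in answered if not c)
    error_rate = errors / len(answered) if answered else 0.0
    kept = sum(1 for c in answered if c) / sum(1 for c in correct if c)
    return error_rate, kept


# Hypothetical confidences and correctness labels (1 = correct, 0 = wrong).
confidence = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
is_correct = [1, 1, 0, 1, 1, 0, 1, 0]

separation = auroc(confidence, is_correct)
answer_everything = selective_stats(confidence, is_correct, threshold=0.0)
strict = selective_stats(confidence, is_correct, threshold=0.75)
```

In this toy data the separation score is about 0.73. Answering everything gives a 37.5% error rate, while the strict threshold removes all errors but keeps only 40% of the correct answers, the same shape of trade-off the post describes.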

293

15.2K

AlphaSignal AI@AlphaSignalAI·2d

Repo: github.com/ferdous-alam/G…

330

AlphaSignal AI@AlphaSignalAI·2d

MIT just open-sourced a model that could end the $150/hour CAD industry.
Turning a photo of a physical part into an editable 3D model usually takes an engineer weeks inside proprietary software. Every revision means starting from scratch.
GenCAD breaks that bottleneck entirely. Upload a single image of an object and it generates the full parametric program behind it. Not a mesh. Not a point cloud. The actual command sequence an engineer would have written by hand. Fully editable, exportable as STL, and ready for manufacturing.
The system stitches together four pieces:
> Transformer encoder for commands
> Contrastive learner aligning images
> Latent diffusion for generation
> Decoder producing final geometry
It can also retrieve the closest match from thousands of existing programs in seconds.
So what happens when industrial design becomes a free upload instead of a contract?

675

AlphaSignal AI@AlphaSignalAI·2d

@dannytt Yeah! It works well for most people

Danny Thuering@dannytt·2d

@AlphaSignalAI I do have really good experience working with #openspec. 👨🏻‍💻 Much better structured and scoped requirements. Better outcomes. 👏

AlphaSignal AI@AlphaSignalAI·3d

Spec-driven development became the default AI coding architecture.
A 67-source academic review all agreed.
5 repos defining it + 1 saying they're all wrong: spec-kit · BMAD · Open-spec · GSD · superpowers and Pocock's skills.
How to choose? Or should you adapt a feature from each one?

AlphaSignal AI@AlphaSignalAI

x.com/i/article/2057…

166

31.1K

AlphaSignal AI@AlphaSignalAI·2d

@0xMetaLabs That's the point: the winner is not a single workflow

0xMetaLabs@0xMetaLabs·2d

@AlphaSignalAI People keep asking which framework wins. Spec-kit, BMAD, Open-spec, GSD, skills - they may end up behaving less like competitors and more like software primitives. The winning workflow could be a stack, not a single methodology.

AlphaSignal AI@AlphaSignalAI·2d

@n3lson Right! The “what happens when code drifts” question is such a good point to watch

110

Jeremy@n3lson·2d

@AlphaSignalAI The useful spec is not a doc the agent reads once. It is a completion contract: accepted inputs, forbidden shortcuts, checks to run, evidence to produce, and what happens when code drifts from the spec.

136

AlphaSignal AI@AlphaSignalAI·2d

@MarkPommrehn Appreciate it! Glad it helped. Now let's 10x work.

Mark R Pommrehn@MarkPommrehn·2d

@AlphaSignalAI Excellent info and perspectives! Great education! Thank you!

AlphaSignal AI@AlphaSignalAI·4d

x.com/i/article/2057…

299

56.1K
