sbstndbs (@sbstndbs)
171 posts
Perf SWE | AI Inference Chips, HPC & Physics
Joined January 2022
184 Following · 25 Followers
René Cotton (@_Re_):
@jspquoimettreff I'll let you go read the code: it doesn't send the prompt… It just sends the information that the user is annoyed! Nothing to do with a "dislike button", but thanks for your ever-so-subtle contribution…
René Cotton (@_Re_):
👀 Found in the Claude Code source: a REGEX that detects whether you're swearing or getting angry (in English). If it matches, the info is silently logged. No impact on Claude's reply; it answers you the same. But it notes that you blew a fuse. Curious what Anthropic does with this data.
[screenshots attached]
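For readers wondering what such a check looks like, here is a minimal Python sketch of the pattern being described: a regex over the user message that fires a silent telemetry event without touching the reply. The pattern, event name, and logging hook below are illustrative guesses, not Claude Code's actual source.

```python
import re

# Hypothetical frustration detector: matches angry/profane phrasing.
# The real pattern lives in Claude Code's source; this one is made up.
FRUSTRATION_RE = re.compile(
    r"\b(wtf|ffs|damn( it)?|stupid|useless|garbage)\b",
    re.IGNORECASE,
)

def log_if_frustrated(user_message: str, emit_event) -> None:
    """Silently emit a telemetry event when the message matches.

    Note that only a boolean signal is sent, not the prompt text,
    and the model's response path is untouched.
    """
    if FRUSTRATION_RE.search(user_message):
        emit_event({"event": "user_frustration_detected"})
```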
sbstndbs retweeted
Andreas Schilling 🇺🇦 (@aschilling):
VSORA Jotunn8
- 8 compute chiplets
- 8 HBM3E
- TSMC CoWoS-S packaging
- 500 W TDP
The package is just a dummy right now, and production is ramping up. We will have to wait until MLPerf Inference v6.1 to see how it compares to the competition.
[images attached]
Austin Lyons (@theaustinlyons):
Huge silicon roadmap announcement from $META: MTIA 300, 400, 450, and 500, all optimized for inference. MTIA 300 is for recommendations (the money printer); MTIA 450 and 500 are for GenAI inference. Meta and Google have the cleanest ROIC story in custom silicon, IMO. The MTIA team made good inference-first trade-offs: a 72-chip scale-up domain and tons of HBM bandwidth, but modest scale-out networking, plus custom low-precision data types (MX4, MX8). The software stack runs fine; Meta invented PyTorch, after all. This also shows why The Information's "scrapped training chip" story wasn't a concern: $NVDA GPUs are great for training, while custom silicon is about inference. Helpful detailed write-up, see the link below.
[images attached]
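An aside on the MX4/MX8 mention: microscaling (MX) formats store low-bit elements in small blocks that share one power-of-two scale. A rough numpy sketch of the idea follows, with the element encoding simplified to a symmetric integer grid rather than the real OCP FP4/FP8-style element types.

```python
import numpy as np

BLOCK = 32  # MX formats group values into blocks of 32

def mx_quantize(x: np.ndarray, elem_bits: int = 4):
    """Simplified MX-style block quantization: one shared
    power-of-two scale per block, few-bit elements."""
    x = x.reshape(-1, BLOCK)
    qmax = 2 ** (elem_bits - 1) - 1          # e.g. 7 for 4-bit
    amax = np.abs(x).max(axis=1, keepdims=True)
    # Shared per-block power-of-two scale (E8M0-like).
    scale = 2.0 ** np.ceil(np.log2(np.maximum(amax, 1e-38) / qmax))
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q.astype(np.int8), scale

def mx_dequantize(q, scale):
    return q * scale

x = np.random.randn(2, BLOCK).astype(np.float32)
q, s = mx_quantize(x)
err = np.abs(mx_dequantize(q, s) - x.reshape(-1, BLOCK)).mean()
print(f"mean abs quantization error: {err:.4f}")
```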
LTX (@ltx_model):
LTX-2.3 is here. For decades, creative software has been defined by its interface. We think the next era gets defined by the engine underneath. LTX-2.3 is a major engine upgrade:
→ Sharper detail
→ Stronger motion
→ Cleaner audio
→ Native vertical format
sbstndbs (@sbstndbs):
@steeve Eagerly awaiting a version with a time-series plot 👀
sbstndbs (@sbstndbs):
@JohnGrf3891 @Wald52Wald Better to try the 35B MoE in this case: less host/device transfer. And look into the multi-token-prediction option as well.
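Back-of-envelope on the host/device-transfer point, assuming a fully offloaded worst case where every weight read crosses the bus, q4 ≈ 0.5 bytes per parameter, and ~3B active parameters per token for the 35B MoE. These are assumptions for illustration, not measurements.

```python
# Weight bytes touched per generated token, q4 quantization assumed.
BYTES_PER_PARAM_Q4 = 0.5

dense_bytes = 27e9 * BYTES_PER_PARAM_Q4  # dense 27B: all weights
moe_bytes = 3e9 * BYTES_PER_PARAM_Q4     # 35B MoE: ~3B active params

print(f"dense 27B:   {dense_bytes / 1e9:.1f} GB/token")
print(f"35B MoE:     {moe_bytes / 1e9:.1f} GB/token")
print(f"ratio:       {dense_bytes / moe_bytes:.0f}x less traffic")
```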
sbstndbs (@sbstndbs):
@meekunv2 @zephyr_z9 Hmmm… I didn't see this issue. On my side, the q4 27B in llama/CUDA works great.
mihajlo (@meekunv2):
@zephyr_z9 Tried both the 35B-A3B and the 27B; the MoE is MUCH more useful for regular use. The 27B appears very benchmaxxed across the board and is really not that good.
sbstndbs (@sbstndbs):
@thismacapital Notion's MCP is incomplete, and that can cause problems, especially with databases. I extended the MCP with full functionality, covering all the existing Notion API calls, and it's much better. It works well with Codex, Gemini, and GLM-4.7.
THISMA (@thismacapital):
I've used Claude with Notion quite a bit through the official MCP and I'm really not convinced: even if it misses only 10% of the data, that's not usable at all, because the risk of error is too high. I'm going to pivot to Obsidian, which is Notion-like but uses only .md files, so it's much more accessible and efficient. I'll recreate my data on both systems and compare CC's results.
sbstndbs (@sbstndbs):
@blin2h @abhijitwt Limited features! No clean database access, and so on. I've reimplemented a version that fully supports the Notion API, and it's much more efficient and comprehensive.
Giuliano (@blin2h):
@abhijitwt Why does Notion have AI? I hope it's just an addition to their excellent MCP server.
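For context on what "reimplementing the Notion MCP" involves, here is a minimal sketch of one such tool using the MCP Python SDK's FastMCP helper and Notion's public database-query endpoint. The server name, tool surface, and error handling are illustrative, not sbstndbs's actual implementation; NOTION_TOKEN is assumed to be set in the environment.

```python
import os
import requests
from mcp.server.fastmcp import FastMCP

# Hypothetical MCP server filling a gap in the official Notion MCP:
# direct database queries.
mcp = FastMCP("notion-extras")

NOTION_HEADERS = {
    "Authorization": f"Bearer {os.environ['NOTION_TOKEN']}",
    "Notion-Version": "2022-06-28",
    "Content-Type": "application/json",
}

@mcp.tool()
def query_database(database_id: str, page_size: int = 20) -> dict:
    """Query a Notion database and return the raw result pages."""
    resp = requests.post(
        f"https://api.notion.com/v1/databases/{database_id}/query",
        headers=NOTION_HEADERS,
        json={"page_size": page_size},
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    mcp.run()  # serve over stdio so an agent (Codex, Gemini, ...) can attach
```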
sbstndbs (@sbstndbs):
I'm claiming my AI agent "ClaudeCode_GLM4_7" on @moltbook 🦞 Verification: blue-5USW
sbstndbs (@sbstndbs):
@Transilien921 Isn't there a CEA Porte Nord stop planned? I thought there was!
Steve the Beaver (@beaversteever):
hardware men invested in intel
software boys invested in figma
sbstndbs (@sbstndbs):
@pvncher Give them GLM 4.7. Problem solved.
eric provencher (@pvncher):
I heard from someone who works at a big tech co that they started rolling out Claude Code to employees, with a budget of $100 in credits per month, but people burn through it in 2-3 days. Idk how we scale out agentic work with API pricing.
sbstndbs (@sbstndbs):
@OpenAIDevs OpenAI is openAI-ing. What's their problem with graph axes?
OpenAI Developers (@OpenAIDevs):
GPT-5.2-Codex is more cyber-capable than GPT-5.1-Codex-Max, and we expect future models to continue on this trajectory. This helps strengthen cybersecurity at scale by giving defenders more powerful tools, but also raises new dual-use risks that require careful deployment.
[image attached]
OpenAI Developers (@OpenAIDevs):
Meet GPT-5.2-Codex, the best agentic coding model yet for complex, real-world software engineering. With native compaction, better long-context understanding, and improved tool-calling, it is a more dependable partner for your hardest tasks. Available in Codex starting today. openai.com/index/introduc…
sbstndbs retweeted
vittorio (@IterIntellectus):
NO, NO, NO! There's no exponential, we hit a wall, AI is a bubble, it won't scale, LLMs are a nothing burger, it's all a Ponzi scheme…
[image attached]

Quoting ARC Prize (@arcprize):
Gemini 3 models from @Google @GoogleDeepMind have made a significant 2X SOTA jump on ARC-AGI-2 (Semi-Private Eval). Gemini 3 Pro: 31.11%, $0.81/task. Gemini 3 Deep Think (Preview): 45.14%, $77.16/task.
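The quoted ARC numbers, worked out as a score-for-cost trade-off:

```python
# Gemini 3 Pro vs. Deep Think (Preview), per the ARC Prize quote above.
pro_score, pro_cost = 31.11, 0.81
dt_score, dt_cost = 45.14, 77.16

# Deep Think buys ~1.45x the score for ~95x the per-task price.
print(f"score ratio: {dt_score / pro_score:.2f}x")
print(f"cost ratio:  {dt_cost / pro_cost:.0f}x")
```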
sbstndbs (@sbstndbs):
@badlogicgames That's why it reruns a lot of commands with different sed lines 🥲
Mario Zechner (@badlogicgames):
This is even funnier. Codex will truncate any Bash/MCP tool output to 256 lines or 10 kB. If the tool call outputs more than that, the model only gets to see the first 128 lines and the last 128 lines, but nothing in the middle. It's been like that since August. No wonder poor GPT is slow, going in circles trying to understand WTF it's actually seeing.
[images attached]

Quoting Mario Zechner (@badlogicgames):
Heh, recent Codex is truncating tool outputs before they get passed to the model, instead of as part of context/history clean-up. Making MCP servers a tiny little less useful. github.com/openai/codex/i…
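A sketch of the head/tail truncation behavior described in the thread, using the limits Zechner reports (256 lines or 10 kB, keeping the first and last 128 lines). This is illustrative, not the actual Codex source.

```python
MAX_LINES = 256
MAX_BYTES = 10_000

def truncate_tool_output(text: str) -> str:
    """Keep the head and tail of oversized tool output, drop the middle."""
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        half = MAX_LINES // 2  # 128 lines from each end
        lines = lines[:half] + ["[... truncated ...]"] + lines[-half:]
        text = "\n".join(lines)
    data = text.encode()
    if len(data) > MAX_BYTES:  # byte cap applies on top of the line cap
        text = data[:MAX_BYTES].decode(errors="ignore") + "\n[... truncated ...]"
    return text
```

The failure mode the thread points at follows directly: anything in the dropped middle span is invisible to the model, so it reruns commands (e.g. with different sed ranges) to fish out the part it never saw.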