jihadjo
@jihadjo · g33k · 792 posts
Joined July 2010
1.7K Following · 149 Followers
jihadjo retweeted
Gautier🦁💙
😂 Honestly, it's pretty wild when you think about it. Anthropic lets the source code of its Claude Code tool leak… then asks GitHub to take it down. So far, so logical. But where it gets interesting is that a developer used Codex to rewrite that code entirely… in Python. 🤣 The result:
- New code
- No more direct copying
- So no real copyright problem anymore
Basically, an AI recreated the code of an AI company… to get around the rules that same company put in place. 😭😭😭 Honestly, we're living in an era where everything is being upended.
Gergely Orosz @GergelyOrosz

This is either brilliant or scary: Anthropic accidentally leaked the TS source code of Claude Code (which is closed source). Repos sharing the source are taken down with DMCA. BUT this repo rewrote the code using Python, and so it violates no copyright & cannot be taken down!

12 replies · 40 reposts · 265 likes · 64.6K views
jihadjo retweeted
Robert Youssef @rryssf_
Holy shit... Kimi just found a design flaw that has been inside every LLM since 2015.

Residual connections use fixed uniform weights: every layer is treated identically, no matter what came before. Replace them with learned attention over preceding layers and you get 1.25x more compute for free. Same model. Same data. Different architecture.

Standard residual connections have been the default building block of every transformer since He et al. 2015. The update is simple: each layer adds its output to the previous hidden state with a fixed weight of 1. Unroll this recurrence and you get the same thing every time: a uniform sum of all prior layer outputs, with no mechanism to weight some layers more than others. Early layers and late layers contribute equally regardless of what the model actually needs. The result: hidden state magnitudes grow proportionally with depth, each layer's relative contribution shrinks, and a significant fraction of layers can be pruned with minimal loss. The architecture was built as a gradient highway. Nobody noticed it was also a bottleneck.

The insight is a direct analogy to sequence modeling. RNNs had the same problem over time: compressing all prior information into a single fixed-weight state. The transformer solved it by replacing recurrence with attention, letting each position selectively access all previous positions with learned, input-dependent weights. Kimi applied the exact same logic to depth. Instead of a fixed uniform sum across layers, each layer now computes softmax attention over all preceding layer outputs using a single learned query vector per layer. One d-dimensional vector. That's the entire overhead.
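To make the mechanism concrete, here is a minimal, illustrative PyTorch sketch of that idea. It is not Kimi's code: the module names, the RMSNorm placement, and the local residual add are guesses based only on the description above ("softmax attention over preceding layer outputs, one learned query vector per layer, one extra RMSNorm"). Assumes PyTorch ≥ 2.4 for nn.RMSNorm.

```python
# Sketch of depth-wise attention over layer outputs ("AttnRes") as described
# in the post above. Names and details are assumptions, not Kimi's code.
import torch
import torch.nn as nn

class AttnResLayer(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(d_model) / d_model**0.5)  # the one learned vector
        self.norm = nn.RMSNorm(d_model)                                 # the one extra RMSNorm
        self.body = nn.Linear(d_model, d_model)  # stand-in for the layer's usual attention+MLP

    def forward(self, history: list[torch.Tensor]) -> torch.Tensor:
        # history: outputs of the embedding and all preceding layers, each (batch, seq, d)
        stacked = torch.stack(history)                       # (depth, batch, seq, d)
        scores = torch.einsum('lbsd,d->lbs', self.norm(stacked), self.query)
        weights = torch.softmax(scores, dim=0)               # attention over *depth*, not sequence
        h = torch.einsum('lbs,lbsd->bsd', weights, stacked)  # learned mix replaces the uniform sum
        return h + self.body(h)                              # keeping a local residual: an assumption

# The residual stream becomes a growing list of per-layer outputs:
layers = [AttnResLayer(64) for _ in range(6)]
history = [torch.randn(2, 10, 64)]     # token embeddings
for layer in layers:
    history.append(layer(history))     # each layer attends over everything before it
final_state = history[-1]              # last state feeds the LM head as usual
```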
The practical challenge: at scale, storing all preceding layer outputs for every layer requires O(Ld) memory and creates communication overhead under pipeline parallelism. Kimi's solution is Block AttnRes: partition layers into blocks, compress each block to a single representation, and attend over block-level summaries instead of individual layers (a second sketch at the end of this post shows the bookkeeping). Memory drops from O(Ld) to O(Nd). With 8 blocks, Block AttnRes recovers almost all the gains of the full version. Infrastructure overhead under pipeline parallelism: less than 4%. Inference latency overhead: less than 2%.

The numbers from the 48B model trained on 1.4T tokens:
→ Block AttnRes reaches the same validation loss as a baseline trained with 1.25x more compute
→ GPQA-Diamond: 36.9% → 44.4% (+7.5 points)
→ Math: 53.5% → 57.1% (+3.6 points)
→ HumanEval: 59.1% → 62.2% (+3.1 points)
→ MMLU: 73.5% → 74.6%
→ Improvements consistent across all 15 evaluated benchmarks, zero regressions
→ Scaling laws confirm the gains hold across every model size tested
→ Training overhead under pipeline parallelism: less than 4%
→ Inference latency overhead: less than 2%
→ Only addition: one RMSNorm and one learned vector per layer

The training dynamics analysis explains why it works. Standard residuals force deeper layers to produce increasingly large outputs just to maintain influence over the accumulated hidden state (the PreNorm dilution problem). Block AttnRes resets this accumulation at block boundaries.

Output magnitudes stay bounded across depth. Gradient norms distribute more uniformly across layers. The model stops wasting capacity compensating for a structural inefficiency and uses it for actual learning.

The architecture sweep reveals something deeper. Under a fixed compute budget, the optimal architecture shifts when you add AttnRes. The baseline prefers wider, shallower networks. AttnRes shifts the optimum toward deeper, narrower ones, because with selective depth-wise attention additional layers are no longer diluted by uniform accumulation. The inefficiency that made depth expensive to exploit is gone.

Residual connections have been in every transformer for a decade. Nobody questioned the fixed weights until now. One learned vector per layer changes the math entirely.
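And the Block AttnRes bookkeeping referenced above, as a toy sketch showing why the stored history shrinks from O(Ld) to O(Nd). The mean-pooling used to compress a finished block is an assumption; the post says only "compress each block to a single representation".

```python
# Toy sketch of Block AttnRes bookkeeping: depth-wise attention sees one
# summary per finished block instead of every layer output, so history per
# position is O(N*d) for N blocks rather than O(L*d) for L layers
# (48 layers / 8 blocks = 6x less stored history in the post's configuration).
import torch

def run_block_attnres(embed, layers, block_size):
    summaries = [embed]      # block-level history that depth attention sees
    block_outputs = []       # layer outputs inside the current, unfinished block
    h = embed
    for i, layer in enumerate(layers):
        h = layer(summaries, h)          # layer attends over block summaries only
        block_outputs.append(h)
        if (i + 1) % block_size == 0:
            # Close the block: compress its layer outputs into one representation
            # (mean-pooling is an assumption). Accumulation "resets" at the
            # boundary, which is what keeps hidden-state magnitudes bounded.
            summaries.append(torch.stack(block_outputs).mean(dim=0))
            block_outputs = []
    return h

# Toy usage with stand-in layers, just to exercise the bookkeeping:
toy_layers = [lambda hist, h: h + 0.1 * hist[-1] for _ in range(8)]
out = run_block_attnres(torch.randn(2, 10, 16), toy_layers, block_size=4)
print(out.shape)  # torch.Size([2, 10, 16]); at most 1 + 8/4 = 3 summaries stored
```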
Robert Youssef tweet media
16 replies · 59 reposts · 396 likes · 22.9K views
jihadjo retweeted
Kyle Hessling @KyleHessling1
BREAKING! Jackrong is already cooking up v3 versions of the Opus Finetunes, and the best news? He's calling it Qwopus now! This 9B version 3 was released only hours ago! The Q4 is a measly 5.63 GB in size! This is perfect timing, as I've had many requests for something good to run on 16 GB or less of VRAM! Looking forward to giving it a test, but in the meantime, here's the link! Looks like our momentum is keeping him going on his great work! Hopefully we'll see a 27B v3 with the same improvements soon! huggingface.co/Jackrong/Qwopu…
26 replies · 68 reposts · 919 likes · 38.2K views
jihadjo retweeted
BridgeMind @bridgemindai
Qwen 3.6 Plus Preview just dropped on OpenRouter. 1,000,000 token context window. $0 input. $0 output. It's Free. A million tokens of context for free. This is insane. Claude Opus 4.6 charges $5/$25 per million tokens for 200K context. GPT 5.4 charges for 1M context. Qwen just gave it away. All free. Open source is not slowing down. It's accelerating.
BridgeMind tweet media
83 replies · 137 reposts · 1.5K likes · 133.9K views
jihadjo retweeted
🇫🇷One @Escalet83
#ÉdouardPhilippe's program is to block the #RN, encourage the migrant invasion, put France under European trusteeship, censor social media, and introduce the digital € in order to control OUR money. ⚠️ Are there really people who want that? I don't think so
🇫🇷One tweet media
396 replies · 1.8K reposts · 3.4K likes · 63.4K views
jihadjo retweeted
ollama @ollama
Ollama is now updated to run the fastest on Apple silicon, powered by MLX, Apple's machine learning framework. This change unlocks much faster performance to accelerate demanding work on macOS: - Personal assistants like OpenClaw - Coding agents like Claude Code, OpenCode, or Codex
283 replies · 726 reposts · 5.7K likes · 738.9K views
jihadjo retweeted
Nav Toor @heynavtoor
🚨 397 billion parameters. On a MacBook. No cloud. No GPU cluster. No data center. A laptop. Someone ran one of the largest AI models on Earth on a machine you can buy at the Apple Store.

It's called flash-moe: a pure C and Metal inference engine that runs Qwen3.5-397B on a MacBook Pro with 48GB RAM, at 4.4 tokens per second, with tool calling. No Python. No PyTorch. No frameworks. Just raw C and hand-tuned Metal shaders.

Here's why this should not be possible:
→ The model is 209GB. The laptop has 48GB of RAM.
→ It streams the entire model from the SSD in real time
→ Only loads the 4 experts needed per token, out of 512
→ Uses just 5.5GB of actual memory during inference
→ Production-quality output with full tool calling
→ 58 experiments. Hand-optimized Metal compute kernels.
→ The entire engine is ~7,000 lines of C and ~1,200 lines of Metal shaders

Here's the wildest part: one person built this. A VP of AI at CVS Health. Not Google. Not OpenAI. A healthcare company executive. Side project. Used Claude Code as his coding partner. Built the entire engine in 24 hours.

Running a 397B model on cloud GPUs costs hundreds of dollars per hour. Companies spend millions per year on inference infrastructure for models this size. This runs on a $3,499 laptop. Offline. Private. No API key. No monthly bill. Forever.

Trending on GitHub. 332 points on Hacker News. 100% open source.
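The load-only-what-you-route-to trick reads roughly like the sketch below. This is an illustrative Python stand-in, not flash-moe's C/Metal code: the file layout, router, and expert shapes are invented for the example, and only the memory-mapping idea (an expert is read from disk only when indexed) is the point.

```python
# Illustrative sketch of expert streaming in a mixture-of-experts layer: the
# expert table lives in a memory-mapped file, and only the top-4 experts the
# router picks for a token are ever read from disk. File layout, router, and
# expert math are invented here; flash-moe's real engine is C/Metal.
import numpy as np

N_EXPERTS, TOP_K, D = 512, 4, 256

# Create a memory-mapped expert table. Nothing becomes resident until indexed,
# which is how a model far bigger than RAM can run in a few GB of memory.
experts = np.lib.format.open_memmap('experts.npy', mode='w+',
                                    dtype=np.float16, shape=(N_EXPERTS, D, D))

def moe_layer(x, router_w):
    logits = x @ router_w                       # (N_EXPERTS,) routing scores
    top = np.argsort(logits)[-TOP_K:]           # the 4 experts this token needs
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                        # softmax over the selected experts
    out = np.zeros_like(x)
    for g, e in zip(gates, top):
        w = experts[e].astype(np.float32)       # pages just this expert in from disk
        out += g * (x @ w)
    return out                                  # 508 of 512 experts never left the SSD

x = np.random.randn(D).astype(np.float32)
router_w = np.random.randn(D, N_EXPERTS).astype(np.float32)
y = moe_layer(x, router_w)
```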
Nav Toor tweet media
114 replies · 347 reposts · 2.6K likes · 199.5K views
jihadjo retweeted
Bleu Blanc Rouge ! 🇫🇷 @LBleuBlancRouge
As a reminder, voting for Édouard Philippe means voting for Macron and putting a Macronist back in power for 5 years. 😉
Bleu Blanc Rouge ! 🇫🇷 tweet media
600 replies · 3.3K reposts · 9.8K likes · 90K views
jihadjo retweeted
Jon De Lorraine @jon_delorraine
🔴🇫🇷 ARCHIVE When Édouard Philippe, live on air, belittled a retired yellow vest who told him he still had to work. "I work a lot too," Philippe replied. That's him, the savior of France? Never.
585 replies · 5.9K reposts · 13.1K likes · 333.5K views
jihadjo @jihadjo
@acormierd It's still completely unhinged that someone can talk to a policewoman like that without consequences
0 replies · 0 reposts · 2 likes · 49 views
Alexandre Cormier-Denis
Tomorrow it will be your daughters, your wives, and your mothers being called filthy whores by Arabs brought in in the name of the Francophonie. We must first stop bringing in these populations, and then remigrate them.
Bastion @BastionMediaFR

🔴🇨🇦 FLASH INFO — "You filthy fucking whore, shut your mouth, with your bitch face […] If you want, I'll buy you and you'll become my slave": a Montreal policewoman violently insulted keeps her calm in the face of a threatening individual.

252 replies · 1.3K reposts · 5.2K likes · 124.4K views
jihadjo retweeted
Wolf 🐺 @PsyGuy007
🇩🇪 Bishop Athanasius Schneider puts it very plainly: "These are not refugees, they are invaders who want to Islamize Europe. They want to destroy Europe's historic culture." 🗣️ Do you agree with him? A. YES B. NO
1.2K replies · 1.8K reposts · 4.5K likes · 28.6K views
Le Parisien @le_Parisien
2027 presidential election: only Édouard Philippe would be able to beat the RN in the second round, according to a poll ➡️ l.leparisien.fr/1o0E
Le Parisien tweet media
3.6K replies · 260 reposts · 661 likes · 2.4M views
jihadjo @jihadjo
@lesnums Too bad the leak doesn't include the famous text messages from that other whore
0 replies · 0 reposts · 0 likes · 84 views
Enzo Morel @mtwit75
Emmanuel Grégoire is creating 36 deputy-mayor posts at Paris city hall, just like Anne Hidalgo in 2020. Nothing changes...
Enzo Morel tweet media
107 replies · 364 reposts · 1.1K likes · 59.3K views
Jean MESSIHA @JeanMessiha
In Damascus, Syria, hordes of hate-filled Muslims are attacking Christian neighborhoods, looting Christian shops, torching those that sell alcohol, and destroying religious statues (including one of the Virgin Mary). Watch closely, because this is what will happen in France tomorrow. The Islamized territories (93, etc.) will launch raids and razzias against the territories not yet subject to Islam, to punish them for their "apostasy" and bring them to heel. To the applause of Rima Hassan and the LFI crowd.
412 replies · 2K reposts · 3.6K likes · 66.4K views