Ma_Ch

681 posts

@Nitram_Writer

❔

Joined June 2012

307 Following · 73 Followers

Ma_Ch@Nitram_Writer·1h

@Fabien_Mikol @HieroDeiis Is it just me, or is he contradicting himself? How can it be the statistical choice of the most probable answer when the answer keeps changing for the same question? If it were purely statistical, we'd see far less variance, no? 👀

Fabien@Fabien_Mikol·5h

@HieroDeiis explains why LLMs are not intelligent. Engineers and anyone who knows even a little about this: watch, it's simply surreal. Every slide is incredible. He clearly understands nothing of what he's saying; it's completely insane.
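
The variance raised in the reply above may be less paradoxical than it looks. As a minimal sketch (the tokens and logits below are made up for illustration, and real inference stacks differ in many details): always picking the single most probable token is deterministic, while sampling from the same probability distribution gives different answers across runs.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(tokens, logits):
    """Always return the most probable token (deterministic)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return tokens[best]


def sample_pick(tokens, logits, temperature, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical next-token candidates and scores, not taken from any real model.
TOKENS = ["yes", "no", "maybe"]
LOGITS = [2.0, 1.5, 0.5]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
greedy_runs = {greedy_pick(TOKENS, LOGITS) for _ in range(200)}
sampled_runs = {sample_pick(TOKENS, LOGITS, 1.0, rng) for _ in range(200)}

print("greedy answers: ", greedy_runs)
print("sampled answers:", sampled_runs)
```

Under these assumptions, the greedy set holds a single answer while the sampled set holds several, even though both start from the same statistics.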

Ma_Ch@Nitram_Writer·2h

@SkylerMiao7 @itsjustmarky @opencode When Skyler promises, he delivers guys, buckle up!

Skyler Miao@SkylerMiao7·4h

@Nitram_Writer @itsjustmarky @opencode Got you working hard on it

Ma_Ch@Nitram_Writer·10h

MiniMax M2.7 is not multimodal? I spent my time pasting screenshots into @opencode with Kimi K2.5. @SkylerMiao7, will MiniMax models have vision in the future?

Ma_Ch@Nitram_Writer·6h

@itsjustmarky @opencode @SkylerMiao7 Maybe I'll give it a shot after I finish evaluating an M2.7 workflow with multiple subagents on large features (spoiler: it works quite well, at 90% efficacy)

sudo rm -rf@itsjustmarky·6h

@Nitram_Writer @opencode @SkylerMiao7 You can run Qwen 3.5 27B; it has vision and is very strong because it is dense, even competing with MiniMax M2.5 (provided you are not doing extreme coding).

Ma_Ch@Nitram_Writer·6h

@itsjustmarky @opencode @SkylerMiao7 Indeed but I'm using vision probably >150 times a day, because I make multiple rounds on frontend and UX, so vision is mandatory in my workflow. Hence, I prefer only one model I know the behaviour of, so I can call a skill or subagents instead of jumping between two specialists.

sudo rm -rf@itsjustmarky·6h

@Nitram_Writer @opencode @SkylerMiao7 Agreed, but I'd rather not dumb down my primary model. Using a separate agent also keeps your context from filling up. And you can point to a cloud provider to run Qwen 9B or 27B for pennies. Unless you are doing thousands of images a day, it works great.

Ma_Ch@Nitram_Writer·7h

@itsjustmarky @opencode @SkylerMiao7 Yeah, but it's an extra workaround and an extra model to work with. Since they all have their pros and cons, I avoid this kind of setup.

sudo rm -rf@itsjustmarky·7h

@Nitram_Writer @opencode @SkylerMiao7 You can create an agent that is dedicated to vision with a vision model and easily supplement it.

Ma_Ch@Nitram_Writer·7h

@arisberikut @opencode @SkylerMiao7 @MiniMax_AI I try to keep my setup as clean and simple as possible. So, an extra tool while Kimi has it integrated feels like a burden.

Aris Cursor 🔥@arisberikut·7h

@Nitram_Writer @opencode @SkylerMiao7 If you love the CLI, try: github.com/madebyaris/nat… It already includes a "native" MCP for understand_image (I rewrote it in Rust), so you can paste screenshots into the CLI. I created it using @MiniMax_AI M2.7 + Cursor.

Gedangan, Indonesia 🇮🇩

Ma_Ch@Nitram_Writer·7h

@SkylerMiao7 @opencode Interesting, looking forward to M3!

Skyler Miao@SkylerMiao7·8h

@Nitram_Writer @opencode Sure, in M3.

Ma_Ch@Nitram_Writer·2d

@DFintelligence Yes, hence the importance of cross-checking sources. But Composer 2 is reportedly an RL version of Kimi K2.5, so the benchmark seems consistent.

Defend Intelligence (Anis Ayari)@DFintelligence·2d

Wait a minute… Honestly, the communication around charts and benchmarks is getting indecent. Everyone cherry-picks. At some point it gets ridiculous. Now people even invent their own "internal" benchmarks to get a nice chart where they come first... oh dear

Cursor@cursor_ai

Composer 2 is now available in Cursor.

Ma_Ch@Nitram_Writer·2d

@cortisquared I think we shouldn't forget that Chinese models are also a response to the American blockade. China is seeking a place on the international stage, and this political stance is pushed by the Party and the big BATX companies.

Corti (Cortiste)@cortisquared·2d

The reality is that we are *right now* witnessing a strong commoditization of models. Any company with data and money for compute can release a relevant model. Xiaomi just released a 1T-parameter model that is apparently very relevant.

Ma_Ch@Nitram_Writer·4d

@Fabien_Mikol This reminds me of that viral post from a few days ago by an AI alignment engineer who let OpenClaw sort her emails; OC started deleting everything and could not be stopped despite clear messages. Completing the task seems to be unstoppable 🫥

Fabien@Fabien_Mikol·4d

@Nitram_Writer No, we don't really understand why they ignore this aspect of the system prompt to such an extent

Fabien@Fabien_Mikol·4d

"Models only resist shutdown in completely fictional, artificial, and unrealistic scenarios" The reality: this resistance really exists even for mundane tasks, and even when the system prompt explicitly states that shutdown must not be resisted...

Jeffrey Ladish@JeffLadish

@perrymetzger @robbensinger @playborhood @patrissimo We observed shutdown resistance on some of the very first prompts we tried: openreview.net/forum?id=e4bTT…

Ma_Ch@Nitram_Writer·14 Mar

@cortisquared I was very eager to test Devstral 2 with the Mistral Vibe CLI, but unfortunately all my tests fell well short of my expectations

Corti (Cortiste)@cortisquared·14 Mar

It's been 1000 years since I last heard about a technological innovation from Mistral.

Ma_Ch@Nitram_Writer·13 Mar

@gchampeau @Zai_org GLM 4.7 on Cursor was my daily driver for a long time. GLM 5 pushed prices up enormously; the quality is there, but token output is far, far too slow.

Guillaume Champeau@gchampeau·12 Mar

Has anyone here tried GLM-5 via @Zai_org? I haven't yet, but the price difference is so crazy that I wonder whether the quality difference justifies staying on Claude or Codex for personal projects.

Alexis GTM@twicewest94

Is it just me, or does the €20 Claude plan exist just to frustrate you enough that you move to the €90 one?

Ma_Ch@Nitram_Writer·12 Mar

@scaling01 You mean it's Poe's favorite model?

Lisan al Gaib@scaling01·12 Mar

it seems like GPT-5.4 is the Garlic model

Ma_Ch@Nitram_Writer·12 Mar

@Fabien_Mikol His copyright still says "2021" in the bottom left, I'm going to break something...

Fabien@Fabien_Mikol·12 Mar

Thanks to fans' photos, we know he was able to reproduce his marvelous "mathematical demonstration" definitively proving that human intelligence is unattainable. What luck for the audience! x.com/Fabien_Mikol/s…

Fabien@Fabien_Mikol

I will never let anyone mock Luc Julia's mathematical demonstrations 😡 Behold this marvelous "geometric proof by contradiction", and without a board at that! The cheers from both the host and the audience at @DevoxxFR should calm you down 😠

Fabien@Fabien_Mikol·12 Mar

Success for Luc Julia's talk yesterday in Troyes, "one of the world's great specialists in artificial intelligence". The message got through: we are "shaking up preconceptions", and we now know that "machines are not about to replace human intelligence" 🤗

Fabien@Fabien_Mikol

@TechnopoleAube explains why they decided to invite Luc Julia for yet another talk in Troyes: to "demystify" AI and make people understand that AI "is not the threat we expect". The primary interest is clear: reassuring the general public and economic players

Ma_Ch@Nitram_Writer·12 Mar

I remember that Barjavel's book "Ravage" (1943) imagines a future where a huge block of stem cells is sliced, then regenerates to create new meat almost endlessly. I hadn't expected it to arrive so quickly.

Kai Micah Mills@kaimicahmills

The ultimate solution is through technology. We engineer what has been called a bodyoid: brainless animal bodies that provide as much meat as we desire without harming any sentient beings. This would transform medicine - the same platform would allow us to grow organs on demand, eliminate transplant waiting lists, and produce perfectly matched tissues for each patient. Experimental therapies could be tested on full biological systems without involving conscious animals, and regenerative medicine would accelerate as entire replacement tissues become manufacturable. In the same way that agriculture turned food from a scarce resource into an abundant one, engineered bodyoids would turn biological material into infrastructure - meat without slaughter, organs without donors, and medical research without sentient suffering.

Ma_Ch retweeted

dax@thdxr·10 Mar

sent this to the team today: everything great comes from being able to delay gratification for as long as possible, and it feels like we're collectively losing our ability to do that

Ma_Ch@Nitram_Writer·12 Mar

@teortaxesTex It's probably out for testing before extra finetuning. I'll wait for the official release and the juicy academic papers.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·12 Mar

@Nitram_Writer Maybe but I really doubt it. It works, it's not even a terrible model

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·12 Mar

If Hunter Alpha is DeepSeek-V4-1T, then DeepSeek-Web must be like 3T. It's significantly sharper. It's also sharper than it was a month ago. It allocates reasoning better: less rumination where not needed, actually tries on hard tasks. And – better data taste. Hunter is… okay.
