Micha Mazaheri

1.6K posts

@mittsh

Back to building a SaaS 🚀 • Advisor & co-founder @electricbeastco • Founded Paw API testing tool (exit in 2020) • #Biotech #PrecisionFermentation enthusiast

Europe · Joined October 2009
1.8K Following · 841 Followers
JC
JC@anakin·
The recent days and weeks have shown how desperately everybody needs an eval framework to know:
- which model performs best for each of THEIR tasks (assistant, design, architecture, coding, review, etc.)
- how structuring their prompts (in the broad sense: CLAUDE/AGENT .md, skills, user prompts, etc.) improves or degrades the quality of their outputs
- how switching to a new model impacts them.

Huge opportunity if someone can bring the 99% a product that is easy to set up and affordable.
English
3
0
5
1.9K
Micha Mazaheri
Micha Mazaheri@mittsh·
@anakin @MistralAI @OpenAI @AnthropicAI But just much cheaper. I’m using the batch API (categorization work), and something that @MistralAI documents very poorly, yet is awesome, is that I get prompt caching discounts on top of the discount for batch inference. Again, evals allowed me to optimize for input caching.
English
1
0
1
33
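A rough sketch of how a prompt caching discount stacked on a batch discount could affect input cost. Everything here is an assumption for illustration: the function name, the rates passed in, and the idea that the two discounts multiply. None of it is taken from @MistralAI's actual pricing or billing rules.

```python
def estimate_input_cost(
    total_tokens: int,
    cached_tokens: int,
    price_per_million: float,
    cache_discount: float,
    batch_discount: float,
) -> float:
    """Estimate input cost when a cache discount stacks on a batch discount.

    All rates are hypothetical inputs, not any provider's published prices.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    # Cached tokens are billed at a reduced rate, uncached tokens at full rate.
    billable_tokens = uncached_tokens + cached_tokens * (1 - cache_discount)
    # Assumption: the batch discount then applies to the whole input bill.
    return billable_tokens / 1_000_000 * price_per_million * (1 - batch_discount)
```

With made-up rates, the same function can compare a prompt layout with a long shared prefix (many cached tokens) against one without, which is the kind of optimization the post describes.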
Micha Mazaheri
Micha Mazaheri@mittsh·
@anakin I can confirm that it has changed the way I build prompts and test models. As an example, I realized that @MistralAI's Mistral Small model was offering (for my use case) performance similar to @OpenAI 4o-mini (coming from my legacy code) and even @AnthropicAI Haiku.
English
1
0
1
58
Micha Mazaheri retweeted
Quality QR
Quality QR@quality_qr·
The QR code industry is full of liars
"free forever" that costs $29/month
"no watermarks" that plaster your codes
"unlimited" with daily limits
interfaces that look like 2003
got tired of it. built the alternative.
no tricks. no ugly UI. no bs.
quality-qr.app
English
0
1
6
111
Micha Mazaheri
Micha Mazaheri@mittsh·
@RnaudBertrand 100%. But to be clear, Sarkozy put France back in NATO’s integrated military command. France has always been in NATO itself, even under De Gaulle.
English
0
0
0
35
Arnaud Bertrand
Arnaud Bertrand@RnaudBertrand·
This is huge news for France, and for Libya. The only other French leader ever sent to prison in the entire history of the French republic was Pétain (the leader of Vichy France), just to give an idea of how extreme the level of treachery must be for France to imprison a former head of state.

And in this case it was indeed particularly treacherous: Sarkozy was condemned for criminal conspiracy, soliciting millions of euros from Libya's Gaddafi to fund his 2007 presidential campaign. And as we all know he then cynically turned around later when he was president to lead the very NATO bombing campaign that toppled his former benefactor (to be clear, he was not condemned for this latter part, only the criminal conspiracy).

Libya which, by the way, used to be Africa's richest country and is now but a mere shadow of its former self... The guy quite literally destroyed an entire country, and it looks like he did it (at least to some extent) to cover the tracks of his corruption. If that's not treacherous, I don't know what is.

Plus, in my own opinion, he was the worst president of France in modern times. The guy who killed the little independence we used to have on the international stage by putting us back in NATO (after De Gaulle had rightly left decades prior); there's a reason why in the US they used to call him "Sarko the American" (cbsnews.com/news/sarko-the…). Really treacherous all around.
English
246
2.6K
8.8K
485.4K
el_raf
el_raf@el_raf·
@airplusnews We'll end up wishing for the Reagan method… 1- Fire all the strikers 2- Hand air traffic control over to the military for a few months 3- Abolish the right to strike for this profession 4- Rehire the motivated ones under the new conditions
Translated from French
2
0
14
1.3K
air plus news
air plus news@airplusnews·
🔴 French air traffic controllers are planning a new strike on September 18.
Translated from French
30
46
213
43.5K
Micha Mazaheri retweeted
Samy Djemaoun
Samy Djemaoun@DjemaounSamy·
1/🇫🇷14 years old. Valid visa. She came to spend a vacation and ended up locked up at the airport by decision of the @Interieur_Min...which the judge would rule illegal. Reason for the refusal of entry: she gave poor answers to the questions of the PAF (border police)… asked without an interpreter. djemaoun-avocat.com/post/enfermeme…
Translated from French
122
1.2K
2.4K
141.2K
Micha Mazaheri retweeted
Santiago
Santiago@svpino·
Front-end development. 2010 - 2024.
English
384
475
10.9K
1.1M
Micha Mazaheri
Micha Mazaheri@mittsh·
LLMs keep blowing my mind almost every day. It's cliché, but I wouldn't want to live/work without them.
English
3
0
5
138
Micha Mazaheri retweeted
Paul Graham
Paul Graham@paulg·
@patrickc Roughly as if you had Sheldon Cooper walking by your side?
English
46
16
1.4K
116.9K
Micha Mazaheri retweeted
Arnaud Bertrand
Arnaud Bertrand@RnaudBertrand·
So many answers to my post are of the type "we're owning China by banning foreign students to Harvard", "we get nothing for educating foreigners", "with this we'll teach Americans and not our competitors", etc. It's astonishing to see how ignorant Americans can be with regard to the source of their power. And also, judging by the intellectual caliber of these responses, it demonstrates in itself that if you limit U.S. universities to American students only, the standards will crater.

I hate to break it to you but this move will not even remotely "own China". Quite the contrary, this is one of the biggest self-owns in American history. The whole challenge of competing with the U.S. is that you're not only competing with Americans but with the collective brainpower of 140+ countries concentrated in American institutions. Including, as is often the case, competing with your own country's brightest minds who joined the "opposite camp" as it were.

Furthermore, even if they go back home after their studies, these students often become part of their countries' elites and maintain lifelong networks, friendships, and cultural affinities with America, ensuring that when they're making decisions as CEOs, ministers, or judges, they have an instinctive understanding of and sympathy for the U.S.'s perspective.

Want to throw all that away? Be my guest. But I'll end this post with the concluding sentence of historian Arnold Toynbee in his 12-volume "A Study of History" where he writes: "Civilizations are not murdered. Instead, they take their own lives."
Arnaud Bertrand@RnaudBertrand

Unbelievable, they banned Harvard's ability to enroll international students. It's batshit insane, there's no other way to put it. "We have the best university in the world that's been standing for 400 years, let's kill it"

English
302
849
4.2K
545.3K
Micha Mazaheri retweeted
Manouck
Manouck@Manouck44·
All the hatred poured out on our Muslim compatriots takes me back to the hatred poured out on my forebears. I want to express my full solidarity with them. Our fraternity will be our salvation. The heirs of Drumond and Maurras are our common enemies.
Translated from French
255
857
1.9K
61.8K
Micha Mazaheri
Micha Mazaheri@mittsh·
People who don't use AI to code, do research, summarize documents, and even produce content in 2025 are like people who refused to use the internet in 2000. It's a self-inflicted disadvantage.
English
0
0
1
75
Micha Mazaheri
Micha Mazaheri@mittsh·
Batch processing by @GroqInc makes so much sense. I run LLM completions from a queue service, which takes days to process anyway. It just makes sense that inference providers take the whole batch and process it when they have more available resources.
English
0
0
0
82
Micha Mazaheri
Micha Mazaheri@mittsh·
@jeffr_yyy Hey Jeffrey! Confident AI & DeepEval look awesome, that's exactly what I've been looking for. Quick questions:
- Are you supporting multimodal (vision)? In prompts and dataset?
- It would be awesome if Human Feedback would support custom React components to display

Thx!
English
1
0
2
208
Jeffrey 🐬 confident-ai.com
🚨 95% of LLM evaluations fail to deliver value. Why? 🤔 Because most teams are unknowingly evaluating the wrong thing.

Typical LLM metrics sound great:
- Correctness: "Did the model get the facts right?" ✅
- Answer Relevancy: "Did it directly answer the question?" 🎯
- Faithfulness: "Did it avoid hallucinations?" 🔎
- Tonality: "Did it match the desired voice?" 🗣️

But here's the issue: Your LLM doesn't exist simply to be correct, relevant, or faithful. It exists to deliver ROI: reducing customer support costs, saving analyst hours, or increasing customer satisfaction. 📈💸🤑🤑

Metrics must correlate to real-world outcomes. Your test case passing rates should confidently predict tangible business impact: more ticket resolutions, reduced internal workload, increased efficiency. When you build this metric-to-outcome connection, evaluations finally mean something. Improvements in your LLM’s performance become improvements in your business metrics.

How do you do this right (@confident_ai)? 👉 Humans-in-the-loop, with metric alignment.
- Curate just 25–50 human-labeled "good" or "bad" real-world OUTCOMES.
- These aren't metric scores; these are OUTCOMES like support tickets being resolved or closing your LLM app in frustration. Whatever your product KPI is, you know it better than me.
- Figure out which set of metrics would produce a test result-outcome correlation through trial and error.

Forget synthetic data. Forget vanity metrics. If your evaluation data doesn't represent real users and real outcomes, you're evaluating in the dark. 🕶️

We unpacked exactly how to establish these connections clearly, practically, and repeatedly in our latest guide: 👉 The Ultimate LLM Evaluation Playbook: confident-ai.com/blog/the-ultim…
English
1
1
4
738
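The metric alignment step in the post above could be sketched roughly as follows. This is a minimal illustration, not Confident AI's or DeepEval's implementation: the function names and data layout are hypothetical, a simple agreement rate stands in for "correlation", and an exhaustive search over metric subsets stands in for the post's "trial and error".

```python
from itertools import combinations


def agreement(verdicts: list[bool], outcomes: list[bool]) -> float:
    """Fraction of cases where the test verdict matches the human-labeled outcome."""
    matches = sum(v == o for v, o in zip(verdicts, outcomes))
    return matches / len(outcomes)


def best_metric_subset(
    results: dict[str, list[bool]], outcomes: list[bool]
) -> tuple[tuple[str, ...], float]:
    """Find the metric subset whose pass/fail verdicts best match the outcomes.

    results maps a (hypothetical) metric name to per-case pass flags.
    A test case passes a subset only if every metric in the subset passes.
    """
    names = sorted(results)
    best_subset: tuple[str, ...] = ()
    best_score = -1.0
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            verdicts = [
                all(results[name][i] for name in subset)
                for i in range(len(outcomes))
            ]
            score = agreement(verdicts, outcomes)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score
```

For the 25 to 50 labeled outcomes the post suggests and a handful of metrics, this brute-force search stays cheap; with many metrics a greedy or sampled search would be the more practical choice.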
Micha Mazaheri
Micha Mazaheri@mittsh·
This is exactly what I was looking for 👏 We truly live in a different world. I'm building an AI product, realize I need a tool to test the results of the LLMs, I use @perplexity_ai to find the right answer, it directs me to @confident_ai
English
0
0
1
164
Micha Mazaheri retweeted
Tibor Blaho
Tibor Blaho@btibor91·
OpenAI announced a partnership with Estonia to roll out ChatGPT Edu in all secondary schools, starting with 10th and 11th graders by September 2025, as part of the country’s AI Leap 2025 initiative to provide free AI tools and teacher training
English
42
79
680
64.8K
Micha Mazaheri retweeted
Paul Graham
Paul Graham@paulg·
I didn't realize this till recently, but when big, bureaucratic organizations create incubators, the startups that come out of them are damaged from the start by focusing on big customers instead of the scrappy early adopters that successful startups usually sell to.
English
101
218
3.5K
226.1K