Mitar

1.6K posts

@mitar_m

Somewhere and everywhere between (#computer) #science, #technology, #nature and (#open) #society. He/him. Elsewhere: @[email protected] @mitar.bsky.social

Joined December 2011
1.5K Following · 873 Followers
Mitar@mitar_m·
"Oh, we believe your approach is bad and you will fail, but still, let us help you as much as you want us to, to achieve your approach, because we feel confident in the superiority of our approach, so we do not have to undermine yours."
Mitar@mitar_m·
Throughout history we have had societies trying different things, but other societies tried to sabotage them: attack them, organize coups, impose embargoes, etc. We should help each other.
Mitar@mitar_m·
vitalik.eth.limo/general/2025/1… Interesting read. But I think it leaves out an important detail: all this works only if those societies genuinely help each other exist, even when they disagree with each other.
Mitar retweeted
Sledilnik@sledilnik·
🎯 The "Advanced Seminar on Science Communication" is taking place in Maribor on 29 October 2025. Its aim is to deepen science communication skills, from critical evaluation and synthesis of knowledge to clear presentation of scientific methodology and research results. odprtaznanost.si/obvestila/napr…
Mitar@mitar_m·
Instead of putting 5% of GDP into NATO, why don't we put 5% of GDP into the green transition?
Mitar@mitar_m·
Governmental (and other) eIDs should not phone home. nophonehome.com
Mitar@mitar_m·
You are young for as long as you give up your seat on the bus.
Mitar@mitar_m·
A power move on tariffs would be for all other countries to drop all remaining tariffs among themselves, increasing trade with each other to offset the decrease in trade with the USA.
Mitar retweeted
Steve El-Hage@hagestev·
✨🎨 Here's why AI can ace medical board exams but can't make real art: Yesterday MiniMax released their new MiniMax-Text-01 model. They claim that it's a MUCH more creative writer than OpenAI's GPT-4o and Anthropic's Sonnet 3.5 models. In reality, they lost control of their evaluation framework, and that caused them to think their model was more creative than it actually is.

To start, this is from their model paper; note how much better they say they are at "Creative Writing" than the others. The most interesting part of these new model papers is always the appendix. This is where actual prompt:response pairs, safety and red-team tests, and further insights into the evaluation framework used to build the model are shared. In the appendix of this MiniMax paper, starting on page 58, they show the User Request (prompt), the MiniMax-Text-01 model response, and "Analysis by Human Evaluator". This "Analysis by Human Evaluator" section is the most important part here.

The first example is a lyrics-writing test. The AI is asked to create lyrics for a ballad about a traveller who finds an ancient city lost in time. This is the AI output, and most importantly, here's the "Analysis by Human Evaluator": if you've spent any time with ChatGPT, Claude, or any of the LLMs, this is very clearly an AI-written response. I have no proof of this besides "it definitely looks like it was written by AI" and that it's definitely not what a normal person would say about these song lyrics.

This then happens again on page 59, section B.6: "Story Writing". The input prompt is to write a story about an adventurer who uncovers a secret, hidden world. It writes the plot of The Lord of the Rings (including a sidekick named Pippin!). Again, read the "Human Evaluator" analysis, which is definitely ChatGPT; I can reproduce about 90% of it by asking ChatGPT to analyze the same output.

So what happened here? The most likely explanation is that MiniMax hired outsourced human evaluators to label and evaluate their LLM output for $2 an hour. Those people then put the poem, story, etc. into ChatGPT and copy/pasted the response. MiniMax then used this to train and improve their model.

Why does this matter? Model collapse. Everybody sees today that AI models are "smart" but limited in how creative they are. Model collapse happens when an AI model is trained on generated/synthetic data. This causes a harmful feedback loop where, over time, these models rely less on the rich, real, valuable information from the real world and more on repetitive, lower-quality synthetic outputs. As a result, their performance degrades, creativity diminishes, and eventually they fail to produce anything that resembles art that's interesting to people.

@teortaxesTex nails it by calling this TASTE COLLAPSE due to synthetic evaluation. It basically means that these AI models will lose all sense of creativity and taste. They'll keep receiving generic GPT slop as the "human analysis" of what is creative, use that to train the next round of models, which generate the next slop training dataset, and they'll keep getting further and further from understanding artistic subtlety and creativity.

An additional problem, identified by @karpathy, is that any open (non-private) test dataset inevitably leaks into training sets. This may not be an issue for convergent problems where there's a "correct" answer, but it IS an issue for creative problems, where originality matters a lot.

How can this be prevented? In short, give your money to @alexandr_wang and do it right: get high-quality, human-labelled creative data to train your models on. Even if you believe in synthetic data as an overall concept for scaling AI models, generated creative content in particular is universally accepted as too low quality to be used to train your models on creative writing.

I do want to end this by saying that I actually LOVE the MiniMax Hailuo video generation models and think they're by far the best ones in the world right now (better than Runway, Sora, Luma, Kling, etc.).
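The feedback loop the thread describes can be illustrated with a toy simulation (purely illustrative, not from the thread; the corpus, phrase names, and generation count are all made up): treat the training corpus as a bag of phrases, and let each "model generation" train only on samples of the previous generation's output. Because resampling can never introduce a phrase the previous generation did not produce, diversity can only shrink over generations.

```python
import random

def next_generation(corpus):
    """Simulate one round of training on purely synthetic data:
    the new 'training set' is just samples drawn (with replacement)
    from the previous generation's output."""
    return random.choices(corpus, k=len(corpus))

random.seed(42)
# Generation 0: 500 distinct "creative phrases" (the real, human-written data).
corpus = [f"phrase-{i}" for i in range(500)]

# Track how many distinct phrases survive each generation.
diversity = [len(set(corpus))]
for _ in range(50):
    corpus = next_generation(corpus)
    diversity.append(len(set(corpus)))

print(f"distinct phrases: gen 0 = {diversity[0]}, gen 50 = {diversity[-1]}")
```

After a few dozen rounds most of the original phrases are gone and a handful dominate the corpus: a crude analogue of the "slop trains the next slop" loop, with no mechanism to reinject real-world variety.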
[4 attached images]