🎓 Martin Dougiamas

5.4K posts

@moodler

@Moodle Founder, Chairman+Head of Research, @OpenEdTech Founder, @OpenEdGlobal Board Director. Working to promote Openness globally. https://t.co/mV98mkFU1B

Perth, Australia · Joined September 2008
836 Following · 11.5K Followers
Pinned Tweet
🎓 Martin Dougiamas@moodler·
Looks like the tool that was mirroring my Mastodon posts here broke recently, or was blocked, or something. I'd try to fix it but honestly I really don't care enough. Twitter is really so noisy and toxic. So, bye! If you want, come follow me at openedtech.social/@martin
Replies 1 · Reposts 1 · Likes 9 · Views 767
AmericanPapaBear™@AmericaPapaBear·
@elonmusk @CommunityNotes I say just scrap the algo and go super basic. People who follow you want to see your content. Simple. People with larger followings shouldn't be throttled or punished because they grinded and gained those followers. We put in the work.
Replies 16 · Reposts 4 · Likes 150 · Views 26.2K
Elon Musk@elonmusk·
We’ve open sourced the 𝕏 algorithm not because we think it’s smart, but to show its MANY flaws! Every week, we try to make it better, not always succeeding, but to be as transparent as we are with @CommunityNotes. Like democracy, it’s the worst algorithm, except for all the others.
Replies 5.6K · Reposts 8.5K · Likes 96.5K · Views 14.9M
Elon Musk@elonmusk·
At times, AI existential dread is overwhelming
Replies 9.1K · Reposts 4.8K · Likes 81.5K · Views 23.9M
Sam Altman@sama·
how about we fix our model naming by this summer and everyone gets a few more months to make fun of us (which we very much deserve) until then?
Replies 1.4K · Reposts 343 · Likes 16.2K · Views 1.9M
Jay Anderson@TheProjectUnity·
🚨"We Have Weaponry, Nobody Has Any Idea What It Is" What weapons do you think Trump is referring to? 👀
Replies 1.1K · Reposts 316 · Likes 3K · Views 754.7K
🎓 Martin Dougiamas@moodler·
@OpenAI How about using your own AI to come up with a good naming/versioning scheme for your products? And stop using "OpenAI" while you’re at it, because it means nothing
Replies 0 · Reposts 0 · Likes 0 · Views 100
OpenAI@OpenAI·
GPT-4o got another update in ChatGPT! What's different?
- Better at following detailed instructions, especially prompts containing multiple requests
- Improved capability to tackle complex technical and coding problems
- Improved intuition and creativity
- Fewer emojis 🙃
Replies 841 · Reposts 928 · Likes 13.6K · Views 3.2M
🎓 Martin Dougiamas@moodler·
@teortaxesTex Instead of worrying about one billionaire’s wallet, you could be wishing for the cheap availability of all technology to all, no matter who builds it
Replies 0 · Reposts 0 · Likes 0 · Views 42
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
Elon: EVs are the future! Invest in my revolutionary EV company
China: *commoditizes EVs. EVs with gull-wing doors, EVs with drone launchers, plush EVs flood the market*
Elon: humanoids are the future! Opti-
China: *Unitree, LimX, MagicBot…*
Elon: Reusable ro-
man it must suck
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) tweet media
covrovski@covrovski

@teortaxesTex China has had so many reusable-rocket first launches this year. Many companies. Here you're going to see a very similar track to what happened with EVs in China.

Replies 28 · Reposts 34 · Likes 419 · Views 34.4K
Andrej Karpathy@karpathy·
@IndraVahan Great question, right? I'd love to know, I don't think I fully understand this either. But considering that no one has (to my knowledge) figured out a way to post-train an LLM to be funny, I am prepared to believe humor is really difficult and requires more underlying capability?
Replies 173 · Reposts 37 · Likes 1.5K · Views 175.3K
Andrej Karpathy@karpathy·
I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.

Thinking
✅ First, Grok 3 clearly has an around-state-of-the-art thinking model ("Think" button) and did great out of the box on my Settlers of Catan question: "Create a board game webpage showing a hex grid, just like in the game Settlers of Catan. Each hex grid is numbered from 1..N, where N is the total number of hex tiles. Make it generic, so one can change the number of "rings" using a slider. For example in Catan the radius is 3 hexes. Single html page please." Few models get this right reliably. The top OpenAI thinking models (e.g. o1-pro, at $200/month) get it too, but all of DeepSeek-R1, Gemini 2.0 Flash Thinking, and Claude do not.
❌ It did not solve my "Emoji mystery" question, where I give a smiling face with an attached message hidden inside Unicode variation selectors, even when I give a strong hint on how to decode it in the form of Rust code. The most progress I've seen is from DeepSeek-R1, which once partially decoded the message.
❓ It solved a few tic-tac-toe boards I gave it with a pretty nice/clean chain of thought (many SOTA models often fail these!). So I upped the difficulty and asked it to generate 3 "tricky" tic-tac-toe boards, which it failed on (generating nonsense boards / text), but then so did o1-pro.
✅ I uploaded the GPT-2 paper. I asked a bunch of simple lookup questions, which all worked great. Then I asked it to estimate the number of training flops it took to train GPT-2, with no searching. This is tricky because the number of tokens is not spelled out, so it has to be partially estimated and partially calculated, stressing all of lookup, knowledge, and math. One example: 40GB of text ~= 40B characters ~= 40B bytes (assume ASCII) ~= 10B tokens (assume ~4 bytes/tok); at ~10 epochs that is a ~100B-token training run; at 1.5B params and with 2+4=6 flops/param/token, this is 100e9 x 1.5e9 x 6 ~= 1e21 FLOPs. Both Grok 3 without Thinking and 4o fail this task, but Grok 3 with Thinking solves it great, while o1-pro (GPT thinking model) fails.

I like that the model *will* attempt to solve the Riemann hypothesis when asked to, similar to DeepSeek-R1 but unlike many other models that give up instantly (o1-pro, Claude, Gemini 2.0 Flash Thinking) and simply say that it is a great unsolved problem. I had to stop it eventually because I felt a bit bad for it, but it showed courage and who knows, maybe one day...

The overall impression I got here is that this is somewhere around o1-pro capability, and ahead of DeepSeek-R1, though of course we need actual, real evaluations to look at.

DeepSearch
A very neat offering that seems to combine something along the lines of what OpenAI / Perplexity call "Deep Research", together with thinking. Except instead of "Deep Research" it is "Deep Search" (sigh). It can produce high quality responses to the various researchy / lookupy questions you could imagine have answers in articles on the internet. A few I tried, which I stole from my recent search history on Perplexity, along with how each went:
- ✅ "What's up with the upcoming Apple Launch? Any rumors?"
- ✅ "Why is Palantir stock surging recently?"
- ✅ "White Lotus 3 where was it filmed and is it the same team as Seasons 1 and 2?"
- ✅ "What toothpaste does Bryan Johnson use?"
- ❌ "Singles Inferno Season 4 cast where are they now?"
- ❌ "What speech to text program has Simon Willison mentioned he's using?"
❌ I did find some sharp edges here. E.g. the model doesn't seem to like to reference X as a source by default, though you can explicitly ask it to. A few times I caught it hallucinating URLs that don't exist. A few times it said factual things that I think are incorrect without providing a citation (one probably doesn't exist). E.g. it told me that "Kim Jeong-su is still dating Kim Min-seol" of Singles Inferno Season 4, which surely is totally off, right? And when I asked it to create a report on the major LLM labs, their amount of total funding, and an estimate of employee count, it listed 12 major labs but not itself (xAI).

The impression I get of DeepSearch is that it's approximately around Perplexity's DeepResearch offering (which is great!), but not yet at the level of OpenAI's recently released "Deep Research", which still feels more thorough and reliable (though still nowhere near perfect; e.g. it, too, quite incorrectly excludes xAI as a "major LLM lab" when I tried with it...).

Random LLM "gotcha"s
I tried a few more fun / random LLM gotcha queries I like to try now and then. Gotchas are queries that are on the easy side for humans but on the hard side for LLMs, so I was curious which of them Grok 3 makes progress on.
✅ Grok 3 knows there are 3 "r"s in "strawberry", but then it also told me there are only 3 "L"s in LOLLAPALOOZA. Turning on Thinking solves this.
✅ Grok 3 told me 9.11 > 9.9 (common with other LLMs too), but again, turning on Thinking solves it.
✅ A few simple puzzles worked ok even without thinking, e.g. *"Sally (a girl) has 3 brothers. Each brother has 2 sisters. How many sisters does Sally have?"*. GPT-4o, by contrast, says 2 (incorrectly).
❌ Sadly, the model's sense of humor does not appear to be obviously improved. This is a common LLM issue with humor capability and general mode collapse; famously, 90% of 1,008 outputs asking ChatGPT for a joke were repetitions of the same 25 jokes. Even when prompted in more detail away from simple pun territory (e.g. give me a standup set), I'm not sure it is state-of-the-art humor. Example generated joke: "*Why did the chicken join a band? Because it had the drumsticks and wanted to be a cluck-star!*". In quick testing, Thinking did not help; possibly it made it a bit worse.
❌ The model still appears to be just a bit too overly sensitive to "complex ethical issues", e.g. it generated a 1-page essay basically refusing to answer whether it might be ethically justifiable to misgender someone if it meant saving 1 million people from dying.
❌ Simon Willison's "*Generate an SVG of a pelican riding a bicycle*". This stresses the LLM's ability to lay out many elements on a 2D grid, which is very difficult because LLMs can't "see" like people do, so they are arranging things in the dark, in text. Marking as a fail because these pelicans are quite good, but still a bit broken (see image and comparisons). Claude's are best, but I suspect they specifically targeted SVG capability during training.

Summary
As far as a quick vibe check over ~2 hours this morning goes, Grok 3 + Thinking feels somewhere around the state-of-the-art territory of OpenAI's strongest models (o1-pro, $200/month), and slightly better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. Which is quite incredible considering that the team started from scratch ~1 year ago; this timescale to state-of-the-art territory is unprecedented. Do also keep in mind the caveats: the models are stochastic and may give slightly different answers each time, and it is very early, so we'll have to wait for a lot more evaluations over the next few days/weeks. The early LM Arena results look quite encouraging indeed. For now, big congrats to the xAI team; they clearly have huge velocity and momentum, and I am excited to add Grok 3 to my "LLM council" and hear what it thinks going forward.
Andrej Karpathy tweet media
Replies 666 · Reposts 2.2K · Likes 16.8K · Views 3.7M
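Karpathy's back-of-envelope GPT-2 training-FLOPs estimate in the tweet above can be reproduced in a few lines. All of the constants here (40 GB of ASCII text, ~4 bytes per token, ~10 epochs, 1.5 B parameters, 2+4=6 FLOPs per parameter per token) are his stated assumptions, not measured values:

```python
# Back-of-envelope GPT-2 training FLOPs, following the tweet's estimate.
# Every constant is an assumption from the tweet, not a measured value.
dataset_bytes = 40e9            # 40 GB of text ~= 40 B bytes (ASCII: 1 byte/char)
bytes_per_token = 4             # rough average for BPE-style tokenization
epochs = 10                     # assumed number of passes over the data
params = 1.5e9                  # GPT-2 parameter count
flops_per_param_per_token = 6   # 2 (forward) + 4 (backward)

tokens = dataset_bytes / bytes_per_token        # ~10 B tokens
training_tokens = tokens * epochs               # ~100 B token training run
total_flops = training_tokens * params * flops_per_param_per_token

print(f"{total_flops:.1e}")  # 9.0e+20, i.e. on the order of 1e21 FLOPs
```

The point of the exercise in the tweet is exactly this chain: nothing is looked up beyond the paper's 40 GB and 1.5 B figures; everything else is estimated and multiplied through.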
S H@SteveHa61539743·
@HarrierBR @WallStreetApes Dumb ass. Trump is hardly gonna be involved given he is pushing for the release of the files
Replies 3 · Reposts 0 · Likes 5 · Views 1.5K
Wall Street Apes@WallStreetApes·
FINALLY ‼️ The Jeffrey Epstein files are being released!! “In the next 10 days, you’re going to see the Epstein files released… Day number 1, Kash Patel walks in, by the end of the day, it’ll be released.” — Glenn Beck
Jeffrey Epstein victim: “I have spent the last 17 years in my own prison for what she, Jeffrey & all the co-conspirators did to me. I was raped repeatedly, I was raped 3x per day sometimes, and I was not the only girl on that island. There was a constant stream of girls being raped over and over and over again”
Replies 2.2K · Reposts 21.7K · Likes 90.5K · Views 6.4M
Zyphra@ZyphraAI·
Today, we're excited to announce a beta release of Zonos, a highly expressive TTS model with high fidelity voice cloning. We release both transformer and SSM-hybrid models under an Apache 2.0 license. Zonos performs well vs leading TTS providers in quality and expressiveness.
Replies 136 · Reposts 434 · Likes 2.6K · Views 1.2M
Andrej Karpathy@karpathy·
btw I didn't do comprehensive research on this, I just try random stuff and compare over time, and I don't have too much confidence to recommend the right one for this yet. I happened to be using SuperWhisper recently and I'm happy with it functionality-wise. I will say that by default I don't like when data from my computer goes anywhere outside of my computer via an opaque app. I prefer super duper fully offline apps (no pinging home, no updating unless I ask, no analytics, no nothing) whenever possible, and I think speech to text should be a setting where this should be possible just fine. I saw earlier that @simonw uses MacWhisper, so I have a todo to try that next. @jordibruin says in the app readme that "All transcription is done on your device, no data leaves your machine." and it's a one-time purchase.
Replies 62 · Reposts 51 · Likes 1.9K · Views 195.3K
@levelsio@levelsio·
Best way to talk with my voice to an LLM so it writes my code like @karpathy does?
Replies 206 · Reposts 36 · Likes 2.1K · Views 955K
Neighbors First | Mike Brooks@DrMikeBrooks·
@karpathy Um, but I feel like I learned something valuable that I can put to use from your long tweet - what does this mean?🤔
Replies 1 · Reposts 0 · Likes 0 · Views 589
Andrej Karpathy@karpathy·
# on shortification of "learning"

There are a lot of videos on YouTube/TikTok etc. that give the appearance of education, but if you look closely they are really just entertainment. This is very convenient for everyone involved: the people watching enjoy thinking they are learning (but actually they are just having fun). The people creating this content also enjoy it because fun has a much larger audience, fame and revenue. But as far as learning goes, this is a trap. This content is an epsilon away from watching the Bachelorette. It's like snacking on those "Garden Veggie Straws", which feel like you're eating healthy vegetables until you look at the ingredients.

Learning is not supposed to be fun. It doesn't have to be actively not fun either, but the primary feeling should be that of effort. It should look a lot less like that "10 minute full body" workout from your local digital media creator and a lot more like a serious session at the gym. You want the mental equivalent of sweating. It's not that the quickie doesn't do anything, it's just that it is wildly suboptimal if you actually care to learn.

I find it helpful to explicitly declare your intent up front as a sharp, binary variable in your mind. If you are consuming content: are you trying to be entertained or are you trying to learn? And if you are creating content: are you trying to entertain or are you trying to teach? You'll go down a different path in each case. Attempts to seek the stuff in between actually clamp to zero.

So for those who actually want to learn: unless you are trying to learn something narrow and specific, close those tabs with quick blog posts. Close those tabs of "Learn XYZ in 10 minutes". Consider the opportunity cost of snacking and seek the meal - the textbooks, docs, papers, manuals, longform. Allocate a 4 hour window. Don't just read; take notes, re-read, re-phrase, process, manipulate, learn.

And for those actually trying to educate, please consider writing/recording longform, designed for someone to get "sweaty", especially in today's era of quantity over quality. Give someone a real workout. This is what I aspire to in my own educational work too. My audience will decrease. The ones that remain might not even like it. But at least we'll learn something.
Replies 643 · Reposts 3.4K · Likes 17.1K · Views 2.2M
🎓 Martin Dougiamas reposted
Kialo Edu@KialoEdu·
Today’s the day: #MootGlobal24 is ON! Come find us in the Sponsors Exhibition area to learn how Kialo can revolutionize your discussion forums. We have a dedicated Moodle Certified Integration to share, not to mention some delicious candy 🍬
Kialo Edu tweet media
Replies 0 · Reposts 2 · Likes 5 · Views 918
🎓 Martin Dougiamas@moodler·
@MisterHazen @moodle Those are just active accounts on registered sites too. Doesn't include non-registered OR former students from the past 22 years
Replies 0 · Reposts 0 · Likes 3 · Views 58
Ryan Hazen@MisterHazen·
I’ve been thinking about this a lot lately. @moodle's impact across the world is hard to overstate. It was mentioned in the #MootGlobal24 keynote that these numbers are likely low relative to actual worldwide use.
Ryan Hazen tweet media
Replies 1 · Reposts 3 · Likes 4 · Views 294
🎓 Martin Dougiamas@moodler·
Until now, the only people who have been able to benefit from having intelligences doing all the boring jobs for them have been those individuals who are relatively rich and powerful. #AI as a foundational technology really does have the longterm… openedtech.social/@martin/113157430887924330
Replies 0 · Reposts 0 · Likes 2 · Views 253
Ccm@CcmDisanddat·
@adalluch It’s Trump or I’m moving, temporary Visa. South Korea is amazing! Felt very safe and welcomed. One of my favorite places! Have what I now consider family there
Replies 1 · Reposts 0 · Likes 8 · Views 13.4K
🎓 Martin Dougiamas@moodler·
Who else is thinking about a strong #AI for the #Mastodon #Fediverse? Something that is #learning primarily from all our opinions and discussions and shared data/observations from people and sensors, sorting through it using widely-acceptable and open… openedtech.social/@martin/113153704985381254
English
0
0
2
282