Mustafa Akben, PhD
@DoktorMoose

1K posts

Director of AI - Professor, @ElonUniversity (NC). Interests #AppliedAI #AI #Creativity #ScienceOfScience #Management #OrganizationalBehavior

NC · Joined November 2009
1.5K Following · 164 Followers

Mustafa Akben, PhD @DoktorMoose ·
ChatGPT Images 2.0 is something unusual! It can render 4K images with high accuracy and write its own name on a grain of rice! Very exciting, and I have already planned great use cases for it. Amazing work @ChatGPTapp @OpenAI @sama

Mustafa Akben, PhD @DoktorMoose ·
Wow! For applied machine learning cases, this seems like a really strong solution. However, I am not sure whether it can discover new architectures. It feels like it combines existing ones to meet certain performance metrics. Regardless, great work!
Aksel @akseljoonas ·

Introducing ml-intern, the agent that just automated the post-training team @huggingface. It's an open-source implementation of the real research loop that our ML researchers run every day. You give it a prompt; it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates, and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.

It can pull off crazy things. We made it train the best model for scientific reasoning: it went through citations from the official benchmark paper, found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score from 10% to 32% on GPQA in under 10h. Claude Code's best: 22.99%.

In healthcare settings it inspected the available datasets, concluded they were too low quality, and wrote a script to generate 1,100 synthetic data points from scratch for emergencies, hedging, multilingual coverage, etc., then upsampled 50x for training. It beat Codex on HealthBench by 60%.

For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.

How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, and pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets, and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, and retrains

ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like. Releasing it today as a CLI and a web app you can use from your phone/desktop.

CLI: github.com/huggingface/ml… Web + mobile: huggingface.co/spaces/smolage… And the best part? We also provisioned $1k in GPU resources and Anthropic credits for the quickest among you to use.
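The workflow the tweet above describes (search papers, walk citations, prepare data, train, evaluate, repeat) is, at its core, a best-so-far iteration loop. Here is a minimal sketch of that loop's shape; this is not the ml-intern implementation, and every name and callback in it is a hypothetical stand-in for illustration only:

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    score: float
    notes: str = ""

def research_loop(prompt, search, prepare, train, evaluate, target, max_iters=5):
    """Iterate search -> prepare -> train -> evaluate, keeping the best run."""
    best = RunResult(score=0.0)
    ideas = search(prompt)                # e.g. papers plus their citation graph
    for _ in range(max_iters):
        dataset = prepare(ideas, best)    # filter/reformat before spending GPU hours
        result = evaluate(train(dataset))
        if result.score > best.score:
            best = result
        if best.score >= target:          # good enough: stop iterating
            break
    return best

# Toy usage: each stubbed "training run" scores a bit higher than the last.
scores = iter([0.10, 0.18, 0.25, 0.32])
best = research_loop(
    prompt="scientific reasoning",
    search=lambda p: ["OpenScience", "NemoTron-CrossThink"],
    prepare=lambda ideas, prev_best: ideas,
    train=lambda data: RunResult(score=next(scores)),
    evaluate=lambda r: r,
    target=0.30,
)
print(best.score)  # 0.32
```

The stubs make the control flow visible: the agent keeps retraining until an evaluation target is met or the iteration budget runs out, which matches the "12 SFT runs" and "ran ablations until it succeeded" behavior described in the tweet.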


Mustafa Akben, PhD @DoktorMoose ·
platform.claude.com/docs/en/build-…#migrating-away-from-prefilled-responses

Mustafa Akben, PhD @DoktorMoose ·
Mythos Preview might be released publicly sooner than I expected. I was reading the API docs and have started to see references to it, like this.
[image attached]

Tibo @thsottiaux ·
Hello builders. What are we getting wrong with Codex, what can we improve?

Mike Knoop @mikeknoop ·
It's extremely clear what caused the qualitative leap from GPT 4 to o1 (test-time adaptation via chain-of-thought reasoning). It's not clear what caused the agentic leap from Gemini 2.5/GPT 5.1/Opus 4.1 to Gemini 3/GPT 5.2/Opus 4.5. Even crazier, all three released ~3 weeks apart.

Mustafa Akben, PhD @DoktorMoose ·
Computo, ergo sum! (I compute, therefore I am)

Mustafa Akben, PhD reposted
Claude @claudeai ·
The Claude Code hackathon is back for Opus 4.7. Join builders from around the world for a week with the Claude Code team in the room, with a prize pool of $100K in API credits. Apply by Sunday: cerebralvalley.ai/e/built-with-4…

Randy Olson @randal_olson ·
Even Opus 4.7 is failing this one. Failure modes like this are interesting because they demonstrate the jagged frontier of AI. This same model can write a compiler from scratch, and yet it gets tripped up on small things like this.
[image attached]