Hamel Husain

16.5K posts

@HamelHusain

Bringing data science back to AI - https://t.co/Zrmp6LRd9c About Me: https://t.co/P6WyeKkyTa

Looking at the data · Joined September 2012
2.5K Following · 46.5K Followers
jason liu @jxnlco
New Forms of Automation
7
2
36
4.2K
Hamel Husain @HamelHusain
@keeran Good question. It feels like the consumer will be agents.
0
0
0
25
keeran @keeran
@HamelHusain Which one spends? Much of what we suffer atm is due to a really long tail of uninformed and uninterested spenders.
1
0
0
34
Hamel Husain @HamelHusain
My theory of people gaining traction with slop posts is that you are trading a smart audience for a dumber audience.
12
1
88
6.7K
Hamel Husain retweeted
Isaac Flath @isaac_flath
I tried the new GPT Images model on 9 thumbnail concepts, all ones I came up with. For each concept I made 4 images: 2 with OpenAI, 2 with Gemini.

I liked Gemini more on 6 concepts and the new ChatGPT Images model on 3. I think Gemini is better at diagram and concept style. The OpenAI model is better at more realistic-looking things (like people or products). The new OpenAI model does pull toward hype, even more than Gemini, but it’s manageable with prompts.

I tend to prefer diagram or concept style over realistic style, so I think that makes me lean Gemini. But given a concept, I think I could easily predict which model would be better at it and pick the right one.

There are a few caveats. If you want faces incorporated automatically, the OpenAI model will almost always win because it’s so much better at faces. I still prefer steering to a particular idea/concept and then overlaying a real image as needed.

But I predict most people will like the new OpenAI model more. I think my taste in this is not the norm. OpenAI does a better job making social cards that look like “good” social cards.
0
1
5
1.7K
Hamel Husain retweeted
Kun Chen @kunchenguid
lol remember this org chart meme? I just created a full simulation for all of them with agents, and the results blew my mind! The simulation asked each organization to build and ship a web spreadsheet. Want to take a guess who built the best product? Reveal in thread below!
29
59
753
180.4K
Hamel Husain @HamelHusain
@xeophon That's because you have no slop posts. It's all nice human organic.
0
0
3
313
Florian Brand @xeophon
@HamelHusain You need a larger audience so the fraction of smart people is bigger in absolute terms as well. I wouldn't know, all my followers are smart.
7
0
30
1.3K
Hamel Husain @HamelHusain
@BEBischof If you are wealthy enough to invest $100k and get similar O(x) upside as an employee, then it definitely stops making sense (I’m not at this point yet, fwiw).
1
0
0
412
Bryan Bischof fka Dr. Donut
Right now I know a small but growing set of people who basically can't/won't work for others. They're top of their field in some dimension(s) but just cannot be bothered to have a boss. It's especially interesting because the compensation distribution has much higher outliers than it used to. I feel like I see more of these cases now.
2
0
10
1.7K
jason liu @jxnlco
Reminder that unlike Codex, OpenAI employees are actually people, and the thank-yous actually mean something.
75
10
1K
33.3K
Paul Iusztin @pauliusztin_
Every day, 100+ people ask me, "How can I learn AI evals?" I copy-paste these 11 links (every time):
1. AI evals & observability (series): decodingai.com/t/ai-evals-and…
2. Using LLM-as-a-judge: hamel.dev/blog/posts/llm…
3. Demystifying evals for AI agents: anthropic.com/engineering/de…
4. There are only 6 RAG Evals: jxnl.co/writing/2025/0…
5. Evaluation-driven development: decodingai.com/p/stop-launchi…
6. Binary evals vs. Likert scales: decodingai.com/p/the-5-star-l…
7. The mirage of generic AI metrics: decodingai.com/p/the-mirage-o…
8. Error analysis: youtube.com/watch?v=e2i6Jb…
9. Carrying out error analysis: youtube.com/watch?v=e2i6Jb…
10. Evaluating the effectiveness of LLM-evaluators: eugeneyan.com/writing/llm-ev…
11. LLM judges aren't the shortcut you think: youtube.com/watch?v=sEMYSS…
Binge these to skyrocket your skills.
10
86
767
81.5K
Hamel Husain retweeted
Tibo @thsottiaux
Happy Tuesday. Codex has hit 4M active users, adding over 1M users in less than two weeks. To celebrate, we will reset the rate limits again in a few hours. Enjoy!
374
195
5.4K
717.3K
Hamel Husain retweeted
Isaac Flath @isaac_flath
Oh wow, Marimo has been working on their agentic/AI capabilities. I didn't realize how much they've done here with live kernel/agent integration. Worth a watch. I'm excited to try this. youtube.com/watch?v=6uaqtc…
1
3
22
3.9K
Hamel Husain retweeted
Isaac Flath @isaac_flath
I’m open sourcing agentkb, the system I use to let the agent see things I’ve learned or that it’s done before, so that it can do the same kinds of things quicker. Agentkb currently stores agent chats, X posts, wiki, and skills. isaacflath.com/writing/agentkb
1
5
12
2.1K
Hamel Husain retweeted
Tibo @thsottiaux
We are releasing a *research preview* of Chronicle in Codex. It allows Codex to build up memories based on your day-to-day work on your computer and then refer to these memories to be a lot more helpful. Available for Pro subscriptions and on Mac to start. This is early and consumes quite a bit of tokens, but it has changed how I and many folks at OpenAI use Codex.
OpenAI Developers @OpenAIDevs

Last week, we released a preview of memories in Codex. Today, we’re expanding the experiment with Chronicle, which improves memories using recent screen context. Now, Codex can help with what you’ve been working on without you restating context.

235
151
2.6K
916.1K
Mario Zechner @badlogicgames
hi, my name is mario, i'm building coding agents and other LLM bullshittery, and i think the parent is directionally right. kids + AI is a spectrum. handing them a sycophantic LLM unsupervised and uninstructed is on the "terrible fucking idea" end of that spectrum.
Justine Moore @venturetwins

I am so sad for this kid

58
18
480
38.3K
Hamel Husain @HamelHusain
@lukcombinator It really is not the same at all. Cowork doesn't have as deep an OS integration. I suggest trying it (it might not be available in your geo) before jumping to conclusions.
1
0
1
568
Lukas @lukcombinator
@HamelHusain I do that with Claude Cowork, but happy for you working with Codex.
1
0
0
610
Hamel Husain @HamelHusain
Lots of people are asking what’s so good about the new Codex desktop computer use. Here are 5 things that come to mind:
1. Operate Mac apps that lack a great API (Slack, Google Sheets, Notes, iMessage) without installing separate plugins. It instantly transforms all your apps into tools.
2. If you need to operate your browser more visually, it works really smoothly and fast (good for sites that are still human-centric).
3. It uses its own cursor, keyboard, etc., so you can keep working.
4. Once you have done any task once, you can simply ask Codex to reflect on what it did and how it would accomplish the task next time with the benefit of hindsight, then create a skill AND schedule an automation. It’s really nice that Codex can just schedule and edit automations when asked! It’s very Claw-like in this way. This last point is not computer-use specific but is powerful when combined with computer use.
5. The UI polish is insane: you get nice icons for any application you want to tag into computer use, plus all the other new built-in stuff like the built-in file viewer and browser, so there is no context switching. You can iterate really fast and not lose focus. Because of the polish, it also feels nice and delightful to use.
Hamel Husain @HamelHusain

Seriously, stop everything you are doing and use the Codex desktop app’s new computer use. Absolutely mind-blowing.

30
66
927
164.5K
Hamel Husain retweeted
Chris Tate @ctatedev
Terminal automation + e2e testing solved

Now as simple as snapshot, click, type:
– wterm renders terminal-in-html, every cell in the a11y tree
– agent-browser automates pages via the a11y tree

Here's opencode in one browser driving Claude Code in another
107
213
3.4K
957.1K
Bryan Bischof fka Dr. Donut @BEBischof
If I asked you to Like RT this tweet, you would:
Strongly Agree: 🔘
Agree: 🔘
Neutral: 🔘
Disagree: 🔘
Strongly Disagree: 🔘
3
2
3
0