Hamel Husain

16.5K posts

@HamelHusain

Bringing data science back to AI - https://t.co/Zrmp6LRd9c About Me: https://t.co/P6WyeKkyTa

Looking at the data · Joined September 2012

2.5K Following · 46.5K Followers

Pinned Tweet

Hamel Husain@HamelHusain·26 Mar

x.com/i/article/2037…

227

75K

Hamel Husain@HamelHusain·9h

@jxnlco Is this a codex automation or a Claw

453

jason liu@jxnlco·9h

New Forms of Automation

4.2K

Hamel Husain@HamelHusain·12h

@keeran Good question. It feels like the consumer will be agents

keeran@keeran·12h

@HamelHusain which one spends? much of what we suffer atm is due to a really long tail of uninformed and uninterested spenders.

Hamel Husain@HamelHusain·1d

My theory of people gaining traction with slop posts is you are trading a smart audience for a dumber audience

6.7K

Hamel Husain retweeted

Isaac Flath@isaac_flath·14h

I tried the new GPT Images model. I tried it on 9 thumbnail concepts. All ones I came up with. For each concept I made 4 images. 2 with openai, 2 with gemini. I liked Gemini more on 6 concepts. I liked the new ChatGPT Images model on 3. I think Gemini is better at diagram and concept style. The openai model is better at more realistic looking things (like people, or product things). New openai model does pull toward hype, even more than gemini. But it’s manageable with prompts. I tend to prefer diagram or concept style over realistic style, so I think that makes me lean Gemini. But I think given a concept, I could easily predict which model would be better at it and pick the right one. There are a few caveats. If you want faces incorporated automatically, the openai model will win almost always because it’s so much better at faces. I still prefer steering to a particular idea/concept, then overlaying a real image as needed, though. But I predict most people will like the new OpenAI model more. I think my taste in this is not the norm. Openai does a better job making social cards that look like “good” social cards.

1.7K

Hamel Husain retweeted

Kun Chen@kunchenguid·19h

lol remember this org chart meme? I just created a full simulation for all of them with agents, and the results blew my mind! the simulation asked each organization to build and ship a web spreadsheet want to take a guess who built the best product? reveal in thread below!

753

180.4K

Hamel Husain retweeted

Lisan al Gaib@scaling01·23h

OpenAI just released a new open-source model it's "a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text" github.com/openai/privacy… huggingface.co/openai/privacy…

185

2.2K

718.5K

Hamel Husain@HamelHusain·1d

@xeophon That's because you have no slop posts. It's all nice human organic.

313

Florian Brand@xeophon·1d

@HamelHusain you need a larger audience so the fraction of smart people is bigger in absolute terms as well. i wouldn't know it, all my followers are smart

1.3K

Hamel Husain@HamelHusain·1d

@BEBischof If you are wealthy enough to invest $100k and get similar O(x) upside as employee then it definitely stops making sense (I’m not at this point yet fwiw)

412

Hamel Husain@HamelHusain·1d

@BEBischof Connect me with these people 😄

1.1K

Bryan Bischof fka Dr. Donut@BEBischof·1d

Right now i know a small but growing set of people who basically cant/wont work for others. They're top of their field in some dimension(s) but just cannot be bothered to have a boss. It's especially interesting because the compensation distribution has such higher outliers than it used to be i feel like i see more of these cases now

1.7K

Hamel Husain@HamelHusain·1d

@jxnlco Thanks man

566

jason liu@jxnlco·1d

reminder that unlike codex, openai employees are actually people and the thank yous actually mean something

33.3K

Hamel Husain@HamelHusain·1d

@hammad_khan23 @pauliusztin_ It's actually on there multiple times! One is a direct link and another is a guest post

Muhammad Hammad Khan@hammad_khan23·1d

@pauliusztin_ @HamelHusain Man*

Paul Iusztin@pauliusztin_·2d

Every day, 100+ people ask me, "How can I learn AI evals?" I copy-paste these 11 links (every time):
1. AI evals & observability (series): decodingai.com/t/ai-evals-and…
2. Using LLM-as-a-judge: hamel.dev/blog/posts/llm…
3. Demystifying evals for AI agents: anthropic.com/engineering/de…
4. There are only 6 RAG Evals: jxnl.co/writing/2025/0…
5. Evaluation-driven development: decodingai.com/p/stop-launchi…
6. Binary evals vs. Likert scales: decodingai.com/p/the-5-star-l…
7. The mirage of generic AI metrics: decodingai.com/p/the-mirage-o…
8. Error analysis: youtube.com/watch?v=e2i6Jb…
9. Carrying out error analysis: youtube.com/watch?v=e2i6Jb…
10. Evaluating the effectiveness of LLM-evaluators: eugeneyan.com/writing/llm-ev…
11. LLM judges aren't the shortcut you think: youtube.com/watch?v=sEMYSS…
Binge these to skyrocket your skills.

767

81.5K

Hamel Husain retweeted

Tibo@thsottiaux·2d

Happy Tuesday. Codex has hit 4M active users, adding over 1M users in less than two weeks. To celebrate we will reset the rate limits again in a few hours. Enjoy!

374

195

5.4K

717.3K

Hamel Husain retweeted

Isaac Flath@isaac_flath·2d

Oh wow, Marimo has been working on their Agentic/AI capabilities. I didn't realize how much they've done here with live kernel/agent integration. Worth a watch. I'm excited to try this. youtube.com/watch?v=6uaqtc…

3.9K

Hamel Husain retweeted

Isaac Flath@isaac_flath·3d

I’m open sourcing agentkb, the system I use to let the agent see things I’ve learned or that it’s done before, so that it can do the same kinds of things quicker. Agentkb currently stores agent chats, X posts, wiki, and skills. isaacflath.com/writing/agentkb

2.1K

Hamel Husain retweeted

Tibo@thsottiaux·2d

We are releasing a *research preview* of Chronicle in Codex. It allows codex to build up memories based on your day to day work on your computer and then refer to these memories to be a lot more helpful. Available for PRO subscriptions and on Mac to start. This is early and consumes quite a bit of tokens, but it has changed how I and many folks at OpenAI use Codex.

OpenAI Developers@OpenAIDevs

Last week, we released a preview of memories in Codex. Today, we’re expanding the experiment with Chronicle, which improves memories using recent screen context. Now, Codex can help with what you’ve been working on without you restating context.

235

151

2.6K

916.1K

Hamel Husain@HamelHusain·2d

@badlogicgames Are you a parent out of curiosity

Mario Zechner@badlogicgames·3d

hi, my name is mario, i'm building coding agents and other LLM bullshittery, and i think the parent is directionally right. kids + AI is a spectrum. handing them a sycophantic LLM unsupervised and uninstructed is on the "terrible fucking idea" end of that spectrum.

Justine Moore@venturetwins

I am so sad for this kid

480

38.3K

Hamel Husain@HamelHusain·3d

@lukcombinator It really is not the same at all. Cowork doesn't have as deep of an OS integration. I suggest trying it (it might not be available in your geo) before jumping to conclusions

568

Lukas@lukcombinator·3d

@HamelHusain I do that with claude cowork - but happy for you working with Codex

610

Hamel Husain@HamelHusain·5d

Lots of people asking what’s so good about the new codex desktop computer use. Here’s 5 things that come to mind:
1. Operate Mac apps without a great API: Slack, Google Sheets, Notes, iMessage, without installing separate plugins. It instantly transforms all your apps into tools.
2. If you need to operate your browser more visually, it works really smoothly and fast (good for sites that are still human centric).
3. It uses its own cursor, keyboard, etc., so you can keep working.
4. Once you do any task once, you can simply ask Codex to reflect on what it did and how it would accomplish the task next time with the benefit of hindsight, and create a skill AND schedule an automation. It’s really nice that codex can just schedule and edit automations when asked! It’s very Claw-like in this way. This last point is not computer use specific but is powerful when combined with computer use.
5. The UI polish is insane: you get nice icons for any application you want to tag into computer use, plus all the other new built-in stuff like the built-in file viewer and browser, so there is no context switching. So you can iterate really fast and not lose focus. Because of the polish it also feels nice and delightful to use.

Hamel Husain@HamelHusain

Seriously stop everything you are doing and use codex desktop app new computer use. Absolutely mind blowing

927

164.5K

Hamel Husain@HamelHusain·4d

I have very few notifications turned on, but this guy's tweets are one of them. It's a constant stream of the most useful tools

Chris Tate@ctatedev

Terminal automation + e2e testing solved
Now as simple as snapshot, click, type:
– wterm renders terminal-in-html, every cell in the a11y tree
– agent-browser automates pages via the a11y tree
Here's opencode in one browser driving Claude Code in another

649.9K

Hamel Husain retweeted