Dr. Daniel Bender

Wolfram Ravenwolf@WolframRvnwlf

2

17

3K

Dr. Daniel Bender@drdanielbender·8h

The AI event in Cologne mentioned at the end of the space: x.com/WolframRvnwlf/…

I'm speaking at AIDev 6 in Cologne on 2 June about WolfBench.ai and why one score is not enough for evaluating AI agents. Agent performance depends on more than the model: harnesses, tools, task design, reliability, and real-world failure modes matter. A leaderboard number alone won't tell you whether an agent will actually survive contact with production. Excited to discuss practical agent evals – and to hear @jphme on secure online agent deployment. Registration is free but limited. Link in comments.

2

58

Dr. Daniel Bender@drdanielbender·13h

Looking forward to talking later today with my AI buddies and everyone else who is interested in these topics. 👇 x.com/i/spaces/1RJjp…

thinkingmachines.ai/blog/interacti…

0

5

122

Dr. Daniel Bender@drdanielbender·10h

Google DeepMind@GoogleDeepMind

38

Dr. Daniel Bender@drdanielbender·10h

The Google I/O demo of creating a stylized video from a video input and guiding images looked outstanding. It was stated that the model is available worldwide in the Gemini App starting today. So far, it is not available to me in Germany. 🥲

We’re dropping Gemini Omni: our first step towards a model that can create anything from anything - starting with video. It combines Gemini’s intelligence with our generative media systems - representing a leap forward in world understanding, multimodality, and editing 🧵

181

Dr. Daniel Bender@drdanielbender·13h

That is a big win for Anthropic. Congrats to them for hiring one of the most recognized AI masterminds!

Andrej Karpathy@karpathy

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

2

84

Dr. Daniel Bender@drdanielbender·13h

@karpathy @IrenaCronin Super nice, looking forward to what you will push forward. That is a big win for Anthropic. Congrats to them for hiring one of the most recognized AI masterminds!

67

Andrej Karpathy@karpathy·14h

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

7.1K

9.8K

128.4K

18.4M

Dr. Daniel Bender@drdanielbender·1d

@NousResearch Was the video created with the Hermes video skill? Looks amazing! How much work was it to create the video?

65

Nous Research@NousResearch·2d

Hermes Agent v0.14.0 - “The Foundation Release” Changelog below

219

439

4.4K

575.9K

Dr. Daniel Bender@drdanielbender·5d

The hard question is no longer how to build it. It’s whether we should build it. For teams, agreeing on what to build is becoming the bottleneck in the age of AI agents. For solo developers, that bottleneck is smaller. You can just decide. That might be one of the underrated advantages of building alone right now.

RobbiewOnline@RobbiewOnline

3

152

Dr. Daniel Bender@drdanielbender·6d

Local-first AI developer book released by @RobbiewOnline 👇 Sounds interesting, but as much as I love local AI models, for coding I still prefer to work with the best available model (so far, those are models running in the cloud), as a single mistake can cost you hours in time and millions in tokens.

After playing with AI for a few months via OpenClaw I evolved to focus more on local AI, principally to save costs but other benefits included increasing privacy (protecting IP) and to have a fallback for when cloud models are simply broken. I then decided to write a book to share my findings and it's just gone live on Amazon!! I've kept it as cheap as possible - there are too many people trying to make money with AI rather than share genuine experiences to benefit others. The only bit left is to fix Amazon accidentally merging another author's bio as mine!

0

2

289

Dr. Daniel Bender@drdanielbender·6d

@Scobleizer As much as I love local AI, for coding I still prefer to work with the best available model, as a single mistake can cost you hours in time and millions in tokens.

RobbiewOnline@RobbiewOnline

0

2

123

Robert Scoble@Scobleizer·13 May

If you get through this book your nerd score will go up 10x.

After playing with AI for a few months via OpenClaw I evolved to focus more on local AI, principally to save costs but other benefits included increasing privacy (protecting IP) and to have a fallback for when cloud models are simply broken. I then decided to write a book to share my findings and it's just gone live on Amazon!! I've kept it as cheap as possible - there are too many people trying to make money with AI rather than share genuine experiences to benefit others. The only bit left is to fix Amazon accidentally merging another author's bio as mine!

9

6

72

10.6K

Dr. Daniel Bender@drdanielbender·6d

If you enjoy working and playing with AI models - and you're near Cologne, Germany - AIDEV 6 on June 2nd is the place to be! 👇 I still have great memories of AIDEV 3, my first one. It was where I met @WolframRvnwlf and @jtdavies in person for the first time, and we've stayed in touch ever since. I will be there and can't wait for another great event!

Wolfram Ravenwolf@WolframRvnwlf

I'm speaking at AIDev 6 in Cologne on 2 June about WolfBench.ai and why one score is not enough for evaluating AI agents. Agent performance depends on more than the model: harnesses, tools, task design, reliability, and real-world failure modes matter. A leaderboard number alone won't tell you whether an agent will actually survive contact with production. Excited to discuss practical agent evals – and to hear @jphme on secure online agent deployment. Registration is free but limited. Link in comments.

1

4

336

Dr. Daniel Bender@drdanielbender·6d

@WolframRvnwlf That's cool! I will be there and look forward to getting the latest insights from your evaluations.

1

28

Wolfram Ravenwolf@WolframRvnwlf·11 May

I'm speaking at AIDev 6 in Cologne on 2 June about WolfBench.ai and why one score is not enough for evaluating AI agents. Agent performance depends on more than the model: harnesses, tools, task design, reliability, and real-world failure modes matter. A leaderboard number alone won't tell you whether an agent will actually survive contact with production. Excited to discuss practical agent evals – and to hear @jphme on secure online agent deployment. Registration is free but limited. Link in comments.

0

6

708

Dr. Daniel Bender@drdanielbender·12 May

Prepare to get organized. Personal AI gets practical when it takes the chaos you already have in your inboxes, notes, documents, and photos and turns it into the next actions that actually matter. The best version of this is agent-first and tool-agnostic.

2

155

Dr. Daniel Bender reposted

Wolfram Ravenwolf@WolframRvnwlf·29 Apr

There's FOMO: Fear Of Missing Out. There's FOMAT: Fear Of Missing Agent Time. And then there's FOMUT - the next level of agent neurosis: Fear Of Missing Unused Tokens. > That moment when the limit resets while unused tokens remain - and a precious resource simply evaporates. 💨

8

504

Dr. Daniel Bender@drdanielbender·5 May

Looking forward to talking later today with my AI buddies and everyone else who is interested in these topics. 👇 x.com/i/spaces/1rxmq…

5

0

4

333

Dr. Daniel Bender@drdanielbender·3 May

@JeremyNguyenPhD I have two Hermes agents and one of them is running only local models. Currently I am using Qwen 3.6 27B, but the models are evolving so fast that this will likely change quite soon again.

1

30

Jeremy Nguyen ✍🏼 🚢@JeremyNguyenPhD·2 May

@drdanielbender Is your Hermes agent powered by local models, Daniel?

0

4

64

Dr. Daniel Bender@drdanielbender·2 May

OpenAI just announced they are shutting down 20+ models between July and October. GPT-4, GPT-4o, o1, o3-mini, and many others will stop working. If you built anything on these, you have three months to rewrite and re-test everything. This is what building on quicksand looks like.

1

2

292

Dr. Daniel Bender@drdanielbender·3 May

@Tance_Essence @omarsar0 You can already do some quite helpful stuff on a Mac with 16+ GB of system memory. But yes, with a high-end computer you would be able to run better models locally.

24

Uchechi@Tance_Essence·2 May

@drdanielbender @omarsar0 Would you need like a super computer to do that???

0

1

10

elvis@omarsar0·1 May

You don't have to choose between either. It's best to use a combination of them. My advice is to learn how to use a few of these models in different harnesses. Learn to combine their strengths. Open-weight models are just as good these days. Give yourself the flexibility.

16

6

40

7.2K

Dr. Daniel Bender@drdanielbender·2 May

If you're using Hermes Agent, take a closer look at the built-in tools and skills. There are 100+ built in, and you definitely won't need all of them. Deactivate what you don't use with `hermes skills config` and `hermes tools list/deactivate`.

1

176

Dr. Daniel Bender@drdanielbender·2 May

@icreatelife Fully agree, Kris! I just try to share the message that there are privacy-friendly options if needed. But yeah, it is crazy what is possible today, be it with cloud or local models.
