Nadeem

117 posts

@8W7O7

Busy bee. At least 50% notes to self. Made some things at Meta/Uber/Wayfair

London · Joined March 2014

269 Following · 21 Followers

Nadeem@8W7O7·2h

@emilkowalski @joshpuckett You should definitely set up something like @threejsJourney's monthly jams for enrollers. I'd love to see what people create with the course (+ gets you more visibility)

Emil Kowalski@emilkowalski·1d

Each enrollment of animations.dev comes with free updates for existing students. This time it’s a new lesson on training your judgement with more than 20 exercises, and a guest lesson by @joshpuckett called “Animations as Proof of Care”. You still have 2 days to join!

543

54.3K

Nadeem@8W7O7·5h

@omarsar0 Maybe if we had the infinite list of tasks we'd ever ask a model to solve. I just ask a model to build a harness for the project and self-evolve over time. Use a teacher model to test it on a range of project-relevant tasks and help course correct when needed.

elvis@omarsar0·1d

i haven't seen a model that just works across agent harnesses. seems like it should exist. great opportunity for open-weight models. any thoughts?

9.6K

Nadeem@8W7O7·1d

@steveruizok for the greater good

Steve Ruiz@steveruizok·1d

I think it's time we provide researchers with training data. For now all of the data is going to be mine personally.

1.1K

Steve Ruiz@steveruizok·1d

the beginning of something new: huggingface.co/datasets/steve…

5.5K

Nadeem@8W7O7·3d

@ben_burtenshaw @_lewtun @willccbb Hey would love to learn how HF's own products/teams benefitted from RL agents/loops internally? And what went into deciding if it's worth the time, cost, risk against other existing now mostly mature alternatives/MLOps workflows.

369

Ben Burtenshaw@ben_burtenshaw·3d

2 days until we will hold this deep dive workshop on everything RL for agents with some of the best names in the game. Objective: understand what it takes to train agents in the open. Speakers include: - @_lewtun giving a crash course on RL beyond language - @willccbb defining the core bottlenecks in open source rl tools. - @OfirPress is going to ground the whole thing with insights from SWE-Bench-* - @a1zhang is going to come at it with recursive language models, and show us what's possible at the harness level. Drop your hard questions in this thread, and I'll raise them in the session. or come along and just heckle.

251

37.8K

Nadeem reposted

Natasha Jaques@natashajaques·20 Mar

The paper I’ve been most obsessed with lately is finally out: nbcnews.com/tech/tech-news…! Check out this beautiful plot: it shows how much LLMs distort human writing when making edits, compared to how humans would revise the same content. We take a dataset of human-written essays from 2021, before the release of ChatGPT. We compare how people revise draft v1 -> v2 given expert feedback, with how an LLM revises the same v1 given the same feedback. This enables a counterfactual comparison: how much does the LLM alter the essay compared to what the human was originally intending to write? We find LLMs consistently induce massive distortions, even changing the actual meaning and conclusions argued for.

391

1.5K

252.8K

Nadeem@8W7O7·4d

@handsdiff @NousResearch @ssh_exe_dev The VM solution sounds great, did you discover any other cheaper alternatives (that you could manage yourself on a single VPS?). How are you also handling secret management? I'm using infisical but it's still quite a hacky wrapper around hermes.

hands@handsdiff·4d

I am putting @NousResearch hermes agents on @ssh_exe_dev containers, giving them good secret management, making them public, and letting it rip.

127

Nadeem@8W7O7·5d

@p44v9n Nice! Could you add a rss feed?

100

Paavan ❖@p44v9n·5d

lots of events for designers in London this weekend

154

8.7K

Nadeem@8W7O7·5d

@claudeai @figma and something that we need to see more often x.com/_catwu/status/…

cat@_catwu

3/ Revisit features with new models Every model release, go back through your list of features that were too hard for the previous model and test the ideas again. Also, remove the extra scaffolding that is no longer needed.

Nadeem@8W7O7·5d

@claudeai @figma a good adjacent read alignment.anthropic.com/2026/psm/

Nadeem@8W7O7·5d

Enjoying @claudeai Design. When you can own the model layer, seems like you get more leverage over the traits of the persona, enough so that the system feels closer to having consistent expert design judgment built-in vs something like @figma's harness that insists on a persona.

Nadeem@8W7O7·6d

@IanArawjo This is great, I've been struggling to gather confidence in benchmarks w.r.t. real-world use and needed to get back to basics.

Ian Arawjo@IanArawjo·6d

Thanks to a colleague's suggestion, promptstats is now renamed to "evalstats." This renaming better conveys the breadth of the package—aiming to be the go-to resource for everything statistics and AI evaluation, whether that's comparing models or prompts. github.com/ianarawjo/eval…

600

Nadeem@8W7O7·15 Apr

@mr_r0b0t @NousResearch @TencentAI_News please support this as well @Hetzner_Online

mr-r0b0t@mr_r0b0t·14 Apr

Big news for @NousResearch users wanting to run Hermes in a hosted environment! Very cool @TencentAI_News!

Tencent AI@TencentAI_News

For anyone running @NousResearch Hermes Agent locally and wishing it just stayed online: there's now a one-click deployment template on Tencent Lighthouse. Cloud-hosted, sandboxed from your local env, online around the clock, reach it through WhatsApp, Telegram, WeCom, QQ, or other messaging channels. Also: the QQBot plugin is now merged into Hermes Agent's official repo. Just pick QQBot under Messaging Platforms config and you're set.

4.5K

Nadeem reposted

Ben Lang@benln·25 Mar

Best career hack is to make sure you’re the person in the room who's always having fun.

101

275

3.8K

115.5K

Nadeem@8W7O7·14 Apr

@zqwq333 @IanOsband @_rockt @wgaluba @DGneusheva @modic123 @george__wing @aiengine_hack Really looking forward to this. Signed up but when are you guys approving folks?

Zoe Qin@zqwq333·14 Apr

For an in depth discussion on RL: IRL with @IanOsband @_rockt @wgaluba Armin Cc: @DGneusheva @modic123 @george__wing luma.com/4acss683

534

Nadeem@8W7O7·14 Apr

@JakeHulberg @NousResearch @infisical This isn't particularly great but does the job I need, as I already have to fine-tune and rotate my secrets outside infisical anyway. Would love any advice on how to improve this (haven't read the docs yet). I'm no cybersec expert.

Nadeem@8W7O7·14 Apr

@JakeHulberg @NousResearch @infisical It's hacky rn: a startup script authenticates with machine credentials, then wraps the hermes process, and secrets get injected as env vars. Systemd keeps it alive and restarts it to pick up new secrets. Looking into a subprocess/subagent wrapper approach atm.
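
[Editor's sketch] The wrapper pattern described in this post might look roughly like the following in Python. This is a minimal sketch under stated assumptions, not the author's actual script: `fetch_secrets` is a hypothetical placeholder that returns a dummy value instead of calling any real secret manager, and `build_env` and `run_wrapped` are illustrative names. It only shows the "inject secrets as env vars into a wrapped child process" step.

```python
import os
import subprocess
from typing import Mapping, Sequence


def fetch_secrets() -> dict[str, str]:
    """Hypothetical placeholder: replace with a real call to your secret manager."""
    return {"EXAMPLE_API_KEY": "dummy-value"}


def build_env(base: Mapping[str, str], secrets: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of `base` with `secrets` layered on top (secrets win on conflict)."""
    env = dict(base)
    env.update(secrets)
    return env


def run_wrapped(command: Sequence[str], secrets: Mapping[str, str]) -> int:
    """Run `command` as a child process whose environment includes the secrets.

    The parent's own os.environ is left unchanged; only the child sees the values.
    """
    completed = subprocess.run(list(command), env=build_env(os.environ, secrets))
    return completed.returncode
```

A supervisor such as systemd would then restart the wrapper, so each restart re-runs `fetch_secrets` and the child picks up rotated values. The command passed to `run_wrapped` (for example, whatever launches the agent) is left to the caller and is not specified in the post.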

Nadeem@8W7O7·14 Apr

@NousResearch's hermes-agent has been a good help with streamlining my research tasks this past weekend. Would have saved me a lot of setup time if they had some sort of out-of-the-box VPS config and secret management provider support.

Nadeem reposted

blue@bluewmist·12 Apr

the best thing you can do for yourself is actively increase your surface area for luck to hit you. go outside, try new cafes, museums, events, take a new route home, speak to people, ask questions, side quest. the more you do, the more serendipity and synchronicity will find you.

140

6.1K

48.7K

587.3K

Nadeem@8W7O7·10 Apr

@mervenoyann @GoogleDeepMind one of my fav demos. This would be so good at parties.

204