
Sam Wasserman🦞
1.8K posts

@SamJWasserman
Emmy-Winning Filmmaker turned Creative Technologist, Founder & AI Systems Architect.


hermesbench v0.1 — a benchmark purpose-built for Hermes Agent tool-calling

Runs real Hermes Agent subprocess in an isolated tmux session with full tool access. 48 tasks across 11 families: terminal, file read, patch edit, search, write, process mgmt, todo planning, execute code, web lookup, memory facts, error recovery.

What sets it apart from BFCL, τ-bench, Terminal-Bench, and other agent evals:

• Real harness — not a synthetic API or simplified sandbox. The model runs inside the actual Hermes Agent with its full 35-tool surface, same prompt format, same constraints.
• Deterministic verifiers only — every task evaluates pass/fail via stdlib Python checks on the conversation trace and filesystem state. No LLM-as-judge. No flaky heuristics.
• Full traces with token IDs — every system/user/assistant/tool message captured and exportable as loss-masked SFT training data.
• Hardware telemetry — 5 Hz logging of GPU power, temperature, joules-per-token, and thermal throttle seconds per task.
• Replayable terminal recordings — every task produces a .cast file you can replay or render to GIF/MP4.

First results: nex-agi/nex-n2-pro:free → 32/48 (66.7%). Flawless on terminal smoke (5/5) and file read (6/6). Struggled on patch edit (1/5), write (2/5), and memory (1/3) — the model frequently fell back to wrong tools or skipped required calls.

github.com/am423/hermesbe…
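To make "deterministic verifiers" concrete, here is a rough sketch of what a stdlib-only pass/fail check over a conversation trace and filesystem state could look like. This is not hermesbench's actual code: the trace shape (a list of dicts with `role` and `tool_calls` keys), the function names `tool_calls` and `verify_patch_task`, and the tool name `patch` used in the usage note are all assumptions made for illustration.

```python
from pathlib import Path


def tool_calls(trace):
    """Return the names of tools invoked by assistant messages, in order.

    Assumes (hypothetically) each message is a dict with a "role" key and an
    optional "tool_calls" list of {"name": ...} dicts.
    """
    names = []
    for msg in trace:
        if msg.get("role") == "assistant":
            for call in msg.get("tool_calls", []):
                names.append(call["name"])
    return names


def verify_patch_task(trace, workdir, required_tool, target, expected_text):
    """Pass only if the required tool was called AND the file state matches.

    Both checks are plain comparisons, so the same trace and the same
    workspace always produce the same verdict.
    """
    if required_tool not in tool_calls(trace):
        return False  # the model skipped the required call or used another tool
    path = Path(workdir) / target
    if not path.is_file():
        return False
    return expected_text in path.read_text(encoding="utf-8")
```

A task would then pass with something like `verify_patch_task(trace, workdir, "patch", "app.py", "DEBUG = False")`, and fail if the model edited the file through some other tool, which matches the "fell back to wrong tools or skipped required calls" failure mode described in the results.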






YC is absolutely YC for 35+ founders. We fund many founders in that bracket every batch and will continue to. If you want to build a huge fast-growing company alongside a peer group of the most ambitious people on the planet, you should apply to YC regardless of age. If your team seems awesome and serious we will want to talk to you.

Midjourney will be announcing its first hardware project tomorrow (Wednesday 6/17) at 6pm PT. Stay tuned for a livestream of our in-person launch event in San Francisco. If you're in town and want an invite, reply below, we have just a few slots left.



Let me show you how you can win $2.5M to fund your dream film. I originally made this trailer for the XPRIZE competition. Making this actually led to my film 'Nexus' being funded! You can steal my entire playbook to get your optimistic sci-fi film greenlit via XPRIZE 👇🧵



Guillermo del Toro says AI is a form of "natural stupidity" “We are on the verge of image illiteracy. We are on the verge of cinema illiteracy... The pact between man and image is sacred, but we are in a time when that is in danger... We are told images can be generated by artificial means. The existence of an image is not just to be there. It is to connect us, to make us feel beauty,” he said. wp.me/pc8uak-1lHpzQ




@crystalwizard only excuse I could justify for this would be if using the live translation functionality both ways, other than that let's speak to one another since we are in front of one another. I agree.



When your friends are wearing AirPods don't you tell them to take them off already? I do. Presence demands such. Now if we are making a show then I'll put mine on too. But a lot of times I'm wearing them while puttering around, or walking around San Francisco. There a camera is helpful.


Hollywood is at risk of becoming Detroit, advocates warn, unless the U.S. responds to the 81 countries embracing filmmaking as an economic tool. “I watched the demise of steel and rubber and automotive manufacturing as I grew up,” says IATSE vice president Mike Miller, who was raised in Cleveland. “This is identical in many ways. We have an undeclared trade war that our government is standing by and watching happen.” Read the full cover story on the mass exodus of LA productions by @GeneMaddaus: wp.me/pc8uak-1lHp4O




