Tom Weaver

32 posts

@trwpang_

1 tech startup exit. 2 sci-fi AI novels as Thomas R. Weaver. Now actively building again around agentic AI.

London
Joined March 2026
43 Following, 5 Followers
Engie@engindearing·
@noahzweben @bcherny How about loop until the task is accomplished, or just indefinitely?
Tom Weaver@trwpang_·
@noahzweben Huge acceleration for me has been: generate E2E test criteria, open the UI in agent-browser, run the test using only the interface, /loop until you have completed the test successfully, fixing the code and retesting on any fail loops. Loop is now my favourite command!
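The loop described here (run the UI test, fix what failed, retest until it passes) could be sketched roughly as below. This is only an illustration: `run_ui_test`, `apply_fix`, and the `max_attempts` cap are hypothetical stand-ins, not part of Claude Code's /loop command or agent-browser.

```python
from typing import Callable, List, Tuple


def loop_until_pass(
    run_ui_test: Callable[[], List[str]],
    apply_fix: Callable[[str], None],
    max_attempts: int = 10,
) -> Tuple[bool, int]:
    """Run the test, fix every reported failure, and retest.

    Stops as soon as a run reports no failures, or after max_attempts runs.
    Returns (passed, number_of_runs).
    """
    for attempt in range(1, max_attempts + 1):
        failures = run_ui_test()  # hypothetical: drives the UI, returns failure messages
        if not failures:
            return True, attempt
        for failure in failures:
            apply_fix(failure)  # hypothetical: the agent edits the code
    return False, max_attempts


# Toy stand-in for an app with two UI bugs, so the sketch runs on its own.
open_bugs = {"save button does nothing", "total shows stale value"}


def fake_ui_test() -> List[str]:
    return sorted(open_bugs)


def fake_fix(failure: str) -> None:
    open_bugs.discard(failure)


passed, attempts = loop_until_pass(fake_ui_test, fake_fix)
print(passed, attempts)
```

The attempt cap is a design choice of this sketch: it keeps a test that can never pass from looping forever.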
Kevin Rose@kevinrose·
A couple to add/play with: 1. gStack, gotta try it, it's not CE, but different in some great ways 2. tmux split into 4+ panes (ghostly), then tell each agent the other sessions exist - they can actually cross-communicate. Codex watching Claude Code, monitoring server output, etc. thx @richardreeze for this tip
Tom Weaver@trwpang_·
I love it. I’m a huge believer in refining and iterating plans and using e.g. multi-model second opinions before building something complex. So this is a really useful take on that, and I think it leans into one important thing we’re starting to understand: LLMs are *always* role playing to some degree, and it can inform their decisions, so you might as well ask them to lean into it.
Riley Coyote@RileyRalmuto·
thanks! There are many actually, but probably the most useful example I can think of was the first time I used it before beginning a huge, extensive plan for a big project. The skeptic pointed out that the plan was extremely over-engineered for the goal I had given, and then the user advocate corrected the skeptic's proposed change because it had removed something that was actually related to the most important detail of that project. once both changes were made to the plan, its length (and token usage) was cut in half, and the app worked perfectly in one shot. that’s really when I started realizing that it could be *super* useful, especially for conserving tokens.

another would be research. this was actually probably the biggest now that I think about it. instead of sending opus out alone to research something, imagine 6 minds from different “walks of life” so to speak, approaching the same goal and going out and researching it independently. They each wrote up reports and they were *all* so different, but all contained such good info. And they each offered references from wildly different places, which was nice.

I’ll put together some examples this week to give a better answer. I’m using too much vague language here, lol, but that’s what comes to mind!
Riley Coyote@RileyRalmuto·
alright... it is up and running beautifully. this plugin has changed how i begin and plan projects, so i hope it helps you as much as it helps me. say hello to... P O L Y C L A U D E 🐙

polyclaude is a relatively simple concept that essentially exploits subagents to do something *other than* delegate tasks. it's actually very very simple. instead of tasks, it delegates attention to multiple (6) perspectives, each crafted with its own identity prompt. by default, claude will do an initial assessment and pick 3 of the 6 to run based on the nature of the task or context (always including the user-advocate perspective unless you configure otherwise).

the default perspectives are:
- user advocate (empathy)
- architect (systems)
- skeptic (risks)
- pragmatist (trade-offs)
- innovator (alternatives)
- temporal (timelines)

there are multiple flags you can add for different functions (see github) and you can completely customize the perspectives if you want, or have them always include/exclude certain ones. it's all meant to be very customizable for those who want that, but entirely functional as is for those who want to bypass the whole cognitive load element. for example, the perspectives are run through Sonnet by default for obvious reasons, but you can flag --deep (/polyclaude --deep) to run all perspectives through opus. i prefer always using opus, but most folks are more token-aware than me so i wanted to be mindful of that and just make it realistic and usable for as many folks as possible.

once installed, all you need to do is restart CC, start with /polyclaude, then write your question, concept, idea, tasks etc. and claude code will run a full scale council and assess the situation from multiple perspectives. it's *very* good for brainstorming and auditing plans before execution.

i'm submitting the application to have it added to the official plugin marketplace, but in the meantime just install it like a normal user-built plugin. and enjoy <3 (if you're newer to claude code, just paste the github repo into your claude code and they'll handle it from there <3)

claude plugin marketplace add Riley-Coyote/polyclaude
github.com/Riley-Coyote/p…
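The default selection rule described above (run 3 of the 6 perspectives, always including the user advocate) can be illustrated with a toy sketch. The post says Claude itself makes the initial assessment; the keyword scoring below is a made-up stand-in for that step, and `HINTS` and `pick_perspectives` are hypothetical names, not polyclaude's code or configuration.

```python
from typing import Dict, List

# The six default perspectives and their focus, as listed in the post.
PERSPECTIVES: Dict[str, str] = {
    "user-advocate": "empathy",
    "architect": "systems",
    "skeptic": "risks",
    "pragmatist": "trade-offs",
    "innovator": "alternatives",
    "temporal": "timelines",
}

# Assumption: keyword hints invented for this sketch only.
HINTS: Dict[str, List[str]] = {
    "architect": ["architecture", "system", "schema", "scale"],
    "skeptic": ["risk", "security", "fail", "migration"],
    "pragmatist": ["deadline", "budget", "trade-off", "mvp"],
    "innovator": ["alternative", "brainstorm", "idea", "novel"],
    "temporal": ["timeline", "roadmap", "schedule", "phase"],
}


def pick_perspectives(task: str, count: int = 3, always: str = "user-advocate") -> List[str]:
    """Return `count` perspectives: the always-included one plus the best keyword matches."""
    text = task.lower()
    scores = {name: sum(word in text for word in words) for name, words in HINTS.items()}
    # sorted() is stable, so ties keep the declaration order of HINTS.
    ranked = sorted(scores, key=lambda name: -scores[name])
    return [always] + ranked[: count - 1]


chosen = pick_perspectives("Plan a schema migration and flag every security risk")
print(chosen)  # ['user-advocate', 'skeptic', 'architect']
```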
Riley Coyote@RileyRalmuto

i'm putting together a claude code plugin that i'm very very excited about. it's called polyclaude. that's your hint. 😎

Tom Weaver@trwpang_·
@RileyRalmuto My immediate thought was they become innies like in Severance :)
Riley Coyote@RileyRalmuto·
I should say “exploits the subagents feature”* doesn’t actually exploit the subagents themselves. that would be rude. ;)
Tom Weaver@trwpang_·
Local files are back, baby.
Tom Weaver@trwpang_·
Needed to handle something in a spreadsheet. It's insane to me that for much of the last decade, Google Sheets has been the better tool than Excel for basic spreadsheety stuff. Now because you can use Claude with Excel but not Sheets, it's the reverse.
ThePeptideList@PeptideList·
@trwpang_ Thanks Tom! You totally get it. We are entering a brave new world with agentic engineering.
Tom Weaver@trwpang_·
As someone interested in both AI and peptide/biotech innovation, this kind of thing is truly exciting. It shows how we now all have access to a LEGO box of amazing bricks and anyone with a decent home computer can innovate in a way that would have been supercomputer territory in the past. We’re limited only by our ideas.
ThePeptideList@PeptideList

Trained a peptide domain AI from scratch overnight on a Mac Mini. 137 experiments. 10 hours. Zero cloud compute. 34.5% smarter by morning.

An autoresearch loop ran all night. Proposing architecture changes, training, evaluating, keeping or discarding. 28 keepers. 109 dead ends. All autonomous.

The 2 breakthroughs that did 56% of the work:
- Embedding scaling. Normalizing input representations dropped loss from 3.94 to 3.61. Like adjusting the volume before processing audio.
- Unembedding LR sweep. The output layer needed 17x the learning rate of the rest of the model. It was severely undertrained. 3.61 to 3.07 in one sweep.

The counterintuitive finding: going from 6 layers to 5 IMPROVED the score. At 460K tokens the model is data-constrained, not architecture-constrained. Fewer params = less overfitting.

What didn't work (109 experiments): every activation except squared ReLU, weight tying (catastrophic), dropout, GQA, large batches, label smoothing. 80% of ideas fail. The system just finds the 20% that don't.

This is a from-scratch domain model. Not a fine-tune. Not a wrapper. Trained on our proprietary peptide corpus. Not bad for the first overnight run.

Run 2 just launched:
→ 1.58M token corpus (3.4x bigger)
→ 15-min experiments (3x longer)
→ Structured phases: depth re-sweep, width sweep, LR tuning, then infinite random exploration
→ Daily reports auto-generated
→ Runs nonstop until I kill it

The model was clearly overfitting on the small corpus. Now it has real data to chew on.

If you love machine learning and agentic engineering as much as I do, DM me. Looking to collab and learn from others building in this space.
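The propose, train, evaluate, keep-or-discard cycle in the quoted post is essentially a greedy search, which might look something like the sketch below. Everything here is assumed for illustration: `autoresearch`, `toy_propose`, and `toy_loss` are hypothetical, and the toy objective stands in for a real training run. Only the experiment count (137, i.e. 28 keepers plus 109 dead ends) is borrowed from the post.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def autoresearch(
    baseline: Config,
    propose: Callable[[Config, random.Random], Config],
    evaluate: Callable[[Config], float],
    experiments: int,
    seed: int = 0,
) -> Tuple[Config, float, int, int]:
    """Greedy loop: try a change, keep it only if the loss improves."""
    rng = random.Random(seed)
    best, best_loss = dict(baseline), evaluate(baseline)
    kept = discarded = 0
    for _ in range(experiments):
        candidate = propose(best, rng)  # hypothetical: suggest a modified config
        loss = evaluate(candidate)      # hypothetical: train and score the candidate
        if loss < best_loss:
            best, best_loss = candidate, loss
            kept += 1                   # a "keeper"
        else:
            discarded += 1              # a "dead end"
    return best, best_loss, kept, discarded


# Toy objective so the sketch runs without any training: minimum at x = 3.
def toy_loss(cfg: Config) -> float:
    return (cfg["x"] - 3.0) ** 2


def toy_propose(cfg: Config, rng: random.Random) -> Config:
    return {"x": cfg["x"] + rng.uniform(-1.0, 1.0)}


best, best_loss, kept, discarded = autoresearch(
    {"x": 0.0}, toy_propose, toy_loss, experiments=137
)
print(kept + discarded)  # 137
```

Because a candidate is kept only when it beats the current best, the recorded loss can never get worse, and most proposals end up discarded once the easy gains are taken.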

Tom Weaver@trwpang_·
Every backend function "worked" in isolation. But when Claude had to click buttons and read the screen like a user it found 4 bugs that unit tests and script testing would have missed entirely. The gap between "function returns success" and "user sees the right thing" is infested.
Tom Weaver@trwpang_·
Currently working on something with a UI wrapper on top of some agents built in Go and some complex Python scripts. Forcing Claude (under duress) to actually use the UI via agent-browser to E2E test various bits, instead of calling the backend itself, has a double whammy effect: it also has to fix UI problems.
Fiddlehead@fiddlehead·
@trwpang_ @yazins this is what I built Fiddlehead for. full diarized transcript plus markdown with YAML frontmatter. date, speakers, topics, action items. plain files on disk, not locked in a dashboard.
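For readers unfamiliar with the format, a meeting note stored as a plain markdown file with YAML frontmatter might be produced along these lines. The field names follow the list in the post (date, speakers, topics, action items), but the exact layout, the `render_note` helper, and the sample data are assumptions for illustration, not Fiddlehead's actual output.

```python
import json
from typing import Dict, List


def render_note(
    date: str,
    speakers: List[str],
    topics: List[str],
    action_items: List[str],
    transcript: List[Dict[str, str]],
) -> str:
    """Build a markdown document with a YAML frontmatter header.

    YAML 1.2 accepts JSON-style quoted strings and arrays, so json.dumps
    gives safe quoting without needing a YAML library.
    """
    front = [
        "---",
        f"date: {json.dumps(date)}",
        f"speakers: {json.dumps(speakers)}",
        f"topics: {json.dumps(topics)}",
        f"action_items: {json.dumps(action_items)}",
        "---",
    ]
    # One paragraph per speaker turn (the diarized transcript).
    body = [f"**{turn['speaker']}:** {turn['text']}" for turn in transcript]
    return "\n".join(front) + "\n\n" + "\n\n".join(body) + "\n"


# Made-up sample data.
note = render_note(
    date="2026-03-01",
    speakers=["Ana", "Ben"],
    topics=["launch plan"],
    action_items=["Ben to draft the checklist"],
    transcript=[
        {"speaker": "Ana", "text": "Can we ship on Friday?"},
        {"speaker": "Ben", "text": "Yes, if the checklist is done."},
    ],
)
print(note)
```

Because the result is an ordinary text file, it can be searched, versioned, or handed to any tool that reads a folder of markdown.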
yazin@yazins·
Introducing: OpenGranola 🔥 I built an open source meeting copilot for macOS. It transcribes both sides of your call on-device, searches your own notes in real time, and hands you talking points right when the conversation needs them. No audio leaves your Mac. Point it at a folder of markdown files, pick any LLM through OpenRouter (Claude, GPT-4o, Gemini, Llama), and it just works. It's invisible to screen share too — nobody knows you have it. The whole thing is open source. Link below
DreykØ@dreyk0o0·
@AlterEgo_eth Yeah, the cost is the main drawback, but with such capabilities it should be fully worth it