Patrick Srail

1.7K posts

@patricksrail

Product, AI @ Airbnb. ex-Amazon (OG Alexa + new hardware), Hulu, Myspace, CBS, and Sony.

Los Angeles, CA · Joined January 2009

895 Following · 4.9K Followers
Andrew Weiss @BayWestInvest
Something in Los Angeles is functioning… properly?? I’ve lived here for 14 years and this might be the first such occurrence. (would like to see the wait time to drive thru the U at LAX but I digress)
19 replies · 6 reposts · 277 likes · 26.9K views
Patrick Srail @patricksrail
@CoachDanGo Grew up in Germany. Same. We had “house shoes” for when it got cold.
0 replies · 0 reposts · 0 likes · 60 views
Patrick Srail @patricksrail
@NickADobos @itsolelehmann It works, but it's just not a good DX, especially if you use GitHub and work on the same repo across machines with different configs.
0 replies · 0 reposts · 0 likes · 16 views
Ole Lehmann @itsolelehmann
anthropic should add a simple feature to sync skills between claude chat, claude cowork, and claude code, and between teams. i see how much people are struggling with this
114 replies · 28 reposts · 955 likes · 54.3K views
Patrick Srail @patricksrail
@NickADobos I was reading through the docs the other day and noticed this. There are so many other goodies in there, like the ability to define hooks for a specific skill (!!).
0 replies · 0 reposts · 2 likes · 1.1K views
Mikhail Parakhin @MParakhin
Codex should really allow GPT-5.4 Pro model to be used, at least as an 'exec'ed subagent. Autoresearch loop is here, but right now the lack of the top models pushes us (ML people) to use Pi + Pro/DeepThink API.
Andrej Karpathy @karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), and this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.

This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
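The loop Karpathy describes (propose a change, run the experiment, keep it only if validation loss improves, then propose again from the best known config) can be sketched as a toy hill-climb. This is a minimal illustration only: the knobs `wd` and `beta2` and the `evaluate()` stand-in are invented for the example and are not nanochat's actual code or hyperparameters.

```python
import random

def evaluate(cfg):
    # Stand-in for a real training run that returns validation loss.
    # A smooth bowl with its optimum at wd=0.1, beta2=0.95 (invented values),
    # so "improvement" is cheap and deterministic to measure.
    return (cfg["wd"] - 0.1) ** 2 + (cfg["beta2"] - 0.95) ** 2

def autoresearch(cfg, steps=200, seed=0):
    rng = random.Random(seed)
    best, best_loss = dict(cfg), evaluate(cfg)
    for _ in range(steps):
        cand = dict(best)                 # start each proposal from the best config so far
        knob = rng.choice(sorted(cand))   # pick one knob to perturb
        cand[knob] += rng.gauss(0, 0.02)  # propose a small change ("run an experiment")
        loss = evaluate(cand)
        if loss < best_loss:              # keep only changes that improve validation loss
            best, best_loss = cand, loss
    return best, best_loss

best, loss = autoresearch({"wd": 0.0, "beta2": 0.9})
```

A real agent swarm replaces the random proposal with an LLM that reads the experiment history and plans the next change, and replaces `evaluate()` with an actual (small-scale) training run whose winners get promoted to larger scales.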

27 replies · 17 reposts · 419 likes · 63.5K views
Ryan Michael @theblaqroom_
@maxtoscano1 Somebody's gonna tear a shoulder in there doing lateral raises with 40s lol
20 replies · 9 reposts · 3.5K likes · 76.5K views
Patrick Srail reposted
Nandkishor @devops_nk
90% of stand-up meetings look exactly like this.
126 replies · 970 reposts · 6.5K likes · 748.6K views
Thariq @trq212
We just added /btw to Claude Code! Use it to have side chain conversations while Claude is working.
1.2K replies · 1.6K reposts · 26K likes · 2.7M views
Patrick Srail @patricksrail
@claudeai I like that the videos are starting to feature the desktop app more.
0 replies · 0 reposts · 0 likes · 88 views
Claude @claudeai
Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.
2.1K replies · 5.2K reposts · 62.9K likes · 22.7M views
Patrick Srail @patricksrail
@lenadroid Oh man, my 6yo gonna be mad now. What’s fascinating: after mine vibe coded a game he played it for an hour, adding new features any time he got bored. Would never be ok with that much screen time - but he was building, and it was fascinating to watch, so here we are.
0 replies · 0 reposts · 3 likes · 637 views
Lena Hall 🔜 KubeCon EU @lenadroid
This is wild 🤯 My 5-year-old just built his FIRST GAME ever with Codex using only dictation mode. He wanted a game where you type in a word and transform into any character you want to be ✨ so he dictated it to Codex. Then he upgraded it to a game where that character jumps and collects coins. Hello World to the youngest AI engineer ever❤️‍🔥 @OpenAI @OpenAIDevs Codex needs an interactive voice mode ASAP
82 replies · 57 reposts · 918 likes · 92.2K views
Tom Hosiawa @thosiawa
@summeryue0 @petergyang @AmpCode's handoff feature lets you intentionally manage compaction behaviour. Maybe @openclaw could do something similar to amp (if they're not already) I added on top of their idea by defining what to store and read for a skill (i call it session blocks instead of handoff)
1 reply · 0 reposts · 9 likes · 5.3K views
Summer Yue @summeryue0
Nothing humbles you like telling your OpenClaw “confirm before acting” and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.
2.4K replies · 1.7K reposts · 17.5K likes · 10M views
Gabe Monroy @gabemonroy
@petergyang Workday MCP server is in the hands of ~20 customers today in limited availability. Going GA soon. And yes, using it via Anthropic Cowork works as you’d expect. 😀
3 replies · 0 reposts · 2 likes · 349 views
Peter Yang @petergyang
It's funny to see all the companies with the best UI build MCP servers first - Mercury, Linear, Figma, etc. What we all need in life is WORKDAY MCP
23 replies · 3 reposts · 139 likes · 13.5K views
Patrick Srail @patricksrail
@steipete consulting firms planning to offer 18 month implementation as part of digital transformation strategy
0 replies · 0 reposts · 0 likes · 238 views
Peter Steinberger 🦞 @steipete
We should make EnterpriseClaw just for the lolz. Java 21, Spring Boot, 14 abstract factory beans, 2GB Docker image, takes 45 seconds to start, AbstractSingletonProxyFactoryAgentClawResponseHandlerBeanDefinitionRegistryPostProcessorImpl.java
410 replies · 226 reposts · 6.3K likes · 583.8K views
Patrick Srail @patricksrail
@petergyang Use agent-browser skill. Uses playwright but not the token-heavy MCP. Can also maintain state which is luxurious.
0 replies · 0 reposts · 3 likes · 194 views
Peter Yang @petergyang
I think the best MCP is probably Playwright MCP but had to turn it off too. 200K token context window makes most MCPs unusable imo.
97 replies · 4 reposts · 223 likes · 39.7K views
Guillermo Rauch @rauchg
We're now seeing 550+ skills added every hour to skills.sh. Pretty wild. We've added more CLI tools and options on the website for improved search and discoverability. Run npx skills@latest to start:
Vercel Developers @vercel_dev

npx skills@1.1.1 has new human- and agent-friendly discovery features, and is now open source. • npx skills find • npx skills update • Let agents explore with find-skills • Replaces npx add-skill vercel.com/changelog/skil…

91 replies · 112 reposts · 1.4K likes · 176.4K views
Armin @itsarminbabaei
@aisdk Sweet, how about ai-elements?
1 reply · 0 reposts · 2 likes · 1.1K views
Patrick Srail @patricksrail
@trq212 Would love if versioning were a first class citizen, maybe in the frontmatter. Especially when installing skills off GitHub repos it’s hard to tell when there was an update.
0 replies · 0 reposts · 0 likes · 70 views
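A sketch of what the versioning request above might look like: a version field alongside the standard name and description keys in a skill's SKILL.md frontmatter. The version and source fields here are hypothetical, not part of the current Agent Skills spec, and the skill name is invented for illustration.

```yaml
---
name: example-browser-skill        # invented skill name
description: Drive a browser for agent tasks
version: 1.2.0                     # hypothetical field: bump on every published change
source: github.com/example/skills  # hypothetical field: where updates are checked
---
```

With a declared version, an installer could diff the installed copy against the repo and flag when an update landed, which is exactly the gap when installing skills off GitHub today.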
Patrick Srail @patricksrail
@bcherny Namespacing! Manually invoking from a large number is tedious. Namespacing seems to work with commands but not skills unless in a plugin?
0 replies · 0 reposts · 0 likes · 33 views
am.will @LLMJunky
Holy sh*t! This is cracked. I just ran this skill in my repo with the following prompt: 'Make me a flash promo video for CodexSkills that shows installing the skills and then highlights all the skills available.' And it came up with this without any further prompting. 🤯 Are you kidding me?
Remotion @Remotion

Remotion now has Agent Skills - make videos just with Claude Code! $ npx skills add remotion-dev/skills This animation was created just by prompting 👇

8 replies · 9 reposts · 272 likes · 48.2K views