Stephen Bay
@StephenCBay
21 posts
Working to make a difference
Kirkland, WA · Joined July 2011
61 Following · 15 Followers
Alex Finn @AlexFinn
This is the best news of 2026: Nvidia is going all in on open models, something no other American AI company has had the balls to do.

Open source is the American way. Democratized, equal opportunity for all. Yet China has been dominating on this. No more.

Here's what this means: Nvidia will spend BILLIONS on developing models you can run on local devices. Fully private, secure, and free. In fact, today they released Nemotron 3, a 120B-parameter model with a 1 million token context window. Revolutionary. I'm running it on my DGX Spark as we speak. You'll be able to fully run frontier-level models on your desk: no APIs, no fees, no limits. AI agents doing work for you 24/7. This is INCREDIBLE news.

Up until this point it's been all Chinese models. Now America has entered the race. I FULLY expect Apple to make similar announcements soon.

You need to prepare for this immediately:
1. Depending on your hardware, download an open source model that's appropriate for it
2. Ask your OpenClaw to do this for you
3. Run it, and use it for small tasks
4. Understand at a deep level how it works
5. Read books to get a deeper understanding of LLMs (I like Deep Learning by John D. Kelleher)
6. Start building agents that utilize these open models

It cannot be stated enough how incredible it is that Nvidia is entering the open source race. America is open source. America is democratized, equal opportunity to success. And now we're finally acting like it 🇺🇸
Will Knight @willknight

Scoop from me: Nvidia will spend a total of $26 billion over the next five years building the world's best open source models. America is back in the open source AI race! wired.com/story/nvidia-i…

83 replies · 61 reposts · 757 likes · 65.1K views
Elon Musk @elonmusk
The Sun is an enormous, free fusion reactor in the sky. It is super dumb to make tiny fusion reactors on Earth. Even if you burned 4 Jupiters, the Sun would still round up to 100% of all power that will ever be produced in the solar system!! Stop wasting money on puny little reactors, unless actively acknowledging that they are just there for your pet science project jfc.
15.9K replies · 14K reposts · 148.7K likes · 33.2M views
Robert Youssef @rryssf
Here’s how ACE works 👇

It splits the model’s brain into 3 roles:
Generator - runs the task
Reflector - critiques what went right or wrong
Curator - updates the context with only what matters

Each loop adds delta updates: small context changes that never overwrite old knowledge. It’s literally the first agent framework that grows its own prompt.
7 replies · 11 reposts · 232 likes · 28.9K views
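The Generator/Reflector/Curator loop described above can be sketched in a few lines. This is a toy simulation, not the paper's implementation: in the real framework each of the three roles would be an LLM call, and `run_task`, `critique`, and `curate` are hypothetical stub names. The one property the sketch does demonstrate faithfully is the append-only delta update, where the curated context grows without ever overwriting earlier entries.

```python
# Toy sketch of an ACE-style loop: Generator runs the task, Reflector
# critiques the outcome, Curator appends a small "delta" to the context.
# In the real framework each role would be an LLM call; here they are stubs.

def run_task(task: str, context: list[str]) -> str:
    # Generator: pretend to solve the task using the accumulated context.
    return f"answer({task}) using {len(context)} lessons"

def critique(result: str) -> str:
    # Reflector: produce a short note on what went right or wrong.
    return f"note: checked '{result}'"

def curate(context: list[str], note: str) -> list[str]:
    # Curator: append a delta update; earlier entries are never rewritten.
    return context + [note]

def ace_loop(task: str, iterations: int = 3) -> list[str]:
    context: list[str] = []
    for _ in range(iterations):
        result = run_task(task, context)
        note = critique(result)
        context = curate(context, note)
    return context

playbook = ace_loop("classify ticket")
print(len(playbook))  # 3 — one delta per loop iteration
```

The design choice worth noting is that `curate` returns a new list rather than mutating in place, which makes the "never overwrite old knowledge" invariant easy to check.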
Robert Youssef @rryssf
RIP fine-tuning ☠️ This new Stanford paper just killed it.

It’s called 'Agentic Context Engineering (ACE)' and it proves you can make models smarter without touching a single weight. Instead of retraining, ACE evolves the context itself. The model writes, reflects, and edits its own prompt over and over until it becomes a self-improving system.

Think of it like the model keeping a growing notebook of what works. Each failure becomes a strategy. Each success becomes a rule.

The results are absurd:
+10.6% better than GPT-4–powered agents on AppWorld.
+8.6% on finance reasoning.
86.9% lower cost and latency.
No labels. Just feedback.

Everyone’s been obsessed with “short, clean” prompts. ACE flips that. It builds long, detailed, evolving playbooks that never forget. And it works because LLMs don’t want simplicity, they want context density.

If this scales, the next generation of AI won’t be “fine-tuned.” It’ll be self-tuned. We’re entering the era of living prompts.
238 replies · 1.2K reposts · 7.8K likes · 714.8K views
Stephen Bay @StephenCBay
@emollick I thought prompting would become less important as models evolve, but it feels like prompts are increasingly important.
0 replies · 0 reposts · 0 likes · 96 views
Ethan Mollick @emollick
The exact wording of the prompt is not that important, it is the idea that you can use modern web-connected LLMs as solid first-pass fact checkers that is useful.
3 replies · 0 reposts · 56 likes · 13.6K views
Ethan Mollick @emollick
A really useful prompt for writing: "review this for accuracy, look up any facts you may want to challenge or explore." Even if not perfect, it is a good sanity check. Works well with Claude 4.1, GPT-5 Thinking, and Grok 4. Weirdly, Gemini 2.5 Pro often won't do web searches.
23 replies · 51 reposts · 587 likes · 49.3K views
Andrej Karpathy @karpathy
Scaling up RL is all the rage right now, I had a chat with a friend about it yesterday. I'm fairly certain RL will continue to yield more intermediate gains, but I also don't expect it to be the full story.

RL is basically "hey this happened to go well (/poorly), let me slightly increase (/decrease) the probability of every action I took for the future". You get a lot more leverage from verifier functions than explicit supervision, this is great. But first, it looks suspicious asymptotically - once the tasks grow to be minutes/hours of interaction long, are you really going to do all that work just to learn a single scalar outcome at the very end, and weight the gradient directly by it?

Beyond asymptotics, and second, this doesn't feel like the human mechanism of improvement for the majority of intelligence tasks. There are significantly more bits of supervision we extract per rollout via a review/reflect stage along the lines of "what went well? what didn't go so well? what should I try next time?" etc., and the lessons from this stage feel explicit, like a new string to be added to the system prompt for the future, optionally to be distilled into weights (/intuition) later, a bit like sleep. In English, we say something becomes "second nature" via this process, and we're missing learning paradigms like this. The new Memory feature is maybe a primordial version of this in ChatGPT, though it is only used for customization, not problem solving. Notice that there is no equivalent of this for e.g. Atari RL, because there are no LLMs and no in-context learning in those domains.

Example algorithm: given a task, do a few rollouts, stuff them all into one context window (along with the reward in each case), use a meta-prompt to review/reflect on what went well or not to obtain a string "lesson", to be added to the system prompt (or, more generally, to modify the current lessons database). Many blanks to fill in, many tweaks possible, not obvious.

Example of a lesson: we know LLMs can't super easily see letters due to tokenization and can't super easily count inside the residual stream, hence 'r' in 'strawberry' being famously difficult. The Claude system prompt had a "quick fix" patch - a string was added along the lines of "If the user asks you to count letters, first separate them by commas and increment an explicit counter each time and do the task like that". This string is the "lesson", explicitly instructing the model how to complete the counting task. The open questions are how this might fall out from agentic practice instead of being hard-coded by an engineer, how it can be generalized, and how lessons can be distilled over time so they don't bloat context windows indefinitely.

TLDR: RL will lead to more gains because, when done well, it is a lot more leveraged, bitter-lesson-pilled, and superior to SFT. It doesn't feel like the full story, especially as rollout lengths continue to expand. There are more S curves to find beyond it, possibly specific to LLMs and without analogues in game/robotics-like environments, which is exciting.
408 replies · 835 reposts · 8.3K likes · 1.1M views
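Karpathy's example algorithm can be sketched concretely. This is a toy illustration under stated assumptions, not his implementation: the meta-prompt review would be an LLM call over the stuffed context window, stubbed here as a hypothetical `reflect` function that just picks the highest-reward rollout, and the lesson itself is his letter-counting example, rendered as executable code.

```python
# Toy sketch of the rollout -> review/reflect -> lesson loop Karpathy
# describes. The reflection step would be an LLM call with a meta-prompt;
# here it is a stub that reviews (rollout, reward) pairs and emits a
# lesson string for the lessons database.

def count_letter(word: str, letter: str) -> int:
    # The "lesson" in executable form: separate the characters explicitly
    # (',' between each) and increment a counter, rather than eyeballing
    # the whole token at once.
    counter = 0
    for ch in ",".join(word):  # e.g. 's,t,r,a,w,b,e,r,r,y'
        if ch == letter:
            counter += 1
    return counter

def reflect(rollouts: list[tuple[str, float]]) -> str:
    # Meta-prompt stand-in: review all rollouts plus their scalar rewards
    # and distill an explicit, reusable lesson string.
    best = max(rollouts, key=lambda r: r[1])
    return f"Lesson: prefer the strategy used in '{best[0]}'."

lessons_db: list[str] = []  # would be merged into the system prompt later

# A few rollouts of the same task, each paired with its reward.
rollouts = [("guess from the raw token", 0.0),
            ("separate letters by commas and count", 1.0)]
lessons_db.append(reflect(rollouts))

print(count_letter("strawberry", "r"))  # 3
print(lessons_db[0])
```

Note how the loop extracts a whole explicit string per batch of rollouts, versus plain RL, which would extract only the scalar rewards themselves; that is exactly the "more bits of supervision per rollout" point in the tweet.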
Andrew Panella @Longevity_EDU
6. Her Simple Sleep Routine.
1. Never has caffeine
2. Walks 6-9K steps
3. Listens to calming music
4. Stretching
5. Washes face
6. Brushes teeth
7. Bed by 8 PM
Simple sleep routine = less stress and overwhelm.
3 replies · 9 reposts · 111 likes · 18.5K views
Andrew Panella @Longevity_EDU
The 56-year-old whose biological age is 36: Julie Clark. She's outpacing Bryan Johnson, the $2 million/year biohacker, on just $4/day. Here's her record-breaking longevity routine that's almost too simple to believe (bookmark this): 🧵
78 replies · 323 reposts · 2.1K likes · 593.4K views
Jesse Pujji @jspujji
Two years ago, I hired an agency to revamp my LinkedIn profile. Since then, I’ve seen 3x more inbound leads.

I asked the founder if I could share what he taught about LinkedIn profile upgrades and he said ‘yes!’ So that’s exactly what I’m doing.

For context: In February, @marvinsangines of Notus hosted this workshop for my Accelerator Program. His agency has optimized the LinkedIn profiles of 150+ B2B founders & executives worldwide - including mine. In the 83-minute session, he took us through the exact workflow his agency uses to turn every client’s profile into a high-converting landing page. As a BONUS, he also shared a step-by-step Notion template anyone can use to do the same.

If you want the full recording + the bonus template FOR FREE: Comment “LinkedIn” and I’ll send it to you.
1.2K replies · 28 reposts · 571 likes · 112.7K views
God of Prompt @godofprompt
Most people don’t know how to research with AI. That’s why I built a Deep Researcher Mega-Prompt that turns ChatGPT, Perplexity, or Grok into your personal analyst. Just paste it in and it’ll generate expert-level research reports in seconds. Like + comment "Mega" and I’ll DM it to you. (Follow me so I can send the link)
676 replies · 116 reposts · 1.1K likes · 217.5K views
Jesse Pujji @jspujji
A 21 y/o marketer is generating 1M+ social impressions for me. He’s using a fine-tuned AI to turn my daily meetings into viral posts. I asked him to make me a 5 min Loom to show how he does it. RT + Comment "GrowthAssistant" to get the video in your DMs.
1.1K replies · 351 reposts · 632 likes · 143K views
Stephen Bay @StephenCBay
@elonmusk Is the 3rd row in the Model Y safe for my 6-year-old daughter? I can’t find much information on the safety of the 3rd row.
0 replies · 0 reposts · 0 likes · 4 views
Stephen Bay @StephenCBay
@kirbywinfield Find the “No” as quickly as possible. Build as many relationships as possible with stakeholders inside the org you’re selling to. Emotionally, expect it to take 18 months.
0 replies · 0 reposts · 1 like · 12 views
☔🔥☔ @kirbywinfield
asking for a founder
7 replies · 0 reposts · 4 likes · 994 views
NFL @NFL
Seahawks plan to hire Ravens DC Mike Macdonald as their new head coach. (via @RapSheet)
488 replies · 2K reposts · 19.6K likes · 7.6M views
Matt Turck @mattturck
VCs tweeting about how to operate a startup
88 replies · 242 reposts · 2.4K likes · 475.2K views