Aaron
1.7K posts
@DataVisGuy
Father of 3 | Cybersecurity | AI | Education | Data Vis
Joined May 2023
170 Following · 273 Followers

Inspired by @karpathy's practice of tinkering with smaller GPTs to gain intuition, I've been training small neural nets to play this simple 2D game I created. On the left is a visualization of the neural net firing in real time. Built with @claudeai, @threejs, and @WebGPU and using JAX from @GoogleDeepMind. I’m piping the neuron activations into @Ableton as MIDI to trigger Serum 2.
Aaron reposted

// Self-Evolving Agent Protocol //
One of the more interesting papers I read this week.
(bookmark it if you are an AI dev)
The paper introduces Autogenesis, a self-evolving agent protocol where agents identify their own capability gaps, generate candidate improvements, validate them through testing, and integrate what works back into their own operational framework.
No retraining, no human patching, just an ongoing loop of assessment, proposal, validation, and integration.
Why it's worth reading this paper:
Static agents age quickly.
As deployment environments change and new tools arrive, the agents that survive will be the ones that can safely rewrite themselves. Autogenesis is part of a growing wave of self-improving agent systems, alongside work like Meta-Harness and the Darwin Gödel Machine line, and it's one of the cleaner protocol-level takes on continual self-improvement so far.
Paper: arxiv.org/abs/2604.15034
Learn to build effective AI agents in our academy: academy.dair.ai
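
The assess, propose, validate, integrate loop described in this post can be sketched roughly as below. This is only an illustration of the four steps as the post words them, not the protocol from the paper: every name here (`assess`, `propose`, `validate`, `evolve`, `TESTS`) is hypothetical, and `propose` is a hard-coded stand-in for whatever would actually generate candidates.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical names throughout; none of this is taken from the Autogenesis paper.
Skill = Callable[[int], int]
TestCases = List[Tuple[int, int]]  # (input, expected output) pairs


def assess(skills: Dict[str, Skill], required: List[str]) -> List[str]:
    """Assessment: list the required capabilities the agent does not have yet."""
    return [name for name in required if name not in skills]


def propose(gap: str) -> List[Skill]:
    """Proposal: return candidate implementations for a gap.

    Hard-coded stand-in for a real generator; the first "double"
    candidate is deliberately wrong so validation has work to do.
    """
    candidates: Dict[str, List[Skill]] = {
        "double": [lambda x: x + 2, lambda x: x * 2],
        "square": [lambda x: x * x],
    }
    return candidates.get(gap, [])


def validate(candidate: Skill, tests: TestCases) -> bool:
    """Validation: a candidate passes only if every test case matches."""
    return all(candidate(x) == expected for x, expected in tests)


def evolve(skills: Dict[str, Skill], required: List[str],
           tests: Dict[str, TestCases]) -> Dict[str, Skill]:
    """Integration: keep the first candidate that passes validation for each gap."""
    for gap in assess(skills, required):
        for candidate in propose(gap):
            if validate(candidate, tests[gap]):
                skills[gap] = candidate
                break
    return skills


TESTS: Dict[str, TestCases] = {"double": [(1, 2), (3, 6)], "square": [(3, 9)]}
skills = evolve({}, ["double", "square"], TESTS)
print(sorted(skills))
```

The point of the sketch is that nothing reaches `skills` without passing `validate` first, which is the "safely rewrite themselves" part of the post.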

@kamdaloo22 @bridgemindai both are true. Opus 4.6 is really good - but it's not as good as it was previously.

@bridgemindai Why is everyone on X saying that Claude Opus 4.6 is “nerfed” when mine is currently doing pretty well with a really complex Rust coding project? Is it just me, or is Opus 4.6 really good?

CLAUDE OPUS 4.6 IS NERFED.
BridgeBench just proved it.
Last week Claude Opus 4.6 ranked #2 on the Hallucination benchmark with an accuracy of 83.3%.
Today Claude Opus 4.6 was retested and it fell to #10 on the leaderboard with an accuracy of only 68.3%.
A roughly 90% increase in hallucination (16.7% to 31.7%).
bridgebench.ai just confirmed that Claude Opus 4.6 has reduced reasoning levels and is nerfed.
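
The size of the change can be checked from the two accuracy scores quoted in this post. The sketch below assumes that hallucination rate is simply 100% minus accuracy; the benchmark's actual definition is not given here, and the helper names are made up for illustration. Under that assumption the rate goes from 16.7% to 31.7%, a relative increase of about 90%.

```python
def hallucination_rate(accuracy_pct: float) -> float:
    """Assumed definition: anything that is not accurate counts as hallucination."""
    return 100.0 - accuracy_pct


def relative_increase(before: float, after: float) -> float:
    """Change relative to the starting value, as a fraction (0.5 means +50%)."""
    return (after - before) / before


before = hallucination_rate(83.3)  # last week's accuracy, from the post
after = hallucination_rate(68.3)   # retest accuracy, from the post
increase = relative_increase(before, after)

print(f"{before:.1f}% -> {after:.1f}% ({increase:.0%} relative increase)")
```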

15 AI related accounts you should follow on Twitter:
1. @karpathy
His tweets create the LLM narratives that show up on LinkedIn two months later.
2. @fchollet
Posts thoughtful research on intelligence, benchmarks, and AI limitations. Creator of Keras and ARC-AGI.
3. @ylecun
Yann LeCun is a deep learning pioneer and Meta's Chief AI Scientist; big-picture research takes and critiques (and drama).
4. @AndrewYNg
Andrew Ng is an AI education legend; practical ML advice, courses, and real-world implementation. Creator of DeepLearning.AI.
5. @rasbt
Sebastian Raschka posts practical ML/LLM implementations, "build from scratch" tutorials, and books.
6. @dair_ai
Weekly ML/AI paper threads and accessible research explainers (high-signal for staying current).
7. @lilianweng
Lilian Weng is ex-OpenAI, and her Lil'Log-style threads are good: in-depth LLM research breakdowns.
8. @jeremyphoward
Posts interesting takes on AI/crypto news and works on democratizing practical deep learning and accessible education.
9. @simonw
Simon posts practical LLM tools, takes, experiments, prompting, and engineering breakdowns. Django co-creator.
10. @_akhaliq
Curates the latest arXiv papers, model releases, and open-source AI drops.
11. @ID_AA_Carmack
AGI and low-level optimization takes that make you think about the problem.
12. @gwern
Really high-quality long-form AI research notes and essays.
13. @goodside
LLM evaluation, prompting research, and real capabilities testing.
14. @drfeifei
Computer vision pioneer; human-centered AI and spatial intelligence research.
15. @demishassabis
Been following his work for 9 years. Demis is my hope against Google abusing its power with AI. Demis is Google DeepMind's CEO.
Let me know who I missed, guys.

@chongdashu @boomvideoapp perfect timing! Was just trying to work through how much CC could help with a 2d game.

Almost done with the full video tutorial on how I vibe code 2d games
> phaser js skill
> playwright skill
> works with claude code / codex cli / cursor
Lands tomorrow!
Also first time I'm using @boomvideoapp

Chong-U@chongdashu
And don't worry - the animation bug has been fixed.

I want to start a community dedicated to Claude Code.
It’s become the gateway drug to coding and experiencing the power of AI for tons of people.
This will be a space for people to share killer use cases, agentic workflows, proven prompts, and connect with other CC obsessives.
Comment “Claude” if you want to join.

Let’s understand the universe. Just Grok it.
Nate Esparza@Nate_Esparza
Let’s understand the universe

@AnthropicAI Well that's pretty cool. Being forced into MS 365 at work so this is good to hear.

Claude Sonnet 4 and Opus 4.1 are now available in Microsoft 365 Copilot, bringing Claude’s advanced reasoning capabilities to millions of enterprise users.
Read more: anthropic.com/news/claude-no…

And now Antonio Gibson has lost a fumble for the Patriots...
Field Yates@FieldYates
Patriots RB Rhamondre Stevenson has lost two fumbles today.

I think I might have discovered the best 2D game engine for vibe coding games.
For the past few weeks, I have been making a Lovecraftian fishing game with peaceful daytime fishing and intense nighttime combat where you fend off mutated sea creatures.
And I've been using the Love2D game engine for this. Love2D runs on Lua, which is a compact (token-friendly) programming language that many models (GPT-5, Grok Code, and Sonnet) are pretty good at.
I'm considering setting up a Patreon to share more in-depth content (devlogs, prompts, advice, tips, etc.) and what I learn as I vibe code an indie game from scratch.
If you're interested in this, please let me know in the replies. Trying to gauge interest in this!

I put "Both of the above" but honestly it's whatever you enjoy working on the most. One thing that would be cool to see is the workflow you do on the backend to create the pieces you share. I know you show your code but I'm talking more about the process you work through to identify what you're going to visualize.