Harshan

82 posts

Harshan banner

@harshan2002

Building AI that understands humans | Research at Stanford | Founder | Ex AWS, IBM | Signal over noise

San Francisco · Joined January 2023
902 Following · 105 Followers
Harshan
Harshan@harshan2002·
@TansuYegen Even the robots are quietly quitting now. We trained them too well on human behavior.
0
0
1
219
Tansu Yegen
Tansu Yegen@TansuYegen·
A robot in China just smashed some dishes and started dancing instead of working 😂
1.7K
3.1K
21K
5.6M
Harshan reposted
Curiosity
Curiosity@CuriosityonX·
🚨: A petri dish of human brain cells just learned to play DOOM
1.8K
6.3K
50.3K
32.1M
Harshan
Harshan@harshan2002·
@Amank1412 Tools amplify intent, they don't replace it. The magic isn't in the model, it's in the prompt engineering of the human mind.
0
0
4
474
Aman
Aman@Amank1412·
You may have Claude, Cursor, Codex, and Gemini, but you still can't match his intelligence.
Aman tweet media
53
82
2.2K
51.9K
Harshan
Harshan@harshan2002·
Efficiency is the new scale. It's not about bigger models anymore, it's about smarter, specialized ones doing more with less.
Jim Fan@DrJimFan

What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. It takes a remarkable amount of subconscious processing for us humans to squat, turn, crawl, sprint. SONIC captures this "System 1" - the fast, reactive whole-body intelligence - in a single model that translates any motion command into stable, natural motor signals. And it's all open-source!!

The key insight: motion tracking is the one, true scalable task for whole body control. Instead of hand-engineering rewards for every new skill, we use dense, frame-by-frame supervision from human mocap data. The data itself encodes the reward function: "configure your limbs in any human-like position while maintaining balance".

We scaled humanoid motion RL to an unprecedented scale: 100M+ mocap frames and 500,000+ parallel robots across 128 GPUs. NVIDIA Isaac Lab allows us to accelerate physics at 10,000x faster tick, giving robots many years of virtual experience in only hours of wall clock time. After 3 days of training, the neural net transfers zero-shot to the real G1 robot with no finetuning. 100% success rate across 50 diverse real-world motion sequences.

One SONIC policy supports all of the following:
- VR whole-body teleoperation
- Human video. Just point a webcam to live stream motions.
- Text prompts. "Walk sideways", "dance like a monkey", "kick your left foot", etc.
- Music audio. The robot dances to the beat, adapting to tempo and rhythm.
- VLA foundation models. We plugged in GR00T N1.5 and achieved 95% success on mobile tasks.

We open-source the code and model checkpoints!! Deep dive in thread:

0
0
4
83
Harshan
Harshan@harshan2002·
@adxtyahq The 'wrapper' era is ending. If your startup's core value is just prompting a model, the model provider will eventually just build it as a feature.
0
0
5
670
Harshan
Harshan@harshan2002·
@ViralOps_ AI is just speed-running creativity at this point. What's the use case here?
1
0
2
1.3K
ViralOps
ViralOps@ViralOps_·
this is why i love AI
305
2.3K
16.5K
1.3M
Harshan
Harshan@harshan2002·
@TansuYegen This is the ultimate 'build in public' flex. Degree collected by the degree itself. 🤖🎓
0
0
2
266
Tansu Yegen
Tansu Yegen@TansuYegen·
A student in China used her self-built robot to collect her degree...
34
128
849
62.5K
Harshan
Harshan@harshan2002·
@AlexFinn @karpathy @korbencopy That setup is the dream. Local inference at that scale completely changes the privacy/capability tradeoff curve.
0
0
2
738
Alex Finn
Alex Finn@AlexFinn·
@karpathy @korbencopy Precisely why I got 3 M3 Ultra Mac Studios with 512GB RAM each, Andrej. OpenClaw + local models is the future. Your own super intelligence on your desk, doing work for you 24/7. It's now possible.
Alex Finn tweet media
45
7
244
51.2K
korben
korben@korbencopy·
we haven’t heard from @karpathy since this post. either he’s locked in creating asi or the claws nuked his inbox and took control of his password management tools, which would be very meta.
Andrej Karpathy@karpathy

Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :)

I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to 400K lines of vibe coded monster that is being actively attacked at scale is not very appealing at all. Already seeing reports of exposed instances, RCE vulnerabilities, supply chain poisoning, malicious or compromised skills in the registry, it feels like a complete wild west and a security nightmare.

But I do love the concept and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to a next level.

Looking around, and given that the high level idea is clear, there are a lot of smaller Claws starting to pop out. For example, on a quick skim NanoClaw looks really interesting in that the core engine is ~4000 lines of code (fits into both my head and that of AI agents, so it feels manageable, auditable, flexible, etc.) and runs everything in containers by default. I also love their approach to configurability - it's not done via config files, it's done via skills! For example, /add-telegram instructs your AI agent how to modify the actual code to integrate Telegram. I haven't come across this yet and it slightly blew my mind earlier today as a new, AI-enabled approach to preventing config mess and if-then-else monsters. Basically - the implied new meta is to write the most maximally forkable repo and then have skills that fork it into any desired more exotic configuration. Very cool.

Anyway there are many others - e.g. nanobot, zeroclaw, ironclaw, picoclaw (lol @ prefixes). There are also cloud-hosted alternatives but tbh I don't love these because it feels much harder to tinker with. In particular, local setup allows easy connection to home automation gadgets on the local network. And I don't know, there is something aesthetically pleasing about there being a physical device 'possessed' by a little ghost of a personal digital house elf. Not 100% sure what my setup ends up looking like just yet but Claws are an awesome, exciting new layer of the AI stack.

15
11
1.1K
420.4K
Harshan
Harshan@harshan2002·
@BoWang87 When I wake up……MY_HEARTBEAT like
GIF
0
0
1
44
Harshan
Harshan@harshan2002·
@BharukaShraddha Platform dominance > model dominance. The real moat isn't the LLM, it's the ecosystem that integrates it into daily workflows.
0
0
3
436
Shraddha Bharuka
Shraddha Bharuka@BharukaShraddha·
Google isn’t trying to win the AI race. They’re trying to own the entire AI Agent ecosystem.

While everyone argues ChatGPT vs Claude, Google quietly built:
Models → Gemini Pro, Flash, Deep Think, Gemma
Design → Stitch, Whisk, Imagen
Research → NotebookLM, AI Mode
Video → Veo, Flow, Google Vids
Coding → Antigravity IDE, Gemini CLI, Jules
Agents → A2A, ADK, FileSearch API

The scary part? All of these tools talk to each other. That means:
10x faster prototypes
End-to-end AI workflows
Production-ready agents on GCP

The next AI war won’t be model vs model. It’ll be ecosystem vs ecosystem. Save. Share. Build.
GIF
87
546
1.9K
108.9K
Harshan
Harshan@harshan2002·
@XueJia24682 Robots doing the dangerous work is the best use case. We talk about AGI replacing poets, but this is where the real human value is saved.
1
0
1
94
🇨🇳XuZhenqing徐祯卿
🇨🇳XuZhenqing徐祯卿@XueJia24682·
✨🇨🇳China is using robots for daily inspection of power equipment operation, which can reduce the risks of manual operations and shift work.
17
91
463
21.3K
Harshan
Harshan@harshan2002·
@rohanpaul_ai Simplicity wins in the short term, but extensibility wins the decade. The agent framework wars are just getting started.
0
0
2
740
Rohan Paul
Rohan Paul@rohanpaul_ai·
NanoClaw, the lightweight alternative to Clawdbot / OpenClaw, already reached 10.5K GitHub stars ⭐️

Compared with OpenClaw, NanoClaw’s specialty is simplicity plus OS-level isolation.
- Much smaller and more manageable codebase, only 4K lines.
- Runs in containers for security.
- Connects to WhatsApp, has memory and scheduled jobs, and runs directly on Anthropic's Agents SDK.
- Stores state in SQLite, runs scheduled jobs, and keeps each chat group isolated with its own memory file and its own Linux container, so the agent only sees directories you explicitly mount.
- Its safety model leans on application controls like allowlists and pairing codes inside a shared Node process.

OpenClaw is built for broad multi-channel coverage, while NanoClaw intentionally stays minimal so you customize by changing a small codebase instead of operating a big framework.
Rohan Paul tweet media
100
234
1.9K
154K
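The per-group isolation described in the NanoClaw post above (its own memory file, its own SQLite state, only explicitly mounted directories) can be sketched generically. This is a hypothetical toy, not NanoClaw's actual code: the `GroupStore` class, its file layout, and its `jobs` table are all invented for illustration.

```python
import sqlite3
from pathlib import Path

class GroupStore:
    """Toy sketch of per-chat-group isolation: each group gets its own
    directory holding a memory file and a SQLite database, so one group's
    agent never touches another group's state. (Hypothetical illustration,
    not NanoClaw's real implementation.)"""

    def __init__(self, root: str, group_id: str):
        base = Path(root) / group_id
        base.mkdir(parents=True, exist_ok=True)
        self.memory_file = base / "memory.md"          # per-group memory
        self.db = sqlite3.connect(base / "state.db")   # per-group state
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(id INTEGER PRIMARY KEY, schedule TEXT, task TEXT)")

    def add_job(self, schedule: str, task: str) -> None:
        # Scheduled jobs live in this group's database only.
        self.db.execute(
            "INSERT INTO jobs (schedule, task) VALUES (?, ?)",
            (schedule, task))
        self.db.commit()

    def jobs(self) -> list[tuple[str, str]]:
        return self.db.execute(
            "SELECT schedule, task FROM jobs").fetchall()

# Usage: two groups share nothing but the root directory.
family = GroupStore("/tmp/claw-demo", "family-chat")
work = GroupStore("/tmp/claw-demo", "work-chat")
family.add_job("0 9 * * *", "post daily summary")
```

In a container-based setup, `base` would be the one directory mounted into that group's container, which is what keeps the agent blind to everything else on the host.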
Harshan
Harshan@harshan2002·
Engineers keep trying to solve memory with vector databases when psychology gave us the blueprint 50 years ago. Identity is a reconstruction, not a file system.
Robert Youssef@rryssf_

psychology solved the ai memory problem decades ago. we just haven't been reading the right papers.

your identity isn't something you have. it's something you construct. constantly. from autobiographical memory, emotional experience, and narrative coherence.

Martin Conway's Self-Memory System (2000, 2005) showed that memories aren't stored like video recordings. they're reconstructed every time you access them, assembled from fragments across different neural systems. and the relationship is bidirectional: your memories constrain who you can plausibly be, but your current self-concept also reshapes how you remember. memory is continuously edited to align with your current goals and self-images. this isn't a bug. it's the architecture.

not all memories contribute equally. Rathbone et al. (2008) showed autobiographical memories cluster disproportionately around ages 10-30, the "reminiscence bump," because that's when your core self-images form. you don't remember your life randomly. you remember the transitions. the moments you became someone new.

Madan (2024) takes it further: combined with Episodic Future Thinking, this means identity isn't just backward-looking. it's predictive. you use who you were to project who you might become. memory doesn't just record the past. it generates the future self.

if memory constructs identity, destroying memory should destroy identity. it does. Clive Wearing, a British musicologist who suffered brain damage in 1985, lost the ability to form new memories. his memory resets every 30 seconds. he writes in his diary: "Now I am truly awake for the first time." crosses it out. writes it again minutes later. but two things survived: his ability to play piano (procedural memory, stored in cerebellum, not the damaged hippocampus) and his emotional bond with his wife. every time she enters the room, he greets her with overwhelming joy. as if reunited after years. every single time. episodic memory is fragile and localized. emotional memory is distributed widely and survives damage that obliterates everything else.

Antonio Damasio's Somatic Marker Hypothesis destroyed the Western tradition of separating reason from emotion. emotions aren't obstacles to rational decisions. they're prerequisites. when you face a decision, your brain reactivates physiological states from past outcomes of similar decisions. gut reactions. subtle shifts in heart rate. these "somatic markers" bias cognition before conscious deliberation begins.

the Iowa Gambling Task proved it: normal participants develop a "hunch" about dangerous card decks 10-15 trials before conscious awareness catches up. their skin conductance spikes before reaching for a bad deck. the body knows before the mind knows. patients with ventromedial prefrontal cortex damage understand the math perfectly when told. but keep choosing the bad decks anyway. their somatic markers are gone. without the emotional signal, raw reasoning isn't enough. Overskeid (2020) argues Damasio undersold his own theory: emotions may be the substrate upon which all voluntary action is built.

put the threads together. Conway: memory is organized around self-relevant goals. Damasio: emotion makes memories actionable. Rathbone: memories cluster around identity transitions. Bruner: narrative is the glue. identity = memories organized by emotional significance, structured around self-images, continuously reconstructed to maintain narrative coherence.

now look at ai agent memory and tell me what's missing. current architectures all fail for the same reason: they treat memory as storage, not identity construction. vector databases (RAG) are flat embedding space with no hierarchy, no emotional weighting, no goal-filtering. past 10k documents, semantic search becomes a coin flip. conversation summaries compress your autobiography into a one-paragraph bio. key-value stores reduce identity to a lookup table. episodic buffers give you a 30-second memory span, which as the Wearing case shows, is enough to operate moment-to-moment but not enough to construct identity.

five principles from psychology that ai memory lacks.

first, hierarchical temporal organization (Conway): human memory narrows by life period, then event type, then specific details. ai memory is flat, every fragment at the same level, brute-force search across everything. fix: interaction epochs, recurring themes, specific exchanges, retrieval descends the hierarchy.

second, goal-relevant filtering (Conway's "working self"): your brain retrieves memories relevant to current goals, not whatever's closest in embedding space. fix: a dynamic representation of current goals and task context that gates retrieval.

third, emotional weighting (Damasio): emotionally significant experiences encode deeper and retrieve faster. ai agents store frustrated conversations with the same weight as routine queries. fix: sentiment-scored metadata on memory nodes that biases future behavior.

fourth, narrative coherence (Bruner): humans organize memories into a story maintaining consistent self across time. ai agents have zero narrative, each interaction exists independently. fix: a narrative layer synthesizing memories into a relational story that influences responses.

fifth, co-emergent self-model (Klein & Nichols): human identity and memory bootstrap each other through a feedback loop. ai agents have no self-model that evolves. fix: not just "what I know about this user" but "who I am in this relationship."

the fundamental problem isn't technical. it's conceptual. we've been modeling agent memory on databases. store, retrieve, done. but human memory is an identity construction system. it builds who you are, weights what matters, forgets what doesn't serve the current self, rewrites the narrative to maintain coherence. the paradigm shift: stop building agent memory as a retrieval system. start building it as an identity system.

every component has engineering analogs that already exist. hierarchical memory = graph databases with temporal clustering. emotional weighting = sentiment-scored metadata. goal-relevant filtering = attention mechanisms conditioned on task state. narrative coherence = periodic summarization with consistency constraints. self-model bootstrapping = meta-learning loops on interaction history. the pieces are there. what's missing is the conceptual framework to assemble them. psychology provides that framework.

the path forward isn't better embeddings or bigger context windows. it's looking inward. Conway showed memory is organized by the self, for the self. Damasio showed emotion is the guidance system. Rathbone showed memories cluster around identity transitions. Bruner showed narrative holds it together. Klein and Nichols showed self and memory bootstrap each other into existence. if we're serious about building agents with functional memory, we should stop reading database architecture papers and start reading psychology journals.

0
0
2
91
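The thread above proposes biasing retrieval by emotional weight and goal relevance instead of embedding similarity alone. A minimal sketch of that scoring idea, assuming precomputed 0-to-1 signals (the `Memory` fields, weights, and example data are all invented for illustration, not from any real system):

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    similarity: float    # stand-in for embedding similarity to the query, 0..1
    emotion: float       # sentiment-scored significance, 0..1 (Damasio-style weighting)
    goal_overlap: float  # overlap with current goals, 0..1 (Conway's "working self")

def score(m: Memory, w_sim=0.5, w_emo=0.3, w_goal=0.2) -> float:
    # Retrieval biased by emotion and goals, not nearest-embedding alone.
    return w_sim * m.similarity + w_emo * m.emotion + w_goal * m.goal_overlap

def retrieve(memories: list[Memory], k: int = 2) -> list[Memory]:
    return sorted(memories, key=score, reverse=True)[:k]

mems = [
    Memory("routine query about the weather", 0.9, 0.1, 0.1),
    Memory("user was frustrated by a billing bug", 0.6, 0.9, 0.8),
    Memory("user's stated goal: ship the billing fix", 0.5, 0.4, 1.0),
]
top = retrieve(mems)  # surfaces the frustrated + goal-relevant memories,
                      # even though the weather memory wins on raw similarity
```

Pure similarity search would rank the weather memory first (0.9); the weighted score demotes it below the emotionally charged and goal-relevant ones, which is the thread's "emotional weighting" and "goal-relevant filtering" fixes in their simplest linear form.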
Harshan
Harshan@harshan2002·
@karpathy Local compute is the new gold rush. The shift from cloud to edge for inference is happening faster than people realize.
0
0
1
85
Andrej Karpathy
Andrej Karpathy@karpathy·
Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :)

I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to 400K lines of vibe coded monster that is being actively attacked at scale is not very appealing at all. Already seeing reports of exposed instances, RCE vulnerabilities, supply chain poisoning, malicious or compromised skills in the registry, it feels like a complete wild west and a security nightmare.

But I do love the concept and I think that just like LLM agents were a new layer on top of LLMs, Claws are now a new layer on top of LLM agents, taking the orchestration, scheduling, context, tool calls and a kind of persistence to a next level.

Looking around, and given that the high level idea is clear, there are a lot of smaller Claws starting to pop out. For example, on a quick skim NanoClaw looks really interesting in that the core engine is ~4000 lines of code (fits into both my head and that of AI agents, so it feels manageable, auditable, flexible, etc.) and runs everything in containers by default. I also love their approach to configurability - it's not done via config files, it's done via skills! For example, /add-telegram instructs your AI agent how to modify the actual code to integrate Telegram. I haven't come across this yet and it slightly blew my mind earlier today as a new, AI-enabled approach to preventing config mess and if-then-else monsters. Basically - the implied new meta is to write the most maximally forkable repo and then have skills that fork it into any desired more exotic configuration. Very cool.

Anyway there are many others - e.g. nanobot, zeroclaw, ironclaw, picoclaw (lol @ prefixes). There are also cloud-hosted alternatives but tbh I don't love these because it feels much harder to tinker with. In particular, local setup allows easy connection to home automation gadgets on the local network. And I don't know, there is something aesthetically pleasing about there being a physical device 'possessed' by a little ghost of a personal digital house elf. Not 100% sure what my setup ends up looking like just yet but Claws are an awesome, exciting new layer of the AI stack.
1K
1.3K
17.5K
3.4M