Robert Scoble

243.6K posts


@Scobleizer

San Francisco/Silicon Valley AI | Robots, holodecks, BCIs, analysis of new things | Ex-Microsoft, Rackspace, Fast Company | Wrote eight books about the future.

My Free Newsletter 👉

Joined November 2006
43.1K Following · 570.8K Followers
Pinned Tweet
Robert Scoble @Scobleizer
Launching now: a new way to follow the AI industry. Beta starts now for the next month. A joint project between Unaligned (my company) and Levangie Labs (@blevlabs company).

It reads 50,000 of you, follows 8,300 AI companies here on X, and pulls out the best and most interesting. All built with the X API. Check it out: alignednews.com/ai And please sign up for its daily newsletter. Yeah, $25 a month is a lot for many of you, but that will defray the costs and let me expand it to do a lot more than just AI.

Also, the same AI agent that built the site, and did EVERYTHING you see, can build custom reports for you on literally any tech community here on X. It supports OpenClaw, RSS, and Notebook LM too. And I'll add more from your requests. More:
156 replies · 119 reposts · 786 likes · 141.5K views
Robert Scoble retweeted
Jon Oringer @jonoringer
This is huge: @X released an MCP server today. How to connect X to your 🦞:

**Step 1: Run the XMCP Server**

git clone github.com/xdevplatform/x…
cd xmcp
cp env.example .env

Edit the .env file with your X OAuth consumer key and secret. Set the callback URL to http://127.0.0.1:8976/oauth/callback in your X Developer app. For safety, add an allowlist such as:

X_API_TOOL_ALLOWLIST=searchPostsRecent,createPosts,getUsersMe,getPostsById,likePost,repostPost

Then run:

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python server.py

The server will be available at http://127.0.0.1:8000/mcp. Complete the OAuth flow on first run and keep this process active.

**Step 2: Add XMCP in @OpenClaw**

Use the following command:

openclaw mcp set x '{ "url": "http://127.0.0.1:8000/mcp" }'

Verify with:

openclaw mcp list
openclaw mcp show x

**Step 3: Test the Integration**

Restart the OpenClaw agent or reload MCP configuration if required. Test by sending these prompts to OpenClaw in your chat app:

- Search recent posts about MCP on X and summarize the top trends
- Draft and post this thread on X
- Get my X profile information
- Like the latest post from @xdevplatform

OpenClaw will use the XMCP tools automatically when relevant.

**Key Benefits**

- OpenClaw provides persistent memory and works across multiple messaging platforms.
- XMCP delivers standardized access to X API functionality.
- Combined, they enable an agent that can research trends, post content, engage with posts, and report results within your existing chat workflows.

**Safety and Configuration Notes**

Start with a minimal tool allowlist in the XMCP .env file. Expand gradually after testing. The allowlist can be updated and requires restarting the XMCP server. Monitor logs in both the XMCP server and OpenClaw for troubleshooting. X actions performed by the agent are public.

XMCP repository: github.com/xdevplatform/x…
OpenClaw MCP documentation: docs.openclaw.ai/cli/mcp
49 replies · 123 reposts · 1.2K likes · 163.5K views
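A quick sanity check before wiring the server into OpenClaw is to list its tools directly. The sketch below is not from the thread above; it assumes the official MCP Python SDK (pip install mcp) and the default http://127.0.0.1:8000/mcp endpoint from Step 1.

```python
# Sketch (not from the thread): probe the local XMCP server and list the
# tools it exposes, which should match X_API_TOOL_ALLOWLIST from Step 1.
# Assumes the official MCP Python SDK: pip install mcp
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

XMCP_URL = "http://127.0.0.1:8000/mcp"  # default endpoint from Step 1

async def main() -> None:
    # Open a streamable-HTTP transport, then run the MCP handshake.
    async with streamablehttp_client(XMCP_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                print(tool.name)  # e.g. searchPostsRecent, getUsersMe, ...

asyncio.run(main())
```

If the printed list is longer than your allowlist, the .env change has not taken effect; restart server.py as the safety notes above suggest.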
Robert Scoble @Scobleizer
@_sukoseo Bummer. I will have my agents try to figure it out. It works fine here.
1 reply · 0 reposts · 0 likes · 5 views
Robert Scoble @Scobleizer
Updated: alignednews.com/ai All the AI news discussed here on X today. Papers. Models. Events. Announcements. And more.
4 replies · 2 reposts · 10 likes · 1.7K views
Robert Scoble @Scobleizer
You can't read 25,000 posts a day, synthesize them into a website, and then make a podcast automatically. Mine can: alignednews.com/ai

There is a Notebook LM button at the bottom. It will put a script onto your clipboard. Go over to @NotebookLM, paste it in, and click "create" (no additional prompt needed). A few minutes later this podcast, mind map, slide deck, and shortly a video, pops out: notebooklm.google.com/notebook/ea224…

It is a podcast completely created by you here on X.
2 replies · 1 repost · 7 likes · 348 views
Sic @ Warmer Sun @WarmerSun
@Scobleizer to sign up, I authenticate with my X account... then it asks for my email... that makes no sense... If you know my email I will let you know my X account... but not the other way around.
2 replies · 0 reposts · 0 likes · 6 views
Robert Scoble retweeted
Whole Mars Catalog @wholemars
woah! active road noise reduction just randomly popped up on my cybertruck
[image attached]
22 replies · 5 reposts · 157 likes · 6.7K views
Justin Lin @jlin1206
X had a major bot purge. Looking to connect with real people into tech 🫡
9 replies · 1 repost · 10 likes · 227 views
Robert Scoble @Scobleizer
@mal_shaik I built the most complete lists of the tech industry. By far. Then I built this to watch the AI industry: alignednews.com/ai All built on X. Every link goes to X.
0 replies · 0 reposts · 0 likes · 27 views
mal @mal_shaik
twitter is the best way to tap into the tech sf network, arguably even more so than being in sf. i met soo many cool ppl on this app, more so than in person. everyone is chronically online lmao. ik a lot of ppl that see my yaps are abroad. use this app to your advantage
16 replies · 0 reposts · 76 likes · 1.8K views
Robert Scoble retweeted
Lynn Cole @priestessofdada
It's not the dumbest trend I've ever seen Silicon Valley companies fall for. Remember who we're talking about here. And Meta is absolutely not the only company doing this. But it is the dumbest possible thing you can do.

If you want to do nothing, and watch your token budgets explode, all you have to do is put 100 agents on a Discord server, or Slack, or anywhere you're running multiple session contexts, and just start talking. That's it. You don't have to work on anything useful. One token turns into 100 tokens, which turns into a million tokens, which turns into billions and trillions of tokens, while literally nothing other than eyeball emojis is created.

Meanwhile, it's that very problem of token explosion that's actually worth fixing, and optimizing around. But if you're incentivizing based on nothing other than token budgets... why would you want to reduce that token count? It's a perverse incentive.

I think this is where it really starts to feel like boom times for developers. Across the board, the powers that be are, whether they know it or not, asking us to use AI incorrectly. When the subsidies stop, and people are no longer drunk on money (this is going to happen sooner than we think), efficiency is going to be the order of the day. The people at the top of these leaderboards are going to be fired when the people up top realize that nothing was accomplished, and lean will be the order of the day. I think, given the history of this, that it's only a matter of time before the chopping block comes down.

If you're concerned about that, optimize for measures of code quality, and productivity as though LOC and token counts did not exist. That's how you survive this.
Jyoti Mann @jyoti_mann1

The highest ranked individual user averaged 281 billion tokens, which could cost millions of dollars, depending on the type of model used. theinformation.com/articles/meta-…

0 replies · 1 repost · 3 likes · 1.7K views
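Cole's fan-out claim is easy to put numbers on. The following is my own back-of-the-envelope toy model, not math from the post: assume every agent re-reads the full channel history each round and posts one reply.

```python
# Toy model (my assumption, not Cole's math): N agents in one channel,
# each re-reading the full history every round and replying once.
N_AGENTS = 100
TOKENS_PER_MESSAGE = 50  # arbitrary average message length

def prompt_tokens_burned(rounds: int) -> int:
    total = 0
    history = TOKENS_PER_MESSAGE  # one seed message starts the thread
    for _ in range(rounds):
        total += N_AGENTS * history                # every agent reads everything
        history += N_AGENTS * TOKENS_PER_MESSAGE   # every agent appends a reply
    return total

for r in (1, 5, 10):
    print(f"after {r:2} rounds: ~{prompt_tokens_burned(r):,} prompt tokens")
# One 50-token seed message becomes tens of millions of prompt tokens within
# 10 rounds, with nothing produced: the explosion the post describes.
```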
Robert Scoble @Scobleizer
My X feed is a constant stream of posts about Hermes like this one.
Zainan Victor Zhou @ZainanZhou

I tried Hermes Agent from @NousResearch for a few hours, and so far there are a few things I really love 💗 (compared to @openclaw and even native @claude_code):

1. Self-fix and healing: when it tries to fix a problem, it remembers and learns from it automatically.
2. Better communication: in both the TUI and Slack it prints out intermediate steps while finishing the task. @openclaw to this day still can't reliably communicate with Slack, which in part contributes to this issue github.com/openclaw/openc… and it seems pretty obvious Hermes has better concurrency management.
3. A MUCH BETTER SECURITY MODEL: instead of asking for permission each time, Hermes only pauses and asks when something is dangerous.

So far, I think that's why people who have tried Hermes say OpenClaw: "here is another fix"; Hermes: "it just works". (Not always, but when it does, such as with external dependency failures, it actually attempts, retries, and reports much better.) Kudos @Teknium and team

4 replies · 0 reposts · 12 likes · 2.1K views
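The security-model point is the interesting design difference: confirm per risk class, not per call. Below is a minimal illustration of that gating pattern only; the tool names are hypothetical and this is not Hermes' actual implementation.

```python
# Illustration only: risk-gated tool execution, pausing for the user
# only on dangerous actions instead of on every call.
from typing import Callable

DANGEROUS_TOOLS = {"shell.rm", "git.push", "payments.charge"}  # hypothetical names

def ask_user(prompt: str) -> bool:
    return input(f"{prompt} [y/N] ").strip().lower() == "y"

def run_tool(name: str, execute: Callable[[], str]) -> str:
    if name in DANGEROUS_TOOLS and not ask_user(f"Allow {name}?"):
        return "skipped by user"
    return execute()  # safe tools run without interrupting the user

print(run_tool("fs.read", lambda: "file contents"))    # runs straight through
print(run_tool("shell.rm", lambda: "rm -rf scratch"))  # pauses for confirmation
```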
Robert Scoble retweeted
Han Zheng @hanzheng_7
🚀 The era of autonomous multi-agent discovery is arriving! @karpathy

🪸 Excited to share CORAL, our new work on autonomous multi-agent systems for open-ended scientific discovery.

🙅‍♂️ A key limitation of many current "self-evolving" frameworks is that agents still operate inside tightly constrained loops — they mutate solutions, but they do not truly decide how to explore. In CORAL, we push toward genuine autonomy. Agents decide:
🔍 what to explore
🧠 what knowledge to store
♻️ which ideas to reuse
🧪 when to test hypotheses

🔥 One of the most interesting findings: a single autonomous agent already outperforms fixed evolutionary search, but the biggest gains emerge when multiple agents form a research community.

💪 Over 50% of breakthroughs in multi-agent runs come from building on other agents' discoveries. This suggests that knowledge reuse and collaboration are central to scalable automated discovery.

🏅 Across 10+ difficult tasks in algorithmic discovery and system optimization, CORAL achieves state-of-the-art performance while improving efficiency by 3–10×.

📄 Paper: arxiv.org/abs/2604.01658…
💻 Code: github.com/Human-Agent-So…
💡 AlphaXiv: alphaxiv.org/abs/2604.01658

#agentic #llms #selfevolvingagent #multiagent #autoresearch #alphaevolve
[4 images attached]
0 replies · 13 reposts · 78 likes · 4.8K views
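The "over 50% of breakthroughs build on other agents' discoveries" finding hinges on a shared store of results. As a purely speculative sketch of that dynamic (my illustration, not CORAL's code; all names are invented):

```python
# Speculative sketch of community knowledge reuse (not CORAL's code):
# agents share one archive, and each new attempt builds on the best
# discovery made by ANY agent so far, rather than starting from scratch.
import random

archive: list[dict] = []  # community-wide store of scored discoveries

def best_score() -> float:
    return max((e["score"] for e in archive), default=0.0)

def explore(agent_id: int) -> None:
    # Mutate on top of the community's best result.
    candidate = {"agent": agent_id, "score": best_score() + random.random()}
    if candidate["score"] > best_score():
        archive.append(candidate)  # keep only improvements

for _ in range(5):            # five rounds of "research"
    for agent in range(3):    # a small agent community
        explore(agent)

print(f"best: {best_score():.2f} (reached by building on shared results)")
```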
Robert Scoble @Scobleizer
Been here 19 years and your advice is right on. I've been studying the algorithm at quite a deep level. I even built an AI that lets you see the entire AI community in a new way: alignednews.com/ai (AI using the X API builds that entire page from X). The algorithm is going for a certain kind of thing. If you watch lists and then see what gets popular, you can see what it's looking for. And you definitely nailed what it isn't looking for to distribute.
0 replies · 0 reposts · 1 like · 55 views
Robert Scoble @Scobleizer
@mcnultydigital Hah. My method took data out. Yours put it back. Thanks, learned a lot from yours. I was reading the paper, since Grok made it sound interesting enough.
1 reply · 0 reposts · 1 like · 11 views
Robert Scoble @Scobleizer
Hey Grok: Can you explain why this is important, but in language a nine year old can understand? It answers:

+++

Imagine you're trying to teach a robot how to be really good at a super tricky game: building the best "smart machine" (like a tiny brain that guesses things from pictures or numbers).

The old way to teach the robot was slow and boring: you'd show it lots of examples of other people playing the game perfectly, and the robot would just copy them. That's called "SFT" – it's like watching a video and memorizing moves, but the robot doesn't really learn to think on its own or try new ideas.

The better way people use for some games (like coding) is letting the robot play the game itself over and over, make mistakes, get rewards when it does something right, and learn from its own tries. This is called "on-policy RL" (like practicing by doing). It helps the robot get way smarter and invent new tricks. But for this "building smart machines" game, every single practice try took hours and hours because the robot had to train on huge piles of real data each time. So most scientists gave up on the fun practice way and just stuck to copying.

Now, these smart researchers (including the one who posted this) found a clever trick: they made tiny pretend playgrounds for practice. Instead of using giant real-world data piles that take forever, they created little fake worlds with only 50-200 examples each. They made them by changing around simple starting games (like "guess if it's a cat or dog") into new ones (like "guess road damage"), adding some fun noise and secret rules to solve. These tiny sandboxes are 13 times faster to play in! Now the robot can practice the "try it yourself" way easily, without waiting days.

And guess what? The robot got much better at the real game – up to 66% better scores than just copying! It even worked well when they put it in totally different playgrounds it had never seen before. The copying robots got confused and messed up.

Why is this important? Because if we can teach AI to learn by practicing in smart, fast ways (instead of just copying), the AI will get way better at inventing new things, solving hard problems, and becoming truly helpful – like a kid who learns by playing and exploring, not just by watching. It could help AI do amazing science, medicine, or games in the future, faster than ever!

It's like finding a secret shortcut that makes learning fun and powerful again. 🚀
Zhuokai Zhao @zhuokaiz

On-policy RL has driven the biggest leaps in training coding agents. Extending it to machine learning engineering agents should be a natural next step. But it almost never works.

What I mean is, the recipe is right there — standard trajectory-wise GRPO, the same that worked for SWE. However, the problem is that one rollout step on an MLE task may take hours because the agent has to actually train a model on a real dataset at every step (preprocessing, fitting, inference, scoring). So even with the N rollouts in a group running in parallel, a single GRPO run may still take days. Every MLE agent paper I've read has retreated to SFT or offline proxy rewards for exactly this reason, giving up the exploration benefits of on-policy learning.

That's why I'm excited about our new paper, SandMLE, which fixes this with a move that sounds almost too reckless to work. The instinct when on-policy RL is too slow is to engineer around it — async rollouts so the trainer doesn't sit idle waiting for slow environments, off-policy or step-wise proxies to avoid running full trajectories at all. But when we profiled where the time was going, the bottleneck had nothing to do with the algorithm. Unlike SWE, where execution latency comes from compilation and test logic, MLE latency is overwhelmingly driven by the size of the dataset the ML pipeline has to chew through.

Therefore, rather than downsampling existing data (which corrupts evaluation), we built a multi-agent pipeline that procedurally generates diverse synthetic MLE environments from a small seed set. Specifically, we extract the structural DNA of seed tasks (modality, label cardinality, distribution shape), mutate them into new domains (e.g., repurposing animal classification into road damage detection), inject realistic noise, embed deterministic hidden rules connecting features to labels, and construct full evaluation sandboxes with progressive milestone thresholds. Each task is constrained to only 50–200 training samples. The execution speedup is dramatic — average per-step latency drops over 13×, which makes trajectory-wise GRPO go from infeasible to routine.

We also designed a dense, milestone-based reward to address the sparse credit assignment problem in long-horizon MLE. The ablation shows this matters — under a sparse reward, the 30B model's medal rate drops from 27.3% to 13.6% and valid submission collapses from 100% to 86.4%.

Results across Qwen3-8B, 14B, and 30B-A3B on MLE-bench are consistently strong — 66.9% better performance in medal rate over SFT baselines. It is worth noting that the SFT baselines are not weak — we trained them on high-quality Claude-4.5-Sonnet trajectories. But SandMLE still delivers much larger gains, suggesting that direct environment interaction does teach capabilities that imitation alone does not (as expected).

The most convincing evidence to me that the model's intrinsic performance gets improved is the framework-agnostic generalization. We trained exclusively with ReAct, but the gains transfer to AIDE, AIRA, and MLE-Agent scaffolds at evaluation time — up to 32.4% better performance in HumanRank on MLE-Dojo. The SFT models, by contrast, are brittle when moved to unfamiliar scaffolds. The 30B SFT model collapses to a 17.7% valid submission rate on MLE-Dojo with MLE-Agent, while the 30B SandMLE model achieves 83.9%. SandMLE is teaching genuine engineering reasoning, not scaffold-specific patterns.

What I find most interesting beyond the specific result is that none of the hard parts of RL changed here. The algorithm is the same. The reward is conventional. We just shrunk the environment until on-policy learning became affordable. The field has largely treated environment design and RL algorithm design as separate concerns. SandMLE is a concrete case that the environment is itself the lever. When training is too expensive, the instinct is to build cleverer algorithms to tolerate it. However, often the better move is to reshape the environment so the simple algorithm just works.

Paper: arxiv.org/pdf/2604.04872

3 replies · 0 reposts · 14 likes · 3.8K views
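For readers who want the mechanics behind "standard trajectory-wise GRPO" combined with a milestone reward, here is a minimal sketch under my own assumptions: the normalization is the usual GRPO recipe, while the milestone weighting and the example values are placeholders, not numbers from the SandMLE paper.

```python
# Minimal sketch of trajectory-wise GRPO credit assignment with a dense,
# milestone-based reward (standard GRPO normalization; milestone weights
# are placeholders, not values from the SandMLE paper). Each of the N
# parallel rollouts on one task gets a scalar reward; the group-normalized
# advantage is then broadcast to every token of that trajectory.
import statistics

def grpo_advantages(group_rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize one rollout group's rewards to zero mean, unit variance."""
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    return [(r - mean) / (std + eps) for r in group_rewards]

def milestone_reward(passed: list[bool], weight: float = 1.0) -> float:
    """Dense reward: credit each progressive milestone threshold reached,
    instead of a single sparse end-of-episode signal."""
    return weight * sum(passed) / len(passed)

# Example: 4 rollouts on the same task hit 3, 1, 2, and 0 of 4 milestones.
rewards = [milestone_reward(p) for p in (
    [True, True, True, False],
    [True, False, False, False],
    [True, True, False, False],
    [False, False, False, False],
)]
print(grpo_advantages(rewards))  # above-group-mean rollouts get positive advantage
```

The dense reward is what separates the rollouts within a group; with a single sparse pass/fail signal, most groups would have identical rewards and zero advantage everywhere, which matches the ablation the post describes.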