Massively Parallel Procrastinator

4.1K posts

@SHELLEYBLEND

Shelley the blender (∂ + m) ψ = 0 Quantum Entanglement

Joined October 2013
249 Following 107 Followers
Midjourney
Midjourney@midjourney·
Announcing a new division of Midjourney called "Midjourney Medical"
English
121
102
750
38.2K
Massively Parallel Procrastinator retweeted
mr-r0b0t
mr-r0b0t@mr_r0b0t·
Here's a nice reminder that your @NousResearch Hermes sessions are a GOLD MINE for training 😍 hermesbench about to get an upgrade!
English
3
1
23
949
氷見の女将アナウンサー
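"Michi-no-Eki Himi Banya-gai" (道の駅ひみ番屋街) is a food theme park that gathers all of Himi's best food in one place 😋 We locals go there often too 💨 We introduced five outstanding local specialties that will leave you wondering what to eat first: rock oysters, Himi udon, Himi curry, Himi Purin, and Himi Koppe ✨ Enjoy the flavors of Himi until you are full 🦪

\ Here's how to get there /
🚄 Take the Hokuriku Shinkansen or a JR train to Toyama Station, transfer to the local line and ride to the last stop, Himi Station. From there it is about 5 minutes by bus or taxi.
🚗 By car: from the Tokyo area, take the expressway to the Himi IC on the Noetsu Expressway. Banya-gai is about 8 minutes from there, and a spacious free parking lot is available.

After enjoying the food at Banya-gai, please come to "Minshuku Aomasa" ♨️ You can enjoy fresh seafood landed in Himi to your heart's content! 🐟✨ And of course there is a range of local sake 🍶🤤💕 The proprietress will suggest her recommendations 😉 Reservations and plan details are available from the link in our profile 📱

This weekend, treat yourself to a trip of outstanding food and sauna. We look forward to seeing you in Himi City, Toyama Prefecture! @himibanya #ひみ番屋街 #富山グルメ #サウナ旅
Japanese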
9
46
300
5.8K
Polymarket
Polymarket@Polymarket·
JUST IN: Meta’s CTO says morale is near “the worst it’s ever been” — leadership will offer increased snack budgets to lift spirits.
English
798
542
12K
4.4M
Massively Parallel Procrastinator retweeted
Jim Fan
Jim Fan@DrJimFan·
I made Physical AutoResearch sound simple (conceptually), but it took a village to pull off and lots of design thinking into the robot /loopcraft. The hardest part is everything we need to set up *before* pressing Enter. Here's a behind-the-scenes tour:

1. Safety harness
Letting 8 robots run unattended overnight means safety has to be more than a hint in the system prompt. ENPIRE hardwires it in 2 layers: (1) a hard kinematic limit that trips an immediate task failure and auto-resets as soon as a robot leaves its safety envelope, and (2) a torque-limited compliant gripper so a bad contact or misaligned insertion ends in a safe stall, instead of crushing the robot or the object at hand. We make safety more conservative than usual so humans can sleep tight. In reality, we still need a few human operators to watch over the "robots of loving grace".

2. Definition of /done
An agent that can edit its own reward will game it for sure. ENPIRE fixes the goalposts before the fleet can move them. Here's the recipe:
Collect a few minutes of success & failure demos
-> Ask agent to write code using computer vision tools to classify success and measure against groundtruth
-> Agent hill-climbs on classifier until reliably good
-> This classifier becomes the real-time reward function that directly computes on sensor streams
-> *Freeze* the reward function before AutoResearch. It's sacred, enshrined in a Gym env that no one can touch.

3. System telemetry design
Robot-seconds is by far the scarcest resource, followed by GPU-seconds, and finally tokens. We instrument all three and surface them to ENPIRE for live resource awareness rather than letting it hill-climb in a vacuum. We define:
- Mean Robot Utilization ("MRU"): the fraction of wall-clock time when the robot is actively executing an experiment. Otherwise the hardware is sitting idle and waiting for the next code commit.
- Mean Token Utilization ("MTU"): tokens consumed per minute, our proxy for how hard the agent is actually thinking. A low MTU means the agent is stalled, waiting on a robot rollout to finish instead of doing research.
- GPU utilization: fraction of wall-clock time when GPU is active.

... and evaluate on two budget-to-outcome metrics:
1. Tokens-to-Success: token budget the fleet burns to complete /goal.
2. Time-to-Success: wall-clock time to /goal
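
The three telemetry definitions in the post above can be turned into arithmetic directly. The following is only a minimal sketch, not ENPIRE's instrumentation: the function names, the `(start, end)` interval format in seconds, and the sample numbers are all assumptions made for illustration.

```python
# Hypothetical helpers illustrating the MRU / MTU / GPU-utilization definitions.
# Nothing here comes from ENPIRE; names and data layout are assumptions.

def covered_seconds(intervals, window_start, window_end):
    """Seconds of the window covered by the intervals, counting overlaps once."""
    total = 0.0
    cursor = window_start
    for start, end in sorted(intervals):
        start = max(start, cursor)      # skip time already counted
        end = min(end, window_end)      # clip to the observation window
        if end > start:
            total += end - start
            cursor = end
    return total


def utilization(active_intervals, window_start, window_end):
    """Fraction of wall-clock time a resource (robot or GPU) was active."""
    wall_clock = window_end - window_start
    if wall_clock <= 0:
        raise ValueError("window must have positive duration")
    return covered_seconds(active_intervals, window_start, window_end) / wall_clock


def mean_token_utilization(total_tokens, window_start, window_end):
    """Tokens consumed per minute of wall-clock time."""
    wall_clock = window_end - window_start
    if wall_clock <= 0:
        raise ValueError("window must have positive duration")
    return total_tokens / (wall_clock / 60.0)


# Made-up 10-minute window: the robot ran two experiments, the GPU ran once.
robot_runs = [(0, 120), (300, 480)]
gpu_runs = [(0, 150)]

mru = utilization(robot_runs, 0, 600)            # 300 s busy out of 600 s
gpu_util = utilization(gpu_runs, 0, 600)         # 150 s busy out of 600 s
mtu = mean_token_utilization(45_000, 0, 600)     # 45,000 tokens over 10 minutes
print(mru, gpu_util, mtu)
```

With these made-up numbers the robot sits idle half the time, which is the situation the post describes as hardware waiting for the next code commit.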
Jim Fan@DrJimFan

Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy but stay safe, don't waste precious compute. Make no mistake.

Then humans step aside and our watch begins. The robot fleet starts to come alive: they learn to look for visual clues, reset the scene, practice novel skills, tinker with control stack, read papers online, debate, reflect, get stuck, and try again directly on the hardware. All we did is to give Codex an API to the world of atoms, and the rest is emergence.

ENPIRE is able to solve high-precision tasks like tying zip-ties, organizing fine pins, and installing GPUs all by itself. We also discovered a new type of "physical scaling": 8 robots exploring in parallel improves significantly faster than fewer ones.

A part of our NVIDIA GEAR lab now self-improves tirelessly overnight. We just read the reports in the morning. /goal: we all take a holiday and Jensen wouldn't even notice ;)

We will be open-sourcing everything, so you can host your self-running robot lab at home too! Deep dive in the thread:

English
23
43
491
51.9K
Morgan
Morgan@morganlinton·
Morgan@morganlinton

Yesterday was a very surreal day in many ways. Waking up to see that SpaceX had acquired Cursor, and then a few hours later, watching @mntruell deliver his keynote on stage at Compile, Cursor's first user conference, was just incredible. I think this is a truly historic keynote for many reasons, very glad I recorded it, excited to share the entire thing with all of you. What an incredible journey.

English
3
1
11
1K
Morgan
Morgan@morganlinton·
Sooooo my video of @mntruell's keynote yesterday is 4GB, can't get it to upload to X. And I know nothing about video or video editing. Learning in realtime how to get this compressed down to a normal size, and adding closed captions to it because I remember @im_roy_lee saying this is important and he knows his stuff. Hopefully I can get it posted here this morning, it was a damn good keynote, on an incredibly special morning for Michael.
English
19
1
84
9.2K
Morgan
Morgan@morganlinton·
Bingo.
Español
2
0
15
996
Garry Tan
Garry Tan@garrytan·
I think you still need both, but the main lede is: technical founders now have access to business thinking Business founders now have access to technical thinking Net net: more startups that actually work, period
Romàn@romanbuildsaas

No one wants to admit this, but the Steve Wozniak / Steve Jobs era of "technical founder + business founder" is over.

For 40 years the model was the same. One founder builds. One founder sells. That split made sense when writing code took a CS degree and distribution took a budget. It doesn't anymore. AI killed the distance between technical and non-technical.

Now only one role really matters: The fullstack founder. The person who can do all of it, with AI carrying the weight. Here's what that actually means in 2026:

1. You can build, even if you've never written code. Agentic coding tools turn plain English into real, shippable products. You describe what you want, it writes it, you keep iterating. "I'm not technical" stopped being a reason to go find a cofounder.

2. You can fill your pipeline without a sales team. AI GTM agents like GojiberryAI watch for buying signals, run the outreach, and book demos while you sleep. No scraping lists, no firing off 200 cold DMs by hand. You just show up to the calls that are already warm.

3. You can create demand, not just chase it. One good post on X, LinkedIn or TikTok can put your product in front of millions, for free. Distribution used to be a budget line you needed money to unlock. Now it's a skill you learn and run yourself.

4. You can actually sell. None of the above matters if you can't get a real person to say yes. Selling is still the one thing no tool does for you, and the founder who can build, market AND close has an edge nobody can compete with.

We started as 3 founders. With AI covering the work that used to need whole departments, we hit $3M ARR and 2,000+ customers in under 1 year. Each of us runs across product, outbound, content and sales.

The founders winning today aren't the most technical. They're not the best marketers either. They're the ones who refused to pick a lane. You don't need a cofounder who completes you anymore. You need to become the whole stack.

English
37
16
241
24.8K
David Hendrickson
David Hendrickson@TeksEdge·
This is pretty insane! You can run one of the top LLMs, GLM-5.2, at home on 3 x DGX-Sparks, available at Amazon and Best Buy.
jietang@jietang

We're introducing GLM-5.2, our latest flagship model for long-horizon tasks. It marks a substantial leap in long-horizon task capability over its predecessor GLM-5.1 and, for the first time, delivers that capability on a solid 1M-token context.

GLM-5.2's new capabilities include:
- Solid 1M Context: A solid 1M-token context that stably sustains long-horizon work
- Advanced Coding with Flexible Effort: Stronger coding capabilities with multiple thinking effort levels to balance performance and latency
- Improved Architecture: We propose IndexShare, which reuses the same indexer across every four sparse attention layers, reducing per-token FLOPs by 2.9× at a 1M context length. We also improve GLM-5.2's MTP layer for speculative decoding, increasing the acceptance length by up to 20%
- Pure Open: An MIT open-source license: no regional limits, technical access without borders

Supporting long-horizon tasks starts with making long context engineering-usable: the model must maintain quality across long, messy coding-agent trajectories, not just accept more tokens. A 1M context is easy to claim, but much harder to keep reliable under real engineering pressure. To this end, we substantially expanded 1M-context training for coding-agent scenarios, covering large-scale implementation, automated research, performance optimization, and complex debugging. The result is a long-context system that is not only wide in scope, but solid in execution: a practical substrate for sustained engineering work.

This capability is reflected in GLM-5.2's performance on three long-horizon coding benchmarks.

FrontierSWE measures whether an agent can complete open-ended technical projects at the scale of hours to tens of hours, spanning systems optimization, large-scale code construction, and applied ML research. On this benchmark, GLM-5.2 trails Opus 4.8 by only 1%, while edging out GPT-5.5 by 1% and Opus 4.7 by 11%.

On PostTrainBench, where each agent is given an H100 GPU and evaluated by how much it can improve small models through post-training, GLM-5.2 outperforms both Opus 4.7 and GPT-5.5, ranking second only to Opus 4.8.

On SWE-Marathon, an ultra-long-horizon software engineering benchmark covering tasks such as building compilers, optimizing kernels, and developing production-grade services, GLM-5.2 still has room to grow, trailing Opus 4.8 by 13% while remaining second only to the Opus series.

Across all three benchmarks, GLM-5.2 is the highest-ranked open-source model, showing that its 1M context has translated into practical long-horizon delivery capability.
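
The IndexShare idea in the post (one indexer reused across every four sparse attention layers) can be pictured as a simple layer-to-indexer mapping. This is only a sketch of one plausible reading, namely that consecutive layers are grouped in fours: the function names and the 8-layer example are hypothetical, and it says nothing about how the reported 2.9× FLOP reduction is obtained.

```python
# Hypothetical illustration of sharing one indexer per group of four layers.
# The grouping of consecutive layers is an assumption, not the GLM-5.2 design.

def indexer_for_layer(layer_idx, share_every=4):
    """Index of the shared indexer that serves a given sparse attention layer."""
    return layer_idx // share_every


def indexer_runs(num_sparse_layers, share_every=4):
    """How many indexer passes are needed when each pass serves a whole group."""
    return -(-num_sparse_layers // share_every)  # ceiling division


layers = 8  # made-up layer count for the example
mapping = [indexer_for_layer(i) for i in range(layers)]
print(mapping)               # which indexer each layer reuses
print(indexer_runs(layers))  # indexer passes with sharing, versus 8 without
```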

English
11
4
41
5.8K
Sudo su
Sudo su@sudoingX·
i can tell exactly who read this one. accounts i remember as plain no badge regulars keep turning up in my replies with the premium+ badge now, followers climbing. i see you. it's working. if you're on a flat account wondering how people suddenly break out, the whole playbook is right up there. no fluff, go read it. and @nikitabier, i'm closing premium+ signups better than your landing page this week. won't say no to a bonus.
Sudo su@sudoingX

x.com/i/article/2065…

English
7
1
39
3.5K