

Massively Parallel Procrastinator
@SHELLEYBLEND
Shelley the blender (∂ + m) ψ = 0 Quantum Entanglement






This is Alex Finn. He's cost so many people their hard-earned money with his Mac mini grift. Now he'll reuse the same script for the DGX Spark. He doesn't know how to highlight any hardware strengths or weaknesses. Zero substance or knowledge of local AI. Go grift something else.



Do you know which agent skills are useless in your setup? The ones you installed just in case and have probably forgotten about by now?

A Lighthouse layer replaces standard scaled dot-product attention with four stages that surround, but do not modify, the attention kernel. Q, K, and V are average-pooled into an L-level pyramid with pooling factor p. Per-head norms score every pyramid entry, and a coarse-to-fine top-k cascade selects survivors at each level. The chosen entries are gathered into a contiguous, causally sorted sub-sequence on which standard attention runs, and the outputs are scattered back to their base positions. Because the gathered sub-sequence is dense and topologically causal, the standard lower-triangular mask works as-is, the forward and backward pass of the attention itself remains unchanged, and every upstream attention improvement is inherited for free.

The trained model has to remain a competent dense-attention model after sparse training, so the recipe is two-stage. For the majority of the budget, the model trains with Lighthouse selection enabled. For a brief tail, selection is disabled and training continues under standard attention, with the same optimizer state and the dataloader continuing where it left off. We treat this as the load-bearing claim of the work: sparse training does not compromise the model's ability to use full attention at inference.
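As a rough illustration of the four stages (pool, score, cascade, gather-attend-scatter), here is a minimal PyTorch sketch. It is not the authors' code or kernel: it averages the per-head scores into one shared ranking, resolves survivors all the way down to base-level token indices instead of keeping multi-resolution pyramid entries, and lets unselected positions fall back to their own value vectors. The name `lighthouse_attention`, the `k_per_level` budget, and those simplifications are assumptions for illustration only.

```python
# Minimal sketch of the selection pipeline described above, not the authors'
# implementation. Simplifying assumptions: selection is shared across queries
# and heads, survivors are resolved to base-level token indices, and
# unselected positions keep their own value vector.
import torch
import torch.nn.functional as F


def lighthouse_attention(q, k, v, p=4, L=3, k_per_level=64):
    B, H, S, D = q.shape
    device = q.device
    assert S % (p ** L) == 0, "sketch assumes the sequence divides the coarsest pool"

    # 1. Average-pool K into an L-level pyramid (the full method pools Q and V
    #    the same way; only K norms are needed for scoring in this sketch).
    pyramid = [k]
    for _ in range(L):
        prev = pyramid[-1]
        pyramid.append(prev.reshape(B, H, prev.shape[2] // p, p, D).mean(dim=3))

    # 2./3. Coarse-to-fine top-k cascade on key norms (per-head norms averaged
    #       into one shared ranking here). Start with every entry at the
    #       coarsest level, keep the top-k, expand each survivor into its p
    #       children one level down, and repeat until the base level.
    survivors = torch.arange(pyramid[-1].shape[2], device=device)
    for level in range(L, 0, -1):
        scores = pyramid[level].norm(dim=-1).mean(dim=(0, 1))   # (entries,)
        keep = min(k_per_level, survivors.numel())
        kept = survivors[scores[survivors].topk(keep).indices]
        # expand survivors to their children at the next finer level
        survivors = (kept.unsqueeze(1) * p + torch.arange(p, device=device)).reshape(-1)

    # final prune at the base level
    base_scores = pyramid[0].norm(dim=-1).mean(dim=(0, 1))
    keep = min(k_per_level, survivors.numel())
    survivors = survivors[base_scores[survivors].topk(keep).indices]

    # 4. Gather a contiguous, causally sorted sub-sequence.
    idx = survivors.sort().values
    q_s, k_s, v_s = q[:, :, idx], k[:, :, idx], v[:, :, idx]

    # 5. Standard causal attention on the dense sub-sequence; the usual
    #    lower-triangular mask is valid because positions are sorted.
    out_s = F.scaled_dot_product_attention(q_s, k_s, v_s, is_causal=True)

    # 6. Scatter outputs back to their base positions; unselected positions
    #    fall back to their own value vector in this sketch.
    out = v.clone()
    out[:, :, idx] = out_s
    return out


if __name__ == "__main__":
    q, k, v = (torch.randn(2, 4, 1024, 64) for _ in range(3))
    print(lighthouse_attention(q, k, v).shape)   # torch.Size([2, 4, 1024, 64])
```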

AI teams shouldn’t have to choose between expensive object storage and painful git workflows.

@huggingface Storage is built for model weights, datasets, checkpoints, and artifacts:
- simple per-TB pricing
- built-in CDN
- Xet deduplication
- private by default when needed

Store your AI data where your AI work already happens: huggingface.co/storage







Today we release Lighthouse Attention, a selection-based hierarchical attention for long-context pre-training that delivers a 1.4-1.7× wall-clock speedup at 98K context. It runs the same forward+backward pass ~17× faster than standard attention at 512K context on a single B200, without a custom sparse attention kernel, a straight-through estimator, or an auxiliary loss. During training, queries, keys, and values are pooled symmetrically into a multi-resolution pyramid. We then score every pyramid entry with per-head norms, a top-k cascade selects a small hierarchical dense sub-sequence, and after a sorting pass that enforces causality, we use standard attention for token mixing. A brief full-attention resume at the end converts the checkpoint back into a competent dense-attention model. We validated this using 530M-parameter Llama-3 models trained on 50B tokens, with benchmarks at up to 1M-token context across 32 B200s under context parallelism. The work on Lighthouse Attention was led by @bloc97_, @SubhoGhosh02, and @theemozilla.
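Since the full-attention resume is what keeps the checkpoint usable as a dense-attention model, here is a toy sketch of that two-stage schedule, not the released training code. The `ToyBlock` module, the 95/5 split, the norm-based top-k stand-in for Lighthouse selection, and the synthetic loss are all assumptions for illustration; the posts only say selection is on for most of the budget and switched off for a brief tail while the optimizer state and dataloader carry on unchanged.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyBlock(nn.Module):
    """Toy causal attention block with a switchable selection path."""

    def __init__(self, dim=64, heads=4, keep=32):
        super().__init__()
        self.h, self.keep = heads, keep
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, sparse=True):
        B, S, D = x.shape
        q, k, v = self.qkv(x).view(B, S, 3, self.h, D // self.h).permute(2, 0, 3, 1, 4)
        if sparse:
            # crude stand-in for Lighthouse selection: keep the positions with
            # the largest mean key norm, in ascending (causal) order
            idx = k.norm(dim=-1).mean(dim=(0, 1)).topk(self.keep).indices.sort().values
            out_sel = F.scaled_dot_product_attention(
                q[:, :, idx], k[:, :, idx], v[:, :, idx], is_causal=True
            )
            out = torch.zeros_like(q)
            out[:, :, idx] = out_sel          # scatter back to base positions
        else:
            # stage 2: plain dense causal attention with the same weights
            out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(out.transpose(1, 2).reshape(B, S, D))


model = ToyBlock()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
total_steps, sparse_fraction = 200, 0.95      # assumed split for the "brief tail"

for step in range(total_steps):
    x = torch.randn(4, 128, 64)
    # one optimizer state and one data stream throughout; only the attention
    # path flips from selection to full attention for the final steps
    out = model(x, sparse=step < sparse_fraction * total_steps)
    loss = out.pow(2).mean()                  # synthetic objective for the toy
    loss.backward()
    opt.step()
    opt.zero_grad()
```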






One of my favorite submissions from our hackathon. I especially loved the part of the intro video between 00:20 and 00:30; it made me laugh out loud. It seems like the quality of submissions just goes up each time we do one of these ^^.













