Migz

324 posts

@calargy_bit

building https://t.co/VlfnxTWacL - AI-assisted gym workouts

Joined August 2025
50 Following · 20 Followers
Migz retweeted
hardmaru @hardmaru
The human brain 🧠 is incredibly efficient because it only activates the specific neurons needed for a thought. Modern LLMs naturally try to do this too (>95% of neurons in feedforward layers stay silent for any given word), but our hardware punishes them for it.

One of the most frustrating paradoxes in deep learning: making a model do less math often makes it run slower. Why? Because unstructured sparsity introduces irregular memory access, and GPUs are built for predictable, dense blocks of math.

We teamed up with @NVIDIA to try to fix this hardware mismatch. Instead of forcing the GPU to adapt to the sparsity, we built a "Hybrid" format that reshapes the sparsity to fit the GPU. Our sparsity format (TwELL) dynamically routes the 99% of highly sparse tokens through a fast path, and uses a dense backup matrix as a safety valve for the rare, heavy tokens.

Through TwELL and a new set of custom CUDA kernels for both LLM inference and training, we translated theoretical sparsity into actual wall-clock speedups: >20% faster training and inference on H100 GPUs, while also cutting energy consumption and memory requirements.

Paper: arxiv.org/abs/2603.23198
Blog: pub.sakana.ai/sparser-faster…
Code: github.com/SakanaAI/spars… ⚡️
[image]
Sakana AI @SakanaAILabs

How do we make LLMs faster and lighter? Don't force the GPU to adapt to sparsity. Reshape the sparsity to fit the GPU! ⚡️

Excited to share our new #ICML2026 paper in collaboration with @NVIDIA: "Sparser, Faster, Lighter Transformer Language Models". This work introduces new open-source GPU kernels and data formats for faster inference and training of sparse transformer language models:

Paper: arxiv.org/abs/2603.23198
Blog: pub.sakana.ai/sparser-faster…
Code: github.com/SakanaAI/spars…

While LLMs are undoubtedly powerful, they are increasingly expensive to train and deploy, with a large part of this cost coming from their feedforward layers. Yet an interesting phenomenon occurs inside these layers: for any given token, only a small fraction of the hidden activations actually matter. The rest approximate zero, wasting computation. With ReLU and very mild L1 regularization, this sparsity can exceed 95% with little to no impact on downstream performance.

So, can we leverage this sparsity to make LLMs faster? The challenge is hardware. Modern GPUs are optimized for dense matrix multiplications. Traditional sparse formats introduce irregular memory access and overheads that cancel out their theoretical savings for GEMM operations.

Our contribution is twofold:
1/ We introduce TwELL (Tile-wise ELLPACK), a new sparse packing format designed to integrate directly into the same optimized tiled matmul kernels without disrupting execution.
2/ We develop custom CUDA kernels that fuse multiple sparse matmuls to maximize throughput and compress TwELL into a hybrid representation that minimizes activation sizes.

We used our kernels to train and benchmark sparse LLMs at billion-parameter scales, demonstrating >20% speedups and even higher savings in peak memory and energy. This work will be presented at #ICML2026. Please check out our blog and technical paper for a deep dive!
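A rough sketch of the two ingredients the thread describes, in plain PyTorch. This is an illustration of generic ELLPACK-style packing, not the paper's actual TwELL format: `pack_ellpack`, the -2.0 shift (a stand-in for the L1-regularized training mentioned above), and all sizes are made up for the demo.

```python
import torch

torch.manual_seed(0)

# Toy ReLU feedforward layer. The -2.0 shift artificially pushes most
# pre-activations below zero so ReLU leaves >95% of them at exactly 0,
# mimicking the sparsity the tweet attributes to ReLU + mild L1.
d_model, d_ff, n_tokens = 256, 1024, 8
W_in = torch.randn(d_model, d_ff) / d_model**0.5
x = torch.randn(n_tokens, d_model)
h = torch.relu(x @ W_in - 2.0)

print(f"activation sparsity: {(h == 0).float().mean().item():.1%}")

def pack_ellpack(h: torch.Tensor):
    """Generic ELLPACK-style packing (illustrative, not TwELL itself):
    every row gets the same number of slots, so the layout stays
    rectangular and GPU-friendly; short rows are zero-padded."""
    width = int((h != 0).sum(dim=1).max())  # common padded row width
    cols = torch.zeros(h.shape[0], width, dtype=torch.long)
    vals = torch.zeros(h.shape[0], width)
    for i in range(h.shape[0]):
        idx = (h[i] != 0).nonzero(as_tuple=True)[0]
        cols[i, : len(idx)] = idx
        vals[i, : len(idx)] = h[i, idx]
    return cols, vals

cols, vals = pack_ellpack(h)
print("packed:", tuple(vals.shape), "vs dense:", tuple(h.shape))

# The second matmul now only gathers the rows of W_out that are actually
# used; padded slots carry value 0, so they contribute nothing.
W_out = torch.randn(d_ff, d_model) / d_ff**0.5
out = torch.einsum("nk,nkd->nd", vals, W_out[cols])
assert torch.allclose(out, h @ W_out, atol=1e-4)
```

The real TwELL format does this tile-wise and fuses it into optimized matmul kernels; what the rectangular padding illustrates is the key property that every row has the same width, so memory access stays predictable instead of irregular.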

Migz @calargy_bit
App has been approved for the App Store 🙏🏽 Only Android is missing, but this feels good. Now the real work begins
Migz @calargy_bit
@FiachraRM It sometimes helps. I was recently rejected by Apple because I was missing something their guidelines require, and honestly it will make my application better, so I appreciate that from them
Fiachra (Fiki) 马骏 @FiachraRM
do not ever get discouraged by an App Store rejection. it is perfectly normal and happens to most people. i need you to keep going.
Migz @calargy_bit
Datfit didn't have support for pounds as a weight unit, just kilograms. Recently got some feedback to include pounds; it feels good to know someone from another place is actually using the app and providing useful comments.
Migz @calargy_bit
Building on the go with Codex has made me rethink some of my merge strategy. Not long ago I started rebasing onto main instead of squashing, but ever since I started a feedback loop with Codex I see a number of self-correction commits. They look good because you can see the thinking and resolution process, but they make the git history horrendous. I think I will go back to squash merges :)
Migz @calargy_bit
Just submitted my app for review to the App Store. This is quite exciting; the work just starts now. Google, on the other hand, will have to wait for at least 2 more weeks while it's in closed testing 🫨
Migz @calargy_bit
I'm running an Android closed test for Datfit. For different reasons, I think this is a good exercise to expose a product, even if only through testing, to potential customers. It'll take 2 weeks before I can even apply for prod. I can try iOS for prod, however :)
Migz @calargy_bit
Has anyone built an app with support for LiveActivity/Dynamic Island? What did you use it for? What challenges did you encounter? I'm exploring this since it seems like a very high quality-of-life feature, but I don't know how common it is to support it or how people feel about it
Migz @calargy_bit
No one told me submitting an app to Google was this hard lol
Migz @calargy_bit
Testing actual recommendations on workouts that can be applied automatically to the following weeks. You'll get a changeset of the actual modifications if there are suggestions. Following weeks happen to have a follow-up as well
[image]
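Purely as illustration (hypothetical names throughout; this is not Datfit's actual data model), a changeset like the one described could be a list of explicit field-level modifications that the user can review before they're applied to the following weeks:

```python
from dataclasses import dataclass

# Hypothetical sketch of a recommendation changeset: each entry names the
# week, exercise, and field being modified, so the user can review the
# diff before it touches the plan.
@dataclass
class WorkoutChange:
    week: int
    exercise: str
    field: str        # e.g. "weight_kg", "reps", "sets"
    old_value: float
    new_value: float
    reason: str

def apply_changeset(plan: dict, changes: list[WorkoutChange]) -> dict:
    """Apply each suggested modification to the plan for its target week."""
    for c in changes:
        plan[c.week][c.exercise][c.field] = c.new_value
    return plan

plan = {2: {"squat": {"weight_kg": 80.0, "reps": 8, "sets": 4}}}
changes = [WorkoutChange(2, "squat", "weight_kg", 80.0, 82.5,
                         "all sets completed; progress load")]
print(apply_changeset(plan, changes))
```

Keeping both the old and new values in each entry is what makes the changeset reviewable (and reversible) rather than a silent overwrite.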
Migz @calargy_bit
@JonYekarAI Hi Jon, nah, not at this moment. I've just been investigating building this pipeline. If this evolves, I would probably pivot to hardware somehow to aid with physique development
Jon Yekar @JonYekarAI
@calargy_bit hey, planning on adding computer vision stuff to it?
Migz @calargy_bit
I've been playing around with a concept for visuals for Datfit. I eventually want to get these animated to show exercise form. Still trying to draft them in Blender :)
[3 images]
Migz @calargy_bit
Doing some work today to make datfit.app more discoverable by agents. Just building the structure and backbone for agents and thinking a bit more about the public face of what I want to put out there :)
Migz @calargy_bit
Started using Hermes. Still unsure what the best way to enhance my workflow with this is, but we'll figure it out. These past few days I've been pretty keen to just have environments where I build stuff from the ground up. Started using pi.dev and now this.
Migz @calargy_bit
Recently added one feature that IMO makes the app function as it should and brings it closer to release. Basically, you can group exercises together to form more than a superset (which consists of only two exercises). Now focusing on bugs and closing the loop with the trainer
[2 images]
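For illustration only (a hypothetical model, not Datfit's actual implementation), treating a superset as the two-exercise special case of a general exercise group keeps one structure for everything from pairs to giant sets:

```python
from dataclasses import dataclass

# Hypothetical sketch: a group is an ordered list of exercises performed
# back to back. A superset is simply the two-exercise special case, so the
# same structure also covers giant sets of three or more.
@dataclass
class ExerciseGroup:
    exercises: list[str]

    @property
    def kind(self) -> str:
        n = len(self.exercises)
        if n <= 1:
            return "straight set"
        return "superset" if n == 2 else "giant set"

tri = ExerciseGroup(["dips", "overhead extension", "pushdown"])
print(tri.kind)  # -> "giant set"
```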
Migz @calargy_bit
@hari__prasadd All the OSS is under the Anthropics account, not Claude
Migz @calargy_bit
Becoming the shadiest company in the industry. Can't wait until my CC subscription is over so I can move to GPT
Om Patel @om_patel5

THIS GUY LOST $200 IN ONE DAY BECAUSE THE STRING "HERMES.md" WAS IN HIS GIT COMMITS

HERMES.md is a real convention used in AI agent projects. it's a system prompt specification file. not some obscure edge case

he's on claude max 20x at $200 a month. yesterday claude code hit him with "you're out of extra usage" out of nowhere

his dashboard showed 13% weekly usage. 0% current session. 86% of his plan was sitting there untouched but $200.98 in extra usage already burned through what should have been covered by his subscription

he tried logout & login, different models, fresh installs and nothing worked

anthropic support sent the ai bot (four rounds of the same scripted response). eventually they just gave up on him

so he started binary searching repos and commits manually on his own time until he found the trigger

the string "HERMES.md" in a recent git commit message. uppercase, with the .md extension, anywhere in your commit history. that's it

claude code includes recent commits in its system prompt and something server side flags HERMES.md and quietly routes you off your max plan onto API rate billing

> AGENTS.md? fine
> README.md? fine
> HERMES without .md? fine
> lowercase hermes.md? fine
> uppercase HERMES.md? you're getting charged API rates

he reported it. anthropic support acknowledged the bug three times, called it an "authentication routing issue", thanked him for finding it then refused to refund the $200

so the man pays $200 a month for max, lost another $200 to a billing bug they confirmed, did anthropic's QA work for free on his weekend, and got a "thank you for your patience" in return

check your commit history before claude code quietly drains your account too

Migz @calargy_bit
Can't wait to try this out! I might switch from CC. GPT 5.4 has been showing a lot better reasoning and output quality on a quite difficult task I'm working on right now, so I'm confident I'll get it done with 5.5. Not even Opus 4.7 was matching 5.4
OpenAI @OpenAI

Introducing GPT-5.5

A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.
