Simon Vans-Colina

6.5K posts

@simonvc

I make banks. CTO and Co-Founder @pave_bank previously @monzo

1 au · Joined April 2007
4.2K Following · 6.2K Followers
Simon Vans-Colina @simonvc
Someone should just freeze the weights into silicon. Unlike frontier models, once TTS is "good enough" it won't change (human language evolves over hundreds of years), so it's a really good candidate for freezing. A $13 ESP32 with TTS built in would be great.
Rohan Joshi @ron_joshi

Introducing Kitten TTS V0.8: open-source TTS that fits in 25MB. Three variants: 80M | 40M | 14M (<25MB) Highly expressive. Runs on CPU. Built for edge. No GPU? No problem. Ship voice anywhere. Check it out:

Alex Svanevik 🐧 @ASvanevik
turned one of my GMKtec G5s into a "CTO for my hobby projects" using Hermes. amazing experience: it handled a FULL migration of an old project I had to a VPS that's 90% cheaper
Simon Vans-Colina @simonvc
Sigh. I've had my 2 kids (ages 1 and 2) with me all day, but I left my openclaw on, and it has root on my big workstation. I decided to see if I could prompt it to build an ML pipeline to make music videos automatically. Spent all day giving it guidance every 30 minutes or so and getting sample videos back from it. At 8pm, with both kids finally asleep, I go to sit down at my machine and boom: one of my 3090s dies. Totally dead. The other one is fine, so I'm not totally screwed, but I must have worked it too hard for too long. This is all I have to show for it: a stupid 10s test clip.
Simon Vans-Colina @simonvc
@AlecMuffett I get my claw to install and try every new model. The last 2 weeks are the first time I can see myself using one of them for something real.
Alec Muffett @AlecMuffett
I have a 2018-era Mac mini with 64GB of RAM and a negligible GPU, running a 35-billion-parameter model, and I’m getting 4 tokens per second out of it; tell me again how we aren’t going to be running AIs locally at home?
Alex Svanevik 🐧 @ASvanevik
what’s a simple, high-quality, and cost-effective LLM setup for openclaw? currently testing Qwen 3.5 35B as the main model (run locally) + delegation to Claude Code w/ Opus 4.6 for coding. so far so good, but give me a better setup if you have one
Alex Svanevik 🐧 @ASvanevik
Had to visit a bank branch. I CANNOT WAIT TILL BANKS DIE 🥳🥳🥳
Simon Vans-Colina @simonvc
2x3090 maxxed out, 35tps on Hermes 4.5 35b q4. Open code. Niri. Omarchy. Cosy af.
Simon Vans-Colina @simonvc
This guy is worth a follow. Pushing the limits of what's possible on home hardware. No bullshit, I've replicated his results and they're legit.
Sudo su @sudoingX

look hermes 4.3 36B has something going on. gave it the same octopus invaders prompt that qwen 3.5 built in one shot. but on 2x RTX 3090 with 128K context instead of 32K.

on 1x it compacted 8 times in 24 minutes and gave up at 7 out of 10 files. intelligence was there but room wasn't.

on 2x it just started grinding. 15 minutes of uninterrupted autonomous coding. zero compactions. HTML structure, full CSS, game engine, collision detection, particle system, 4 layer parallax background, enemy spawning logic, audio system. 29,000 tokens in one continuous session. both GPUs splitting load at 50%. never once stopped to compress and forget.

the output quality feels different. the way it structures code, names functions, handles edge cases. dense architecture with 36 billion parameters all active on every token. you can feel the weight behind every line. qwen 3.5 built a clean game. hermes writes like it understands what it's building.

48GB of VRAM on two consumer GPUs in 2026. no h100. no cloud. two 3090s that cost less than one month of API bills for most startups.

results and full game playthrough coming tonight. i want to see the design personality. NousResearch might have cooked something special here.

hermes 4.3 36B. 2x RTX 3090. Q4_K_M. the full session is in the video.

Simon Vans-Colina @simonvc
@sudoingX I love your work. Every time you tweet I get my claw 🦞 to replicate it. Thx
Sudo su @sudoingX

i added a second RTX 3090 to hermes 4.3 36B and generation speed didn't change. 35.3 tok/s on 1x. 35.53 on 2x. zero overhead. every extra byte of VRAM went to context. not speed.

on a single 3090 this model fills 21.8GB at Q4_K_M. leaves room for 32K context, 22K usable. ran octopus invaders on it last night. 8 compactions. 970 lines across 7 files before it got stuck in a loop. the intelligence was there but the room wasn't.

added the second GPU. pushed context until it OOM'd. 162K at q8_0 KV is the ceiling. 192K dies. that's 5x more context, 7x more usable room, and the KV cache quality doubled from 4-bit to 8-bit.

this model has 512K native context. on 1x you're using 6% of what it was trained for. on 2x, 31%. it's not slow. it's starved.

full flags and specs in the chart below. same octopus invaders prompt goes in next. 7x more memory. zero compactions is the target.
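
The ratios quoted in the post above can be checked with a few lines of arithmetic. This is only a sketch: it treats "K" as 1,000 tokens, and it assumes (the post does not say this) that the gap between total and usable context on one GPU, 32K minus 22K, is a fixed overhead that stays the same on two GPUs. The variable names are made up for illustration.

```python
# Figures quoted in the post, with K taken as 1,000 tokens.
NATIVE_CONTEXT = 512_000   # native context the post attributes to the model
CONTEXT_1X = 32_000        # context that fits on one 3090
USABLE_1X = 22_000         # usable part of that context
CONTEXT_2X = 162_000       # ceiling reported on two 3090s


def fraction_of_native(context, native=NATIVE_CONTEXT):
    """Share of the native context window that a given context size covers."""
    return context / native


# Assumption: the unusable part of the context is a fixed overhead.
overhead = CONTEXT_1X - USABLE_1X
usable_2x = CONTEXT_2X - overhead

context_gain = CONTEXT_2X / CONTEXT_1X   # the "5x more context" figure
usable_gain = usable_2x / USABLE_1X      # the "7x more usable room" figure

print(f"1x covers {fraction_of_native(CONTEXT_1X):.1%} of native context")
print(f"2x covers {fraction_of_native(CONTEXT_2X):.1%} of native context")
print(f"context gain: {context_gain:.2f}x, usable gain: {usable_gain:.2f}x")
```

Under that fixed-overhead assumption the numbers land close to the rounded figures in the post; a different overhead model would shift the "usable" ratio.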

Simon Vans-Colina @simonvc
@tiredkebab @dhh @AsahiLinux awesome. I used your previous version to test running Linux on my M1 Air. Now running asahi-fedora-niri as my daily. Bring more people to this ecosystem though.
Simon Vans-Colina @simonvc
@scaiado yeah, I've got a bunch and tested it with two (even had my claw build a custom bin that used the screens). Next time I'm free I'm going to try setting up 3 around the room and training the pose model from a webcam.
Simon Vans-Colina @simonvc
Haha, just realised the baby monitor cam is on 2.4GHz, so when the baby cries the monitor starts broadcasting and it picks this up as "motion". I thought it was working too well.
Simon Vans-Colina @simonvc
I now have presence detection graphed and setting my lights. I think it works better with 2 or 3 ESP32-S3s. I went back to a single MacBook Wi-Fi card running Linux in RSSI-only mode and I guess it kind of works, but with a lot of randomness.
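
The post does not say how the RSSI-only mode decides that someone is present. One common approach, sketched here purely as an assumption, is to flag motion when the variance of recent RSSI readings rises above a threshold, since a moving body disturbs the signal path. The class name `RssiMotionDetector`, the window size, and the threshold are all hypothetical and are not taken from the tool being discussed.

```python
from collections import deque
from statistics import pvariance


class RssiMotionDetector:
    """Hypothetical sliding-window detector over RSSI readings in dBm."""

    def __init__(self, window=20, threshold=4.0):
        # Keep only the most recent `window` readings.
        self.samples = deque(maxlen=window)
        # Variance (dB^2) above which the window counts as "motion".
        self.threshold = threshold

    def update(self, rssi_dbm):
        """Add one reading; return True if the full window looks like motion."""
        self.samples.append(rssi_dbm)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data yet
        return pvariance(self.samples) > self.threshold


# Synthetic readings, invented for illustration only.
quiet = RssiMotionDetector()
quiet_flags = [quiet.update(-60.0) for _ in range(20)]

busy = RssiMotionDetector()
busy_flags = [busy.update(-60.0 if i % 2 == 0 else -50.0) for i in range(20)]
```

A detector like this reacts to any fluctuation in the signal, not only to people, which may be one reason a single receiver gives noisy results.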
Simon Vans-Colina @simonvc
This is getting a lot of traction. I installed it. Things you should know: yes, a single-node Linux machine can estimate if someone is moving in the room. To get dense pose estimation, you need 3 ESP32s fixed in place, a webcam, and a few hundred training examples. So the only one you're spying on is yourself. The single-node RSSI-based human detection is cool though. My claw can now tell if anyone is home. I might set up my 3 ESP32s around the baby's cot and train it with a remote webcam so I can get remote breathing and standing-up notifications.
Linus ✦ Ekenstam @LinusEkenstam

We just turned WiFi signals into a radar that can see through walls and estimate exact poses of people. Surveillance just got an order of magnitude easier to do. No need for cameras. GitHub repo close to 12k ⭐️ github.com/ruvnet/wifi-de… x.com/BoWang87/statu…

Simon Vans-Colina @simonvc
@boythornside I'm actually running Arch/Omarchy on my desktop and claw, and fedora-asahi-niri on my laptop (and I guess technically Arch on my Steam Deck)
Simon Vans-Colina @simonvc
OMG Niri is so good when you have Claude making it work exactly how you want it. Models output text, and understand the terminal better than anything, since they were trained on text. Using Linux, an operating system where everything is just text, just files, is so much better.