Simon Belak

12.6K posts

@sbelak

Philosopher-hacker. Sometimes theatre director. Making myself obsolete.

Joined February 2009

1.8K Following 1.6K Followers

Pinned Tweet

Simon Belak@sbelak·27 Sep

In preparation for @Lambda_World I talked with @ericnormand about clojure, functional programming, tools & thinking: purelyfunctional.tv/speaker-interv…

Simon Belak@sbelak·1d

In omnibus viis tuis cogita illum, et ipse diriget gressus tuos

Tim Hwang@timhwang

ICMI believes that Christian theology offers concrete technical methods for confronting the trickiest problems in AI safety. Today, we release a pair of papers that reproduce @PalisadeAI @apolloaievals work showing how religious framings influence corrigibility and scheming.

Simon Belak@sbelak·2d

@hjwakerley Subaru Forester or Volvo XC70 (or the normal one)

115

Helen@hjwakerley·2d

What’s the best £5–7k manual car for an old-school dad? Needs to be reliable, simple, and properly solid. He’s had a 2008 A6 Avant since 2012… and not keen on anything that will beep at him. Genuine request.

181

64K

Simon Belak retweeted

scriptjunkie (Matt)@scriptjunkie1·2d

"The FBI was able to forensically extract copies of incoming Signal messages from a defendant’s iPhone, even after the app was deleted, because copies of the content were saved in the device’s push notification database" Oops

1.4K

14.7K

851.8K

Simon Belak retweeted

Goob@goobgleeb·4 Apr

"Only bring the bare necessities to this expedition" "What about my dog's giant gramophone?":

Art Encyclopedia@artenpedia

During an expedition of the South Pole, a dog enjoys the gramophone, 1911.

7.7K

114.4K

1.3M

Simon Belak retweeted

Tenobrus@tenobrus·2 Apr

at google this was known as "buying the gnome". there's like a billion tweets about this already but basically the story goes back in like 2005 or something they were building out their shopping search system, and it was working pretty well. except for the fact that if you searched for sneakers, the top result was a garden gnome. engineers were going crazy trying to fix the ranking bug, but eventually someone noticed that the gnome listing was on ebay, and there was only one of them, and it cost like $50. so they just bought the gnome and suddenly the listing was gone, problem solved. why bother fixing software issues when you can just change the world to fit your software instead?

Tenobrus@tenobrus

if you're about to release a model that you know has the ability to reveal zerodays in every commonly used open source project you could delay release for a few years or spend another ten billion on alignment RL. or you could just secretly fix all the zerodays yourself first.

236

6.1K

359K

Simon Belak retweeted

ＤＡＮ_ＡＮＴＯＮＥＬＬＩ@__el__toro__·2 Apr

I'm an engine fan. And a 2-stroke fan. And a crossplane V8 fan. I've always wondered what a 2-stroke crossplane V8 engine might sound like. If it's anything like this model created in the Engine Simulator (which is pretty accurate), I am so in. (video footage from Noffie @ yt)

100

1.3K

97.2K

Simon Belak@sbelak·30 Mar

@FistedFoucault @Peter_Nimitz alwayshasbeen.jpg

Niccolo Soldo (Fisted By Foucault)@FistedFoucault·30 Mar

@Peter_Nimitz US Foreign Policy is currently in the hands of three NYC property developers

131

10.7K

Simon Belak retweeted

Oliver ೫@Prof_Kalkyl·28 Mar

Just learned about Ken Isaacs' "Superchair" (1967). Built-in book rest, shelves, lamp, drink tray, and a seat back that folds into a bed. A place for "inventive work and the individual search for peace of mind", as he put it. It was meant for people to build it themselves, hence the almost unfinished look. Blueprints were published in Popular Science in 1968.

162

1.5K

16.8K

820.7K

Simon Belak@sbelak·30 Mar

@memristor @bluewmist I'm sure they did. And there are a lot of high end companies that go above and beyond. Don't worry, it's still a nice bag (although quite crude, but that's part of the charm), you don't have to be so defensive.

240

zach@memristor·29 Mar

@sbelak @bluewmist their warranty covers defects/manufacturing issues not sure how the at-fault beyond wear/tear damage + international shipping cost is their problem sounds like they were standing by what they committed to

311

blue@bluewmist·29 Mar

What is a 'buy it for life' item that is offensively expensive, but the moment you use it, you realize your entire life before that point was a lie?

1.1K

145

16.3K

8.1M

Simon Belak@sbelak·29 Mar

@memristor @bluewmist When I bought the bag they were positioned as rugged outdoor equipment, where supporting repair is very much standard (especially if one lives on another continent). To add insult to injury they were heavily using their warranty in their marketing.

346

zach@memristor·29 Mar

@sbelak @bluewmist that's standard - i don't know any luxury brand that would send parts to the customer. repairs are always done in-house.

Simon Belak@sbelak·29 Mar

@memristor @bluewmist Have one. My puppy chewed the lid strap. Shipping to and from the US plus repair fee would be more than a new bag. They wouldn't sell me just a replacement strap as it has their logo on (to prevent counterfeiting), not willing to do one without the logo either. Never again.

2.2K

zach@memristor·29 Mar

@bluewmist Saddleback Leather Briefcase, this thing is going to outlast me by 2 generations

576

302.7K

Simon Belak@sbelak·26 Mar

@Petrolicious The link to subscribe on your site seems to be broken

Petrolicious@Petrolicious·24 Mar

The Petrolicious Post isn’t meant to sit untouched or sealed away as a collectible. It’s designed to be used, lived with, and worn in, something you can flip through, fold up, leave on a workbench, or pin to a garage wall. It belongs in the rhythm of everyday life, not behind glass. This is print without preciousness. A publication meant to be handled, shared, and even discarded, because another issue is always on the way. The value isn’t in preserving it, it’s in experiencing it. Find out why we’re doing it and why it matters: bit.ly/4rPBYrm Drive Tastefully®⁠ #Petrolicious #DriveTastefully #PetroliciousPost

606

Simon Belak retweeted

Marwa ElDiwiny@MarwaEldiwiny·19 Mar

Rosheim Joint.... a linkage-heavy wrist / spherical joint with a ±90° range of motion, developed in 1989. A brilliant invention by Mark Rosheim, often overlooked, Shoutout to David Kindlon at Apple for bringing this to my attention, must-know for anyone into robotics design!

483

4.3K

194.8K

Simon Belak retweeted

Boze Herrington, Library Owl 😴🧙‍♀️@SketchesbyBoze·16 Mar

Boze Herrington, Library Owl 😴🧙‍♀️ tweet media

aya@wint3rbunny0

what a boring world, without giants, without dragons, without monsters under bridges, without vampires in castles...

212

16K

142.9K

2.4M

Simon Belak retweeted

Roger Abbot@HeyRAbbot·14 Mar

'Kids' or anyone who likes cool old cars and going to the track. Here is another good way to start. Plus you would be working for some really cool guys on some really cool cars. Do it!

Brian Dolan - Waste Automation@BPD1776

Looking at hiring a second technician for the classic car/race shop...this is not our competition...but it is worth noting... You can graduate high school and do this work. Pull in $80k-$130k...pretty good money for being 19yo

908

Simon Belak retweeted

💻🐴Ngnghm@Ngnghm·11 Mar

@Pitometsu @rustaceans_rs The "programming language" paradigm demands no developer interaction after the program starts—of course it cannot deal with evolution, by definition. The "system" paradigm supports evolution after the program starts—of course it allows developer interaction after program start.

438

Simon Belak retweeted

Teng Yan@tengyanAI·10 Mar

The most important sentence in Karpathy's whole post is probably this: anything with a measurable score and fast feedback will become something agents can optimize for you. automatically with no humans involved.

Andrej Karpathy@karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.
This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (i forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.
This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…
All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

175

2.1K

152.3K

Simon Belak retweeted

Justin Skycak@justinskycak·10 Mar

When you "hit a wall" in something you are trying to learn, it's typically just a massive debt of unlearned prerequisites that are finally being called due.

1.3K

15.1K

246.3K

Simon Belak@sbelak·10 Mar

@micahsays Fuck yes. I'm super curious how that turns out.

micah@micahsays·9 Mar

exactly what i’m working on rn. more news soon

Simon Belak@sbelak

In a roundabout way we are going to rediscover lisp

Simon Belak@sbelak·9 Mar

In a roundabout way we are going to rediscover lisp

Sam Hogan 🇺🇸@samhogan

What if a codebase was actually stored in Postgres and agents directly modified files by reading/writing to the DB? Code velocity has increased 3-5x. This will undoubtedly continue. PR review has already become a bottleneck for high output teams. Codebase checked-out on filesystem seems like a terrible primitive when you have 10-100-1000 agents writing code. Code is now high velocity data and should be modeled as such. Bare minimum, we need write-level atomicity and better coordination across agents, better synchronization primitives for subscribing to codebase state changes and real-time file-level code lint/fmt/review. The current ~20 year old paradigm of git checkout/branch/push/pr/review/rebase ended Jan 2026. We need an entirely new foundational system for writing code if we’re really going to keep pace with scale laws.

303
