notpink

notpink retweeted

naklecha@naklecha·Jan 15

btw, i was able to build simple-llm (almost) completely autonomously by running claude in a loop nonstop for 3 days (~600 million tokens) + if you actually read the code, it's pretty well written & doesn't read like ai slop!!

introducing simple-llm: a ~950 line, powerful & extensible inference engine that performs on par with vllm. enjoy :)

performance (gpt-oss-120b, on an h100):
- batch=1: 135 tok/s (vllm: 138)
- batch=64: 4,041 tok/s (vllm: 3,846)

github.com/naklecha/simpl…
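to put "on par with vllm" in numbers, here is a minimal sketch that only does arithmetic on the four throughput figures quoted above. the dictionary layout and the function name are illustrative assumptions, not part of simple-llm.

```python
# Throughput figures quoted in the post, in tokens per second.
# The dict layout and names are hypothetical; only the numbers come from the post.
results = {
    "batch=1": {"simple_llm": 135, "vllm": 138},
    "batch=64": {"simple_llm": 4041, "vllm": 3846},
}


def relative_throughput(simple_llm: float, vllm: float) -> float:
    """Return simple-llm throughput as a multiple of vllm throughput."""
    return simple_llm / vllm


ratios = {name: relative_throughput(**r) for name, r in results.items()}

for name, ratio in ratios.items():
    # A ratio below 1.0 means slower than vllm, above 1.0 means faster.
    print(f"{name}: {ratio:.3f}x vllm ({(ratio - 1) * 100:+.1f}%)")
```

by these figures simple-llm is about 2% slower than vllm at batch=1 and about 5% faster at batch=64.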

notpink retweeted

naklecha@naklecha·Jan 8

introducing simple-llm: a ~950 line, powerful & extensible inference engine that performs on par with vllm. enjoy :)

performance (gpt-oss-120b, on an h100):
- batch=1: 135 tok/s (vllm: 138)
- batch=64: 4,041 tok/s (vllm: 3,846)

github.com/naklecha/simpl…

notpink retweeted

naklecha@naklecha·Jan 4

saturday night random idea: i'm vibe reimplementing vllm into a single & simple inference file + a folder with custom kernels.

rules:
- allowed libs: torch, numpy, flash-attn
- must match vllm's gpt-oss-120b inference speed
- can optimise for the model and hardware (h100)

notpink retweeted

naklecha@naklecha·Jan 1

as the year comes to an end, it’s becoming increasingly obvious that 2026 is the year of creativity. stay humble, stay creative, go crazy!!! happy new year :)

notpink retweeted

naklecha@naklecha·Dec 29

notpink retweeted

naklecha@naklecha·Dec 10

experiencing life through the prettiest interfaces

notpink retweeted

naklecha@naklecha·Dec 10

building cloudy’s fileviewer was one of the most fun & unexpectedly technically challenging projects i’ve built in a while. here is a small explainer:
- files on cloudy aren’t stored directly inside a bucket. instead they are stored as 1000s of tiny file chunks, with their own filesystem metadata caching system.
- why? there are many reasons for this, but the main one is to save network bandwidth. in practice, from my tests, doing this gets you a 10-20x performance gain over rclone mounts or s3fs.
- for example, if you have a 1TB file stored directly in a bucket and you make a small change, like appending to the file or modifying part of it, you need to reupload that file for every single update. that sucks. we don’t want that. so instead we split our files into 1000s of tiny chunks and only update or append the chunks that were modified.
- coming back to cloudy: when you see files on cloudy’s file viewer, you are actually seeing an api combine 100s of different byte chunks and form the file in real time.
- this was especially complex because every open source tool in the space requires you to have a dedicated gateway, which is really expensive and doesn’t scale (sorry @paulg, but i gotta do things that scale because i would like to scale to at least 100 people 😭)
- there is a reason why you haven’t really seen file viewers for network volumes on other cloud providers, but we made it happen <3

why not use an architecture like what huggingface has built:
- the underlying tech behind hf is actually very diff
- there are trade-offs, but using hf as a sandbox would be pretty expensive & slow when volume sizes start hitting multiple 100GB sizes
- with hf you will need to download the models on the gpu or use git based distributed file systems that are 100x slower than cloudy
- on cloudy, you can mount volumes within seconds (no downloads needed), and when your local ssd storage is less than your network storage, you get a 100x speed improvement using cloudy. it’s night and day.

lowkey had so much fun building this out, also gives me an excuse to spend more time on gpus in the name of marketing :’) hmu if you have questions, i’m happy to answer any!!
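the chunking idea in the explainer can be sketched roughly like this. it is a toy illustration under stated assumptions, not cloudy's implementation: the fixed chunk size, the sha256 comparison, and every function name here are hypothetical, and the post does not say how cloudy detects modified chunks or stores metadata.

```python
import hashlib

# Hypothetical chunk size, tiny on purpose so the example is easy to follow.
CHUNK_SIZE = 4  # bytes


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a byte string into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_digests(data: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Hash every chunk so two versions can be compared without the raw bytes."""
    return [hashlib.sha256(c).hexdigest() for c in split_chunks(data, size)]


def changed_chunks(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> list[int]:
    """Return indices of chunks in `new` that were modified or appended."""
    old_d = chunk_digests(old, size)
    new_d = chunk_digests(new, size)
    return [i for i, d in enumerate(new_d) if i >= len(old_d) or d != old_d[i]]


def assemble(chunks: list[bytes]) -> bytes:
    """Rebuild the file by concatenating its chunks in order."""
    return b"".join(chunks)


old = b"aaaabbbbcccc"
new = b"aaaaBBBBccccdd"  # one chunk modified, one short chunk appended

to_upload = changed_chunks(old, new)
print(to_upload)  # [1, 3]
```

only chunks 1 and 3 would need to go over the network here; chunks 0 and 2 are untouched. a file viewer would then call something like `assemble` on the stored chunks to present the file as a whole.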

Introducing "public volumes" on Cloudy - with this update, open-source projects can be shared as reproducible sandboxes that can be forked and mounted on GPU instances within seconds. Share, fork & mount 100TB+ sandboxes seamlessly with Cloudy. Here is a demo:

notpink retweeted

naklecha@naklecha·Dec 9

Introducing "public volumes" on Cloudy - with this update, open-source projects can be shared as reproducible sandboxes that can be forked and mounted on GPU instances within seconds. Share, fork & mount 100TB+ sandboxes seamlessly with Cloudy. Here is a demo:

notpink retweeted

naklecha@naklecha·Nov 29

this is your weekly reminder to stop listening to the world’s dumbass ideas on what will bring you happiness. the world is wrong, everyone is wrong. choose your own adventure!!

notpink retweeted

naklecha@naklecha·Nov 27

existing, experiencing & creating <3

notpink retweeted

naklecha@naklecha·Nov 19

for a while now, i’ve been experimenting with different daily schedules. my most productive days are: 6 hours of focused work -> 4 hours break (beach, gym, lunch, walk, think) -> 6 hours of focused work. the break in the middle makes 12 hour days, 7 days a week very sustainable.

notpink retweeted

naklecha@naklecha·Nov 11

i'm working on a product at cloudy that will save teams (that spend $10k-1mil / month on gpus) a significant amount of time and money!! the current gpu market is so broken and backwards, it's actually sad. i think cloudy can help fix that :')

notpink retweeted

naklecha@naklecha·Nov 7

“we can lift ourselves out of ignorance, we can find ourselves as creatures of excellence and intelligence and skill. we can be free! we can learn to fly!”

deep down, you know exactly what needs to be done this hour, this day, this year, this decade, this life. do it!

notpink retweeted

naklecha@naklecha·Nov 5

the tech @ cloudy is improving rapidly. for example: unlike other gpu clouds, persistent storage volumes on cloudy are region agnostic. your volumes are extremely cached & fast. also, each storage volume can store up to 1 petabyte of data, priced at $0.0002/gb/hr used!!
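for a sense of what $0.0002/gb/hr works out to, here is a small worked calculation. the 1,000 gb volume and the 30-day month are assumptions picked for illustration, and the function name is hypothetical; only the hourly rate comes from the post.

```python
# Rate quoted in the post, in USD per GB per hour of storage used.
PRICE_PER_GB_HOUR = 0.0002


def storage_cost(gb_used: float, hours: float,
                 price: float = PRICE_PER_GB_HOUR) -> float:
    """Cost in USD of keeping `gb_used` gigabytes stored for `hours` hours."""
    return gb_used * hours * price


# Assumed example: 1,000 GB kept for a 30-day month (720 hours).
hours_in_month = 30 * 24
monthly = storage_cost(1000, hours_in_month)
print(f"${monthly:.2f} per month")  # $144.00 per month
```

that is roughly $0.144 per gb per month at the quoted rate, under the same 30-day assumption.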

exactly 3 months ago, i quit my job to build cloudy. reflecting so far, it's the best decision i ever made & i'm confident that, one day (soon), cloudy will be the best gpu infrastructure product in the world!! ty for all the love & support so far <3

notpink retweeted

naklecha@naklecha·Nov 2

deep down, you know exactly what needs to be done this hour, this day, this year, this decade, this life. do it!

notpink retweeted

naklecha@naklecha·Oct 31

exactly 3 months ago, i quit my job to build cloudy. reflecting so far, it's the best decision i ever made & i'm confident that, one day (soon), cloudy will be the best gpu infrastructure product in the world!! ty for all the love & support so far <3

notpink retweeted

naklecha@naklecha·Oct 27

truth seeking
