notpink

1.1K posts


@notpinkart

arts & crafts

amsterdam · Joined November 2022
1 Following · 748 Followers
notpink retweeted
naklecha
naklecha@naklecha·
i updated my personal website: naklecha.com. tldr; if you work with gpus, i want to chat with you :)
[tweet media]
9 replies · 2 reposts · 37 likes · 3.7K views
notpink retweeted
naklecha
naklecha@naklecha·
for people who run experiments on gpus: i built a cli tool that gives coding agents access to cloudy's infra (~2x cheaper than other sandbox / serverless clouds). with this cli, you can ask claude to: "finetune kimi k2 on 64 h100s & to save money test your finetune on 8 h100s"
5 replies · 8 reposts · 67 likes · 7.8K views
notpink retweeted
naklecha
naklecha@naklecha·
life after discovering & fixing every single infiniband error known to man!!
[tweet media ×2]
1 reply · 1 repost · 31 likes · 1.9K views
notpink retweeted
naklecha
naklecha@naklecha·
btw, i was able to build simple-llm (almost) completely autonomously by running claude in a loop nonstop for 3 days (~600 million tokens) + if you actually read the code, it's pretty well written & doesn't read like ai slop!!
naklecha@naklecha

introducing simple-llm: a ~950 line, powerful & extensible inference engine that performs on par with vllm. enjoy :) performance (gpt-oss-120b, on an h100): - batch=1: 135 tok/s (vllm: 138) - batch=64: 4,041 tok/s (vllm: 3,846) github.com/naklecha/simpl…

7 replies · 1 repost · 121 likes · 12K views
notpink retweeted
naklecha
naklecha@naklecha·
introducing simple-llm: a ~950 line, powerful & extensible inference engine that performs on par with vllm. enjoy :) performance (gpt-oss-120b, on an h100): - batch=1: 135 tok/s (vllm: 138) - batch=64: 4,041 tok/s (vllm: 3,846) github.com/naklecha/simpl…
25 replies · 78 reposts · 767 likes · 67.8K views
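The throughput figures quoted above can be sanity-checked with quick arithmetic. A minimal sketch (the tok/s numbers come from the tweet; the variable names and labels are mine):

```python
# Throughput figures quoted in the tweet (gpt-oss-120b on an H100).
simple_llm = {"batch=1": 135, "batch=64": 4041}   # tokens/sec
vllm       = {"batch=1": 138, "batch=64": 3846}   # tokens/sec

# simple-llm is ~2% slower at batch=1 and ~5% faster at batch=64.
for setting in simple_llm:
    ratio = simple_llm[setting] / vllm[setting]
    print(f"{setting}: simple-llm at {ratio:.0%} of vllm's throughput")
```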
notpink retweeted
naklecha
naklecha@naklecha·
saturday night random idea: i'm vibe reimplementing vllm into a single & simple inference file + a folder with custom kernels. rules: - allowed libs: torch, numpy, flash-attn - must match vllm's gpt-oss-120b inference speed - can optimise for the model and hardware (h100)
9 replies · 5 reposts · 195 likes · 11.8K views
notpink retweeted
naklecha
naklecha@naklecha·
as the year comes to an end, it’s becoming increasingly obvious that 2026 is the year of creativity. stay humble, stay creative, go crazy!!! happy new year :)
3 replies · 3 reposts · 42 likes · 2.2K views
notpink retweeted
naklecha
naklecha@naklecha·
[tweet media ×4]
1 reply · 1 repost · 43 likes · 1.9K views
notpink retweeted
naklecha
naklecha@naklecha·
experiencing life through the prettiest interfaces
[tweet media ×4]
0 replies · 1 repost · 35 likes · 1.7K views
notpink retweeted
naklecha
naklecha@naklecha·
building cloudy’s fileviewer was one of the most fun & unexpectedly technically challenging projects i’ve built in a while, here is a small explainer:
- files on cloudy aren’t stored directly inside a bucket; instead they’re stored as 1000s of tiny file chunks, with their own filesystem metadata caching system.
- why? there are many reasons for this, but the main one is to save network bandwidth. in practice, from my tests, doing this gets you a 10-20x performance gain over rclone mounts or s3fs.
- for example, if you have a 1TB file and you make a small change like appending to the file or modifying part of it, you would need to reupload that file for every single update. that sucks. we don’t want that. so instead we split our files into 1000s of tiny chunks and only update or append the chunks that were modified.
- coming back to cloudy: when you see files in cloudy’s file viewer, you are actually seeing an api combine 100s of different byte chunks and form the file in real time.
- this was especially complex, because every single open source tool in the space requires you to have a dedicated gateway, which is really expensive and doesn’t scale (sorry @paulg, but i gotta do things that scale because i would like to scale to at least 100 people 😭)
- there is a reason why you haven’t really seen file viewers for network volumes on other cloud providers, but we made it happen <3
why not use an architecture like what huggingface has built:
- the underlying tech behind hf is actually very different
- there are trade-offs, but using hf as a sandbox would be pretty expensive & slow when volume sizes start hitting multiple 100GB sizes
- with hf you would need to download the models onto the gpu or use git-based distributed file systems that are 100x slower than cloudy
- on cloudy, you can mount volumes within seconds (no downloads needed), and when your local ssd storage is smaller than your network storage, you get a 100x speed improvement using cloudy. it’s night and day.
lowkey had so much fun building this out, also gives me an excuse to spend more time on gpus in the name of marketing :’) hmu if you have questions, i’m happy to answer any!!
[tweet media ×2]
naklecha@naklecha

Introducing "public volumes" on Cloudy - with this update, open-source projects can be shared as reproducible sandboxes that can be forked and mounted on GPU instances within seconds. Share, fork & mount 100TB+ sandboxes seamlessly with Cloudy. Here is a demo:

1 reply · 2 reposts · 15 likes · 2K views
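The chunked-storage idea in the thread above can be sketched in a few lines of Python. This is a toy illustration of the principle only: the chunk size, the hashing scheme, and every name here are my assumptions, not cloudy's actual implementation.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # toy chunk size: 4 MiB (an assumption)

def chunk_hashes(data: bytes) -> list[str]:
    """Split a blob into fixed-size chunks and hash each one."""
    return [
        hashlib.sha256(data[i : i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

def dirty_chunks(old: bytes, new: bytes) -> list[int]:
    """Indices of chunks that changed: only these need re-uploading."""
    old_h, new_h = chunk_hashes(old), chunk_hashes(new)
    return [
        i for i, h in enumerate(new_h)
        if i >= len(old_h) or h != old_h[i]
    ]

# Appending to a large file only dirties the new final chunk,
# instead of forcing a re-upload of the whole file.
blob = bytes(10 * CHUNK_SIZE)        # a 40 MiB file of zeros
appended = blob + b"new data"        # a small append
print(dirty_chunks(blob, appended))  # -> [10]  (one new chunk)
```

The same mechanism covers in-place edits: modifying a byte in the middle of the file dirties only the chunk containing it, which is where the claimed bandwidth savings over whole-file re-uploads come from.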
notpink retweeted
naklecha
naklecha@naklecha·
Introducing "public volumes" on Cloudy - with this update, open-source projects can be shared as reproducible sandboxes that can be forked and mounted on GPU instances within seconds. Share, fork & mount 100TB+ sandboxes seamlessly with Cloudy. Here is a demo:
2 replies · 4 reposts · 29 likes · 6.1K views
notpink retweeted
naklecha
naklecha@naklecha·
this is your weekly reminder to stop listening to the world’s dumbass ideas on what will bring you happiness. the world is wrong, everyone is wrong. choose your own adventure!!
2 replies · 2 reposts · 31 likes · 1.8K views
notpink retweeted
naklecha
naklecha@naklecha·
existing, experiencing & creating <3
0 replies · 1 repost · 18 likes · 1.3K views
notpink retweeted
naklecha
naklecha@naklecha·
for a while now, i’ve been experimenting with different daily schedules. my most productive days are: 6 hours of focused work -> 4 hours break (beach, gym, lunch, walk, think) -> 6 hours of focused work. the break in the middle makes 12 hour days, 7 days a week very sustainable.
1 reply · 2 reposts · 36 likes · 2K views
notpink retweeted
naklecha
naklecha@naklecha·
i'm working on a product at cloudy that will save teams (that spend $10k-1mil / month on gpus) a significant amount of time and money!! the current gpu market is so broken and backwards, it's actually sad. i think cloudy can help fix that :')
1 reply · 1 repost · 25 likes · 1.6K views
notpink retweeted
naklecha
naklecha@naklecha·
the tech @ cloudy is improving rapidly. for example: unlike other gpu clouds, persistent storage volumes on cloudy are region agnostic. your volumes are heavily cached & fast. also, each storage volume can store up to 1 petabyte of data, priced at $0.0002/gb/hr used!!
naklecha@naklecha

exactly 3 months ago, i quit my job to build cloudy. reflecting so far, it's the best decision i ever made & i'm confident that, one day (soon), cloudy will be the best gpu infrastructure product in the world!! ty for all the love & support so far <3

1 reply · 1 repost · 17 likes · 2.4K views
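The quoted storage price can be turned into concrete monthly figures with back-of-the-envelope arithmetic; a quick sketch (assuming a 30-day billing month and that "gb" means GB as billed - both assumptions of mine):

```python
price_per_gb_hr = 0.0002     # $/GB/hr, as quoted in the tweet
hours_per_month = 24 * 30    # assuming a 30-day billing month

def monthly_cost(gb: float) -> float:
    """Monthly storage cost in dollars for `gb` gigabytes."""
    return gb * price_per_gb_hr * hours_per_month

print(f"1 TB: ${monthly_cost(1_000):,.2f}/month")      # -> $144.00/month
print(f"1 PB: ${monthly_cost(1_000_000):,.2f}/month")  # -> $144,000.00/month
```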
notpink retweeted
naklecha
naklecha@naklecha·
deep down, you know exactly what needs to be done this hour, this day, this year, this decade, this life. do it!
[tweet media ×2]
1 reply · 1 repost · 60 likes · 5.3K views
notpink retweeted
naklecha
naklecha@naklecha·
exactly 3 months ago, i quit my job to build cloudy. reflecting so far, it's the best decision i ever made & i'm confident that, one day (soon), cloudy will be the best gpu infrastructure product in the world!! ty for all the love & support so far <3
2 replies · 2 reposts · 69 likes · 5.4K views