Pratyay Banerjee (নীল) 

8K posts

Pratyay Banerjee (নীল)  banner
Pratyay Banerjee (নীল) 

Pratyay Banerjee (নীল) 

@Neilblaze007

I live in the shadows, but I watch everything.

127.0.0.1 Joined October 2018
7.5K Following · 306 Followers
Arjun
Arjun@arjunkocher·
Exclusive Self Attention (XSA) paper breakdown: k-a.in/XSA.html
Arjun tweet media
2
8
116
3.7K
abdel
abdel@AbdelStark·
Prediction: ZK will play an important role in AI safety, one of the most important challenges for humanity. End-to-end verifiable agents, trustless AI computation & STARK everywhere.
abdel tweet media
abdel@AbdelStark

Can LLMs be PROVABLE computers? Percepta showed that a transformer can BE a computer. Compiled weights, deterministic execution, 30k tokens/sec. But nobody asked the obvious follow-up: how do you know it computed correctly? So I built the verification layer. A STARK that proves it 👇

11
6
36
3.1K
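A full STARK construction is far beyond a sketch, but the question the quoted thread poses ("how do you know it computed correctly?") has a naive baseline worth seeing: publish a commitment binding weights, input, and claimed output, and verify by re-executing the deterministic computation. The function names below are illustrative assumptions, not from the thread; a STARK's whole point is to replace the expensive re-execution step with a succinct proof check.

```python
import hashlib

def commitment(weights: bytes, prompt: str, output: str) -> str:
    """Bind weights, input, and claimed output into one hash the prover publishes."""
    h = hashlib.sha256()
    for part in (weights, prompt.encode(), output.encode()):
        h.update(len(part).to_bytes(8, "big"))  # length-prefix so parts can't be confused
        h.update(part)
    return h.hexdigest()

def verify_by_reexecution(run, weights: bytes, prompt: str, claimed: str) -> bool:
    """Naive verifier: redo the whole deterministic computation and compare.
    Cost is the full model run; a STARK would make verification succinct."""
    return run(weights, prompt) == claimed
```

Deterministic execution (fixed weights, no sampling) is what makes even this naive check well-defined; with temperature > 0 there is no single correct output to verify against.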
Simon Willison
Simon Willison@simonw·
Turns out you can run enormous Mixture-of-Experts models on Mac hardware without fitting the whole model in RAM, by streaming a subset of expert weights from SSD for each generated token - and people keep finding ways to run bigger models. Kimi 2.5 is 1T params, but only 32B are active, so it fits in 96GB.
seikixtc@seikixtc

I got a 1T-parameter model running locally on my MacBook Pro.
LLM: Kimi K2.5, 1,026,408,232,448 params (~1.026T)
Hardware: M2 Max MacBook Pro (2023) w/ 96GB unified memory
Running on MLX with a flash-style SSD streaming path + local patching.
This is an experimental setup and I haven't optimized speed yet, but it's stable enough that I've started testing it in an autoresearch-style loop. #LocalAI #MLX #MoE

70
176
2.4K
177.4K
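The arithmetic behind "1T total but only 32B active fits in 96GB" is easy to check. A sketch, using the parameter counts from the quoted tweet; the bytes-per-param figure is an assumption about quantization (roughly 4-bit), not something the thread states:

```python
def params_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Resident memory for a set of weights, ignoring KV cache and activations."""
    return n_params * bytes_per_param / 1e9

TOTAL = 1_026_408_232_448   # ~1.026T total params (from the quoted tweet)
ACTIVE = 32e9               # ~32B params active per token

# Whole model resident at ~4-bit quantization (0.5 bytes/param): way over 96 GB.
full_gb = params_memory_gb(TOTAL, 0.5)     # roughly 513 GB
# Only the active experts resident, streaming the rest from SSD per token:
active_gb = params_memory_gb(ACTIVE, 0.5)  # 16 GB, comfortably inside 96 GB
```

The gap between the two numbers is exactly what SSD streaming of inactive expert weights buys you; the cost is per-token SSD read latency instead of RAM bandwidth.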
Pratyay Banerjee (নীল)  retweetledi
Daniel Hnyk
Daniel Hnyk@hnykda·
LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and to self-replicate. Link below.
87
712
2.9K
454.5K
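The attack vector named above works because Python's `site` module executes, at every interpreter startup, any line in a `site-packages/*.pth` file that begins with `import`. A defensive sketch that flags such files (a heuristic scan written for this note, not a tool from the thread):

```python
import pathlib
import site

def find_executable_pth(dirs=None):
    """Return (pth_path, offending_line) pairs for .pth files containing
    import lines, which Python executes on interpreter startup."""
    dirs = dirs if dirs is not None else site.getsitepackages()
    hits = []
    for d in dirs:
        for pth in sorted(pathlib.Path(d).glob("*.pth")):
            for line in pth.read_text(errors="ignore").splitlines():
                if line.startswith("import "):  # executable .pth line
                    hits.append((str(pth), line))
                    break
    return hits

if __name__ == "__main__":
    for path, line in find_executable_pth():
        print(f"{path}: {line[:80]}")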
меoш🪖🇮🇱
меoш🪖🇮🇱@meowbooksj·
I just left @ssi It was not an easy decision. The past 12 parsecs were an absolute blast - I've been in many strange attractors in my life and can say this was by far one of the most intense non linear dynamical systems. I love time stepping. Especially being in the gooning booths with my moots, working on robots that will actually advance fertility. But the current environment wasn't serving my girth . And that's a really hard thing to admit - I've always looked up to Ilya, and I genuinely believe SSI will win. I still do. One thing I'll say: don't goon somewhere just because of the name. If you're not wage maxxing, and you know you can't mog 100x where you are - it's the right call to bail. What's next? Get some foreskin back. Then find the next hole worth shitting in. I'll always be meeting deprived anons - that was never because of a twitter bio. I just love finding smart people and helping however I can. Many more side quests to come!!!
18
3
292
38.6K
Pratyay Banerjee (নীল)  retweetledi
You Jiacheng
You Jiacheng@YouJiacheng·
DeepSeek: we want to offload communication to a dedicated hardware unit HUAWEI: here you are
You Jiacheng tweet media (×2)
4
32
323
43.4K
Gauri Tripathi
Gauri Tripathi@Gauri_the_great·
I know 4 stars are nothing but I don't even remember making this repo public🤔🤔🤔🤔
Gauri Tripathi tweet media
1
0
10
317
Pratyay Banerjee (নীল)  retweetledi
Paras Chopra
Paras Chopra@paraschopra·
My dear young person, Don’t succumb to mediocrity. There’s enough of it going around. Aspire for craftsmanship, as that is what leads to joy and beauty. The world needs more people who’re proud of what they make, and less of those who couldn’t care less.
62
470
4.2K
90.5K
Anshul Bhide
Anshul Bhide@anshulbhide·
“tu karke dikha” (Hindi: “you do it and show me”)
Anshul Bhide tweet media
22
20
461
39.7K
Pratyay Banerjee (নীল)  retweetledi
Neel Nanda
Neel Nanda@NeelNanda5·
It is extremely important that models remain monitorable. But you can't control what you can't measure. And what does this actually mean? Opaque serial depth is one possible answer, and in this GDM AGI Safety paper Jonah analyses how to measure it in LLMs
Rohin Shah@rohinmshah

"Just read the chain of thought" is one of our best safety techniques. Why does it work? Because models can only think opaquely for a short time; long thinking must be transparent. Can we quantify this? Yes! In our new paper, we show how to measure "time" for arbitrary networks.

3
11
98
8.9K
Pratyay Banerjee (নীল)  retweetledi
♡
@heart_jpg·
colours of the sky
♡ tweet media
24
1.3K
10.6K
112.7K