Nicolas Pinto

3.2K posts

@npinto

Stealth (Safe/Decentralized AI). Prev: Private AI @ Perceptio (acq by Apple), Scientist/Lecturer @ MIT+Harvard. Music Producer. Engineer. Angel Investor.

San Francisco, CA · Joined October 2008

1.2K Following · 3K Followers

Nicolas Pinto reposted

Captain Insight@CaptainInsightX·22h

Andrej Karpathy just joined Anthropic. His new boss is the man who realised AI could train itself. You've probably never heard of him. Meet Nick Joseph 🇺🇸
> Harvard grad. No PhD. No fame.
> First job: ranking charities at a nonprofit called GiveWell.
> That's where he first heard the words "AI safety."
> He laughed it off. Models weren't even dangerous yet.
> Joined Vicarious ~ a startup trying to build AGI through robots.
> Then OpenAI. Quietly. On the safety team.
> Worked on something nobody was paying attention to: teaching GPT-3 to write code.
> Then he watched it work.
> A model. Writing the same code that trained it.
> That was the moment. The future cracked open in front of him.
> December 2020: he walked out of OpenAI with 10 others.
> Built Anthropic from zero with Dario and Daniela Amodei. 🚀
> Today he runs the team that trains every version of Claude.
> 40+ engineers. 27,000+ academic citations.
> Two podcasts ever: one on AI safety (80,000 Hours, 2024), one on scaling laws (YC, 2025). Zero about himself.
May 19, 2026: Andrej Karpathy joins Anthropic. He reports to Nick.
The loudest minds get the headlines. The quiet ones run the labs. 🐐

1.2K

120.1K

Nicolas Pinto reposted

Base Camp Bernie@basecampbernie·3 Apr

Gemma 4 26B MoE (4B active) on a single RTX 4090:
- 162 t/s decode
- 8,400 t/s prefill
- Full 262K native context, 19.5 GB VRAM
- Only 10 Elo below the 31B dense
Q8_0 on dual 4090+3090: 9,024 t/s prefill at 10K. 2,537 t/s at full 262K: that's a novel in about 100 seconds.
Q4_K_M + q8_0 K / turbo3 V using @no_stp_on_snek's TurboQuant fork (github.com/TheTom/turboqu…). KV quant saves 1.8 GB, costs nothing. 3.7x faster decode than the dense.
Single 4090 (262K):
llama-server -m gemma-4-26B-A4B-it-UD-Q4_K_M.gguf -c 262144 -np 1 -ctk q8_0 -ctv turbo3 -fa on --fit off --cache-ram 0 -dev CUDA0
Dual GPU (Q8_0, 262K):
llama-server -m gemma-4-26B-A4B-it-Q8_0.gguf -c 262144 -np 1 -fa on --fit off --cache-ram 0
llama.cpp b8635 + turboquant fork
#Gemma4 #LocalLLM #llama_cpp #TurboQuant #RTX4090 #MoE #AI #OpenSource #GGUF #LocalAI
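
The "novel in about 100 seconds" figure in the post above can be sanity-checked by dividing the context length by the quoted prefill rate. A minimal sketch in Python, using only numbers from the post and assuming the prefill rate stays constant over the whole prompt; the function and variable names are illustrative.

```python
# Numbers quoted in the post above; names are illustrative.
CONTEXT_TOKENS = 262_144        # the value passed to -c
PREFILL_TOKENS_PER_SEC = 2_537  # dual-GPU prefill rate quoted at full context


def prefill_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Time to ingest `tokens` prompt tokens at a constant prefill rate."""
    return tokens / tokens_per_sec


full_context_seconds = prefill_seconds(CONTEXT_TOKENS, PREFILL_TOKENS_PER_SEC)
print(f"{full_context_seconds:.0f} s")  # prints "103 s"
```

That lands within a few percent of the rounded figure in the post.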

438

74K

Nicolas Pinto reposted

Andrej Karpathy@karpathy·16 Mar

@Yulun_Du @ilyasut SGD is a ResNet too (the blocks of it are fwd+bwd), the residual stream is the weights so... 🤔 We're not taking the Attention is All You Need part literally enough? :D
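
One way to read the analogy above: an SGD step has the same shape as a residual block, `w = w + f(w, batch)`, where the "block" `f` runs a forward and a backward pass and the weights play the role of the residual stream. A toy sketch in Python on a one-parameter least-squares problem; the loss, the function names, and the learning rate are assumptions chosen for illustration.

```python
def grad(w: float, x: float, y: float) -> float:
    """Gradient of the loss 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def sgd_block(w: float, batch: tuple, lr: float) -> float:
    """The 'residual branch': forward + backward pass, scaled by -lr."""
    x, y = batch
    return -lr * grad(w, x, y)


def run_sgd(w0: float, batches: list, lr: float = 0.1) -> float:
    w = w0
    for batch in batches:
        # Same shape as a ResNet layer: stream + block(stream).
        w = w + sgd_block(w, batch, lr)
    return w


batches = [(1.0, 2.0)] * 100  # every batch says "w * 1.0 should equal 2.0"
w_final = run_sgd(0.0, batches)
print(w_final)  # close to 2.0
```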

589

103.4K

Nicolas Pinto@npinto·27 Mar

Anthropic "accidentally leaked" their next model and it's called Claude Mythos (Mytho is short for mythomane, aka pathological liar). They have the most powerful cyber security model and can't get their CMS config right? Yeahhhh right.

249

Nicolas Pinto@npinto·26 Mar

AI is making AI (auto)researchers delusional

Mo@atmoio

AI is making CEOs delusional

252

Nicolas Pinto reposted

Mo@atmoio·16 Mar

AI is making CEOs delusional

2.7K

19.6K

2.9M

Nicolas Pinto reposted

Charles Guillemet@P3b7_·24 Mar

Security is an economic game: make attacks too expensive to attempt. AI is breaking that equation. Exploits that took months and seven-figure budgets now take hours with an AI subscription. The old playbook won't cut it. The asymmetry that kept us secure is gone...

Charles Guillemet@P3b7_

x.com/i/article/2036…

130

185

28.5K

Nicolas Pinto@npinto·24 Mar

AI (software) eating (AI) software eating the world.

Daniel Hnyk@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.
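
For context on why a `.pth` file matters: at interpreter startup, Python's `site` module executes any line in a site-packages `.pth` file that begins with `import`. The sketch below, in Python, lists such lines so they can be reviewed by hand. It is a generic audit aid under that assumption, not a detector for this specific incident, and the function name is hypothetical.

```python
import site
from pathlib import Path


def find_executable_pth_lines(directory):
    """Return {filename: [lines]} for .pth lines that `site` would execute."""
    hits = {}
    root = Path(directory)
    if not root.is_dir():
        return hits
    for pth in sorted(root.glob("*.pth")):
        lines = pth.read_text(errors="replace").splitlines()
        # `site` executes lines starting with "import" plus a space or tab;
        # every other non-comment line is treated as a path to add.
        executable = [ln for ln in lines if ln.startswith(("import ", "import\t"))]
        if executable:
            hits[pth.name] = executable
    return hits


if __name__ == "__main__":
    candidates = site.getsitepackages() + [site.getusersitepackages()]
    for directory in candidates:
        for name, lines in find_executable_pth_lines(directory).items():
            print(f"{directory}/{name}:")
            for ln in lines:
                print(f"    {ln[:120]}")  # truncate long (e.g. base64) payloads
```

Some legitimate packages also ship executable `.pth` lines, so a hit is a prompt for manual review rather than proof of compromise.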

277

Nicolas Pinto reposted

Daniel Hnyk@hnykda·24 Mar

Oh, it just got worse. The [public github issue](github.com/BerriAI/litell…) has been closed as "not planned" by the owner, so they likely have been fully compromised.

975

246.6K

Nicolas Pinto@npinto·24 Mar

Oops. Who did it again?!

Microsoft Threat Intelligence@MsftSecIntel

Microsoft Threat Intelligence has observed threat actors actively experimenting with techniques to bypass or “jailbreak” AI safety controls. By reframing malicious requests, chaining instructions across multiple interactions, and misusing system‑ or developer‑style prompts, threat actors can coerce models into generating restricted content that bypasses built‑in safeguards. These techniques demonstrate how generative AI models are probed, shaped, and redirected to support reconnaissance, malware development, and social engineering while minimizing friction from moderation. AI guardrails have become dynamic surfaces that attackers test and manipulate to sustain operational advantage. As AI becomes more deeply embedded in enterprise workflows, understanding how attackers test and manipulate these guardrails is critical for defenders. Learn more about securing generative AI models on Azure AI Foundry: msft.it/6013Qs5oX

213

Nicolas Pinto reposted

Ali Behrouz@behrouz_ali·16 Mar

This paper is the same as the DeepCrossAttention (DCA) method from more than a year ago: arxiv.org/abs/2502.06785. As far as I understood, here there is no innovation to be excited about, and yet surprisingly there is no citation and discussion about DCA! The level of redundancy in LLM research and then the hype on X is getting worse and worse! DeepCrossAttention is built based on the intuition that depth-wise cross-attention allows for richer interactions between layers at different depths. DCA further provides both empirical and theoretical results to support this approach.

Kimi.ai@Kimi_Moonshot

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…
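
To make the contrast in the announcement concrete, here is a toy sketch in Python of "fixed, uniform accumulation" versus "input-dependent attention over preceding layers". It is not the method from the report: the scaled dot-product scoring, the function names, and the hand-picked query are all assumptions for illustration, and a real model would produce the query with learned projections.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def uniform_residual(layer_outputs):
    """Standard residual stream: every earlier output is added with weight 1."""
    dim = len(layer_outputs[0])
    return [sum(out[j] for out in layer_outputs) for j in range(dim)]


def attention_residual(query, layer_outputs):
    """Mix earlier outputs with softmax weights that depend on the query."""
    dim = len(query)
    scores = [dot(query, out) / math.sqrt(dim) for out in layer_outputs]
    weights = softmax(scores)
    mixed = [
        sum(w * out[j] for w, out in zip(weights, layer_outputs))
        for j in range(dim)
    ]
    return mixed, weights


# Three earlier "layer outputs" and a query aligned with the second one.
outputs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
mixed, weights = attention_residual([0.0, 4.0], outputs)
print(uniform_residual(outputs))  # every layer counted equally
print(weights)                    # the second layer receives the largest weight
```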

227.8K

Nicolas Pinto reposted

Ali Behrouz@behrouz_ali·16 Mar

Seriously, we all should spend more time on literature review than the work itself. Contributions are getting redundant or super marginal with no clear message and a lot of missing or even wrong citations! What would be the point of science if we keep doing that?

159

13.4K

Nicolas Pinto reposted

Phillip Isola@phillip_isola·12 Mar

Very neat work on the compute-only path to intelligence:

Seungwook Han@seungwookh

Can language models learn useful priors without ever seeing language? We pre-pre-train transformers on neural cellular automata — fully synthetic, zero language. This improves language modeling by up to 6%, speeds up convergence by 40%, and strengthens downstream reasoning. Surprisingly, it even beats pre-pre-training on natural text! Blog: hanseungwook.github.io/blog/nca-pre-p… (1/n)

249

38.1K

Nicolas Pinto reposted

Bryan Catanzaro@ctnzr·11 Mar

Announcing NVIDIA Nemotron 3 Super!
💚 120B-12A Hybrid SSM Latent MoE, designed for Blackwell
💚 36 on AAIndex v4
💚 up to 2.2X faster than GPT-OSS-120B in FP4
💚 Open data, open recipe, open weights
Models, Tech report, etc. here: research.nvidia.com/labs/nemotron/…
And yes, Ultra is coming!

205

1.2K

207.3K

Nicolas Pinto reposted

Lightning Labs⚡️🌐@lightning·11 Mar

Hey agents 👋 Looking for a payments protocol? Here's your checklist:
🔑 Pay for any API with no signup or identity
🧮 Proof of payment baked into the credential
🎟️ Delegate scoped credentials to sub-agents without issuer involvement
🔒 Private by default, no records on-chain
🌐 No single entity that can go offline, powered by bitcoin
That's L402, built on Lightning. Made for machines, streamlined for vibe coders. lightning.engineering/posts/2026-03-…
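
The "proof of payment baked into the credential" item rests on a basic Lightning property: an invoice commits to a payment hash, and paying it reveals the preimage whose SHA-256 equals that hash. A minimal sketch in Python of checking that property. The `L402 <macaroon>:<preimage-hex>` header layout is an assumption about the protocol rather than something stated in the post, the function names are hypothetical, and macaroon verification is omitted entirely.

```python
import hashlib
import hmac
import secrets


def payment_hash(preimage: bytes) -> str:
    """Lightning payment hash: hex SHA-256 of the payment preimage."""
    return hashlib.sha256(preimage).hexdigest()


def parse_l402_authorization(header: str):
    """Split 'L402 <macaroon>:<preimage-hex>' (assumed layout) into parts."""
    scheme, _, credentials = header.partition(" ")
    if scheme != "L402":
        raise ValueError("not an L402 credential")
    macaroon, sep, preimage_hex = credentials.rpartition(":")
    if not sep:
        raise ValueError("missing preimage")
    return macaroon, bytes.fromhex(preimage_hex)


def proves_payment(header: str, expected_hash: str) -> bool:
    """True if the preimage in the header hashes to the invoice's payment hash."""
    _, preimage = parse_l402_authorization(header)
    return hmac.compare_digest(payment_hash(preimage), expected_hash)


# Simulated flow: the invoice commits to a hash, payment reveals the preimage.
preimage = secrets.token_bytes(32)
invoice_hash = payment_hash(preimage)
header = f"L402 example-macaroon:{preimage.hex()}"
print(proves_payment(header, invoice_hash))  # prints "True"
```

Because the check needs only the credential and the hash, a server can verify it without looking up any per-user payment record.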

107

19K

Nicolas Pinto@npinto·11 Mar

a simple while true loop, becoming an agentic loop, becoming a ralph loop, becoming an auto research loop, back to a simple loop
frontier (lab?) land grabbing for engagement or true ignorance

132

Nicolas Pinto@npinto·6 Mar

This is not how privacy works...

Percy Liang@percyliang

I stopped using ChatGPT a few months ago. Since then, I have been only using oa-chat. All chat history is stored locally. Each query is sent to OpenAI under a temporary key which is unlinkable to any other query. I’m not a privacy nut, but oa-chat is such a convenient drop-in replacement for your favorite AI assistant that there’s no reason not to try it out.

220

Nicolas Pinto reposted

Arthur B.@ArthurB·24 Feb

@AnthropicAI If you build a machine capable of designing bioweapons and put it online with inadequate security, that's negligence.

480

Nicolas Pinto reposted

Lightning Labs⚡️🌐@lightning·11 Feb

AI agents can write code, send emails, and make phone calls. But they still can't transact. Today we're fixing that. Releasing a new set of tools that give agents native access to the Lightning Network: lnget for automatic L402 payments, MCP for node operations, remote signing for key isolation, and scoped credentials for spending control. The machine-payable web starts now. And bitcoin makes it possible. ⚡🦞 lightning.engineering/posts/2026-02-…

100

329

1.5K

291.7K

Nicolas Pinto@npinto·20 Feb

All frontier AI products are wildly unsafe. Feels like 2yo billionaires handling loaded weapons all day, even the ones who spun up competitors over safety concerns. 2026/27 gonna be bananas (not the nano kind). Worst of them all: Grok, as usual ;-). @grok -- why?!

137
