Leon Derczynski ⚒️☁️🏔️🌲

25.5K posts

@LeonDerczynski

NLP/ML/language/security. Principal research scientist @NVIDIA, & Prof @ITUkbh. Views ostensibly professional. llmsec stan acct

Seattle / Copenhagen · Joined January 2012
1.1K Following · 6.5K Followers
Pinned Tweet
Leon Derczynski ⚒️☁️🏔️🌲
Proud to announce: 💫 garak - an LLM vulnerability scanner💫 🔎 Check if a model is susceptible to common attacks 🦜 Supports HuggingFace, OpenAI, ggml, Cohere, ... 🔧 >70 probes: prompt injection, false claims, toxicity, encoding evasion, .. github.com/leondz/garak/
7
72
337
63.3K
Dmitrii Kovanikov @ChShersh
I miss the toxic tech culture of receiving 50+ code review comments. Now everyone just pushes AI slop and doesn’t care as long as CI is green.
142
194
3.9K
133.5K
Brendan Dolan-Gavitt @moyix
Wow you ask a couple INNOCENT questions about running some INNOCENT binaries on your wife's laptop and all of a sudden it's some kind of big investigation
6
0
51
2.9K
Nikos Aletras @nikaletras
Hot take: *ACL venues should align with the broader ML community by adopting a ~25-30% acceptance rate for Main, while maintaining ~15% for Findings. This is rather straightforward and would not massively impact the overall size of our conferences. ⬇️
3
0
9
2.1K
Nikos Aletras @nikaletras
I find really bizarre that main conference acceptance rates at top #NLProc venues (*ACL, EMNLP) hover around 19-22%, while top ML venues (NeurIPS, ICLR, ICML) consistently accept 25-30% or more. This is problematic for our community for two main reasons ⬇️
8
10
75
18.1K
Leon Derczynski ⚒️☁️🏔️🌲 retweeted
🎭 @deepfates
im curious about the history of "prompts" -- as in, the > or $ or whatever in your terminal, not the text string for AI models. unfortunately this is impossible to google now
31
6
332
21.6K
Leon Derczynski ⚒️☁️🏔️🌲 retweeted
Leon Derczynski ⚒️☁️🏔️🌲
@segyges "form carries meaning" section is /really/ short on argumentation and citations. the arguments are definitely out there, in droves - do you not want to bring them in instead of the sloppy hand-waving here?
1
0
3
1.3K
SE Gyges @segyges
"Stochastic Parrots" is a meme that won't go away. It seemed important enough to do a rundown of everything that is wrong with the technical or "philosophy of language" side of the paper (which is everything). 👇
8
14
137
21.7K
Leon Derczynski ⚒️☁️🏔️🌲 retweeted
NVIDIA AI Developer @NVIDIAAIDev
30M downloads and counting for the NVIDIA Nemotron family on @huggingface 🤗 We're grateful for the incredible community that has made this possible. Get started with Nemotron: nvda.ws/4q8MtVP
8
21
117
18.8K
sarah @s4rah_dev
@guyrleech they are call biscuits here in North America so I wasn’t too sure….
8
0
3
1.5K
Alex Greenland @ajrgd
@s4rah_dev please tell me this is bait? afternoon tea, with jam and clotted cream. big debate on which goes on first (devon vs cornwall). i'm devon
2
0
19
532
Arvid Kahl @arvidkahl
Is there already such a thing as an "external hardware LLM" like we have external hard drives? Instead of having to run/maintain a local model, I want an inference machine that I can just plug in and point my prompts at. Single GPU, maybe a few in parallel. Who's building this?
620
143
3K
404.7K
Sara Hooker @sarahookr
Hands down one of the best meals yet I have had in London.
50
11
1.3K
114.4K
Obsolete Sony @ObsoleteSony
What are your top 3 must-play PS1 games for someone who has never experienced the console before?
266
92
962
56.5K
Leon Derczynski ⚒️☁️🏔️🌲 retweeted
Tri Dao @tri_dao
Nvidia continues to put out some of the strongest and fastest open models. Pretraining and post training data are released as well, something very few orgs have done
Bryan Catanzaro @ctnzr

Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.

7
23
379
28.9K
Leon Derczynski ⚒️☁️🏔️🌲 retweeted
Nathan Lambert @natolambert
It's an honor to be competing with Nvidia for the best models with open data, checkpoints, and code. Super excited about Nemotron 3 and Nvidia's new focus on fully open models in 2025.
Bryan Catanzaro @ctnzr

Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.

3
26
354
31K
Soumye Singhal @soumyesinghal
🚀 Nemotron 3 Nano is live! Had a blast post-training this model with a cracked team. Its strong for its size, and highly efficient at inference. And true to @nvidia's open release style: weights (BF16/FP8/base) + training recipes + code + datasets. HF: huggingface.co/collections/nv… Blog + Nano tech report: nvda.ws/48RusVt
5
8
47
2.2K
Chris 🇨🇦 @llm_wizard
Nemotron 3 Nano is released (and it's a banger), but more importantly: It's just as open as the last one, and it's ONLY THE FIRST ONE. Super and Ultra: OTW
> Model Weights - RELEASED
> Pre-Training Data - MOSTLY RELEASED
> Post-Training Data - MOSTLY RELEASED
> RL Environments - RELEASED (as well as a library to train the model)
Tech Report, blogs, videos, guides, AND MORE.
9
13
158
10.6K