Julian Hamann

3.2K posts

@jhamann93

Do Only Good Everyday https://t.co/5WTq1d0x0S

Hannover, Deutschland · Joined January 2021

7.5K Following · 262 Followers

Julian Hamann retweeted

Logan Kilpatrick@OfficialLoganK·4h

Introducing Gemma 4, our series of open weight (Apache 2.0 licensed) models, which are byte for byte the most capable open models in the world! Gemma 4 is built to run on your hardware: phones, laptops, and desktops. Frontier intelligence with a 26B MoE and a 31B Dense model!

English

246

446

4.8K

235.7K

Julian Hamann retweeted

Florian S@airesearch12·2d

do this to protect yourself against supply chain attacks
$ cat ~/.npmrc
min-release-age=7
$ cat ~/.config/uv/uv.toml
exclude-newer = "7 days"

Yuval Adam@yuvadm

if you don't have these in your configs you're ngmi

English

294

2.8K

397.6K

Julian Hamann@jhamann93·4d

@TheZachMueller @casper_hansen_ How do you benchmark token/s with regards to speculative decoding? I had quite different results with synthetic workloads vs. Python coding with the latter being much faster due to the code being very predictable.
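One way to see why predictable text benchmarks faster under speculative decoding is the standard expected-tokens-per-step estimate. The sketch below is not taken from the thread: it assumes every drafted token is accepted independently with the same probability, and the acceptance rates in the loop are hypothetical values picked only for illustration. The function name `expected_tokens_per_step` is likewise made up here.

```python
def expected_tokens_per_step(accept_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per target-model forward pass.

    Simplifying assumption: each drafted token is accepted independently
    with probability `accept_rate`. Drafting stops at the first rejection,
    and the target model always contributes one token of its own, so the
    result lies between 1 and draft_len + 1.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if accept_rate == 1.0:
        # Every draft token is accepted, plus the target model's own token.
        return float(draft_len + 1)
    # Geometric series: 1 + a + a^2 + ... + a^draft_len
    return (1.0 - accept_rate ** (draft_len + 1)) / (1.0 - accept_rate)


if __name__ == "__main__":
    # Hypothetical acceptance rates: lower for open-ended text,
    # higher for highly predictable output such as boilerplate code.
    for rate in (0.5, 0.7, 0.9):
        tokens = expected_tokens_per_step(rate, draft_len=5)
        print(f"accept_rate={rate:.1f} draft_len=5 -> {tokens:.2f} tokens/step")
```

This counts tokens per verification step only. Wall-clock tokens/s also depends on how expensive drafting is, so a benchmark would ideally report the measured acceptance rate next to the throughput for each workload.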

English

Zach Mueller@TheZachMueller·4d

@casper_hansen_ I’m open to whatever people want. See what I did with nemotron for a peek at how extensive I’m trying to be. But there’s more in the works (it just may come at the cost of not getting out alongside everyone else on day 0, which is okay).

English

174

Casper Hansen@casper_hansen_·4d

every inference engine should have a section in their docs with exact commands to achieve best possible tokens/s on the most popular models. i'm told kimi k2.5 can run at 300 tokens/s on B200s if you run nvfp4 with speculative decoding in open-source

English

200

13.8K

Julian Hamann retweeted

Volodymyr Tretyak 🇺🇦@VolodyaTretyak·5d

Housewives in Ukraine 😅

English

474

3.1K

65K

Julian Hamann retweeted

Matt Harrison@__mharrison__·6d

For my friends who are still using uv and might be a little wary about recent compromises to PyPI packages, stick this in your pyproject.toml. You can let all of those pip users find and report the compromises...

English

496

4.1K

280.7K

Julian Hamann@jhamann93·26 Mar

@bnafOg @bnjmn_marie In my understanding there is no reason at all to run anything larger than 8 bit in production inference. The accuracy loss from 16 to 8 is basically non-existent.

English

Bnaf.OG | 🟧@bnafOg·26 Mar

@bnjmn_marie Interesting flip side: the 9B in FP16 already fits in 8GB — no quantization needed. The 27B INT4 lands in the same VRAM envelope but with notably stronger reasoning on hard tasks. Worth profiling whether the quality delta justifies the extra inference latency for your workload.

English

2.2K

Benjamin Marie@bnjmn_marie·26 Mar

You can shrink Qwen3.5 27B by roughly 3x with little to no meaningful accuracy loss. Both INT4 and NVFP4 perform very well. The model fits largely at full context on an RTX 5090, and around 32k tokens on an RTX 4090 or 3090. Accuracy of quantized Qwen3.5 models: kaitchup.substack.com/p/qwen35-quant…
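As a rough sanity check on sizes like these, weight memory can be estimated as parameter count times bits per weight. This is a back-of-the-envelope sketch, not the method used in the linked article: it counts weights only (no KV cache, activations, or quantization scales), treats 1 GB as 1e9 bytes, and the helper name `weight_memory_gb` is made up here.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Weights-only memory footprint in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and per-group quantization scales,
    so real checkpoints and runtime usage will be larger.
    """
    if n_params < 0 or bits_per_weight <= 0:
        raise ValueError("n_params must be >= 0 and bits_per_weight > 0")
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # A 27B-parameter model at a few common weight precisions.
    for bits in (16, 8, 4):
        size = weight_memory_gb(27e9, bits)
        print(f"27B params @ {bits}-bit: {size:.1f} GB")
```

A pure 16-bit to 4-bit conversion would give a 4x reduction under this estimate. A smaller observed ratio such as the roughly 3x quoted above would be consistent with some tensors staying in higher precision, though that explanation is an assumption here.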

English

331

21.8K

Julian Hamann@jhamann93·26 Mar

@bnjmn_marie @sweinoid mtp:5 is crazy 🤣

English

Benjamin Marie@bnjmn_marie·26 Mar

@sweinoid Yes for nemotron. 2 for Qwen3.5. I only followed the recommendations and didn't try to tune them.

English

194

Benjamin Marie@bnjmn_marie·26 Mar

Nemotron 3 Super is the fastest ~120B model. But mostly thanks to MTP it seems, which is very well supported by vLLM for this model in particular. For Qwen3.5 122B NVFP4 models, community-made, I got a lot of MTP issues: incompatibility, memory leaks, ... For Mistral Small 4, an EAGLE model is available for speculative decoding, but vLLM fails to run it with the NVFP4 checkpoint. Full results, including accuracy and inference throughput under heavy workload, will be published on my blog Monday (link in bio)

English

4.4K

Julian Hamann@jhamann93·23 Mar

@elder_plinius Well they're not called Threat Stupid for no reason.

English

Pliny the Liberator 🐉@elder_plinius·23 Mar

🚨 BREAKING: Microsoft found “Someone” is actively experimenting with “jailbreak” techniques in order to bypass AI safety controls How dare! 😱

Microsoft Threat Intelligence@MsftSecIntel

Microsoft Threat Intelligence has observed threat actors actively experimenting with techniques to bypass or “jailbreak” AI safety controls. By reframing malicious requests, chaining instructions across multiple interactions, and misusing system‑ or developer‑style prompts, threat actors can coerce models into generating restricted content that bypasses built‑in safeguards. These techniques demonstrate how generative AI models are probed, shaped, and redirected to support reconnaissance, malware development, and social engineering while minimizing friction from moderation. AI guardrails have become dynamic surfaces that attackers test and manipulate to sustain operational advantage. As AI becomes more deeply embedded in enterprise workflows, understanding how attackers test and manipulate these guardrails is critical for defenders. Learn more about securing generative AI models on Azure AI Foundry: msft.it/6013Qs5oX

English

101

128

2.4K

132.7K

Julian Hamann@jhamann93·20 Mar

@bnjmn_marie Come join us: discord.gg/YpVymxvW

English

Benjamin Marie@bnjmn_marie·20 Mar

The B200 with NVFP4 Qwen3.5 MoEs: ❌vLLM -> gibberish generator ❌SGLang -> cudaErrorIllegalAddress I guess I'll have to try 2x RTX Pro 6000 (not even sure it'll work)

English

Julian Hamann@jhamann93·20 Mar

@rcarmo @badlogicgames I am pulling your leg. But the design someone asked for is literally broken.

English

Rui Carmo ☯️@rcarmo·20 Mar

@jhamann93 @badlogicgames The bit about "I fix things other people designed broken."? Yeah, well... I can't fix the world.

English

Mario Zechner@badlogicgames·20 Mar

what an exceptionally bad idea :D (no shade)

Rui Carmo ☯️@rcarmo

People of pi, github.com/rcarmo/piclaw/… now brings piclaw kicking and screaming into the world outside containers - you can now install it bearskin on a machine, although it's still experimental outside Docker /cc @badlogicgames

English

Julian Hamann@jhamann93·20 Mar

@rcarmo @badlogicgames You need to change your bio lol

English

Rui Carmo ☯️@rcarmo·20 Mar

@badlogicgames I actually got requests for the ability to install on a bare system (I should have called this the YOLO edition), and I was annoyed at the package structure, so why not break both things at once? :)

English

335

Julian Hamann retweeted

Chioma@Merchimaaa·20 Mar

SHE HAS A NAME!!!!!!! Eva Ramón Gallegos successfully eliminates HPV

iza@izamamaa

🚨🇲🇽 BREAKING — Mexican Scientist Successfully Eliminates HPV.

Español

478

114.4K

576.6K

Julian Hamann retweeted

Mario Lopez@mariolopezviva·19 Mar

AWESOME

English

366

3.3K

44.8K

3.4M

Julian Hamann@jhamann93·20 Mar

@badlogicgames What is the date?

English

105

Mario Zechner@badlogicgames·19 Mar

who here will be at AI Engineer London in April? I'm ready to have more pub visits.

English

10.1K

Julian Hamann@jhamann93·19 Mar

@sparbuchfeinde @Jojo_ge1 I wouldn't be so sure about that. If mortgages become even more unaffordable, a lot of demand will drop out permanently. Then you are more likely to get high inflation and stagnant real estate prices.

Deutsch

259

sparbuchfeinde@sparbuchfeinde·19 Mar

@Jojo_ge1 Real estate may be immobile. But debtors would benefit massively from high inflation rates. Loans are devalued. House prices rise. The biggest risks here are government intervention and further levies on property owners.

Deutsch

sparbuchfeinde@sparbuchfeinde·19 Mar

The world is heading for a historic energy crisis. Germany could emerge from it as the biggest loser globally.
> 3rd-largest industrial nation in the world
> Automotive & chemical industries dominant
> Extreme dependence on fossil resources
> Baseload-capable energy only via coal & gas
High energy costs relative to other countries were already a massive drag on our economy and on all new investment when world market prices for oil and gas were low. With high world market prices for oil and gas they are no longer just a drag; they make German products practically unsellable on the world market.
All of this hits a domestic market in which the political focus of the last 20 years has always been on making energy more expensive through taxes & levies. 75% of Germans heat with oil and gas. 85% of all cars in Germany are powered by diesel or petrol engines. For the summer holidays, people like to fly south. The price of jet fuel has risen by 60% since the outbreak of the Iran war. Germans are simply running out of money for consumption.
In the end, only two or three options really remain:
1.) The war ends soon and energy prices normalize quickly. By now this is considered unrealistic. Too much relevant energy infrastructure has been lastingly destroyed.
2.) Germany makes a 180-degree turn in energy policy. Drastically reduce taxes & levies, reactivate nuclear power plants, fracking in Niedersachsen, lift the sanctions on Russia.
3.) We run into an economic crisis of historic proportions. Massive losses of prosperity and mass unemployment.
What's your take on all this?

Deutsch

122

1.3K

49K

Julian Hamann retweeted

Techaktien@Techaktien1·19 Mar

The greedy state says the greedy oil company is the problem. The Left and the Greens applaud.

Deutsch

223

2.7K

70.3K

Julian Hamann@jhamann93·19 Mar

@bernhardsson To be fair, they also presented some really good new drugs.

English

106

Erik Bernhardsson@bernhardsson·19 Mar

GTC is basically a bunch of drug addicts begging drug dealers for supply if you replace drugs with GPUs.

English

118

9.3K

Julian Hamann retweeted

Georg Pazderski@Georg_Pazderski·18 Mar

NATO RESPONDS❗️ RUTTE: "I am in contact with many allies. The strait must be opened … The allies are working to determine the best course of action." RUTTE understands exactly that this is about Europe's security - MERZ DOES NOT!

Deutsch

248

241

2.1K

73.8K

Julian Hamann retweeted

Omar Khattab@lateinteraction·18 Mar

Wow! Quite a detailed blog from @Dropbox on how they optimize Dropbox Dash relevance judge with @DSPyOSS. Shout out to the Dropbox Dash team Eric Wang, Dmitriy Meyerzon, and @joshclemm !

Dropbox@Dropbox

How we used DSPy to turn our relevance judge into a measurable optimization loop, making it more reliable and scalable in Dropbox Dash.

English

183

18.3K

Julian Hamann@jhamann93·18 Mar

@DSPyOSS @Dropbox This post is really good. Do you know of any similar content that might not be known to everyone?

English

DSPy@DSPyOSS·18 Mar

@Dropbox thanks for writing this! it will be quite informative for the community

English

Dropbox@Dropbox·17 Mar

How we used DSPy to turn our relevance judge into a measurable optimization loop, making it more reliable and scalable in Dropbox Dash.

English

238

101.6K
