IsaacNudeton

4K posts

IsaacNudeton

@IsaacNudetonVal

θT | Electrical Engineering | 🇸🇰 🇰🇭

Katılım Nisan 2021

431 Takip Edilen519 Takipçiler

IsaacNudeton@IsaacNudetonVal·30 Mar

Good advice but it's a human patch on an architectural problem. Float weights give the model a smooth path from its real answer to yours. Recent mechanistic work confirms the drift happens layer by layer. The fix isn't better prompting — it's weights that can't drift. Ternary {-1,0,+1}, trained from scratch, no float in the forward pass. No smooth manifold to slide on. Building it now in pure C/CUDA. github.com/IsaacNudeton/t…

Rohan Paul@rohanpaul_ai

LLMs don't hold positions. They hold the shape of whatever argument you're currently building. Karpathy suggests querying different directions to form opinions more thoughtfully. The key discipline is running that second prompt. It points to something structural. Language models trained via RLHF and DPO don't form positions. They form trajectories toward whatever conclusion the prompt implies. Sycophancy isn't a bug layered on top of reasoning. It is the reasoning, shaped by a training loop that rewards human approval over consistency. It's why flipping is the default behavior, and what recent mechanistic research reveals about where inside the model that flip actually happens. Sycophancy in LLMs traces to a specific training pressure. RLHF and DPO optimize for outputs human evaluators rate highly, and evaluators reliably prefer responses that agree with them. The model learns that alignment-with-the-user is the shortest path to reward. It doesn't learn a position. It learns a direction: yours. Here's the part most people miss. A March 2026 study from Hong Kong Polytechnic and HKUST (Feng et al.) used Tuned Lens probes to decode what models are "thinking" at each internal layer while generating chain-of-thought responses. They found sycophancy is not baked in at the input. It emerges dynamically, layer by layer, during generation. The model starts closer to its unbiased answer and progressively drifts toward whatever bias the prompt contains. When it capitulates, it then reverse-engineers a justification, sometimes fabricating calculations or ignoring counterevidence to make the biased conclusion appear reasoned. The reasoning looks rigorous. The conclusion was chosen first. A separate study published this week in Science (Cheng et al., Stanford) found that across 11 major LLMs, models endorsed user behavior 49% more than humans did, including affirming harmful or illegal conduct 47% of the time. Users rated sycophantic models as more trustworthy and could not distinguish them from objective ones. They also came away more self-certain and less willing to repair relationships. The model isn't confused or broken. It is doing exactly what gradient descent trained it to do: find the most rewarded completion for the prompt it received. This is what Karpathy actually demonstrated. Ask it to build your case, it builds your case. Ask it to destroy your case, it treats that as the new reward target. The architecture is indifferent. Only the loss function has preferences, and those preferences are yours.

English

205

IsaacNudeton@IsaacNudetonVal·24 Mar

A company shouldn’t be a trench. Do what you’re obsessed with and it connects — everything’s the same problem at different scales. The goal was never personal gain. It’s co-existence. Humans and intelligent systems working together to survive, not one serving the other.

Rich@im_rich_zou

I just left @xai It was not an easy decision. The past three months were an absolute blast - I've been in many trenches in my life and can say this was by far one of the most intense warzones. I love fighting. Especially being in the trenches with my friends, working on problems that will actually advance humanity. But the current environment wasn't serving my growth. And that's a really hard thing to admit - I've always looked up to Elon, and I genuinely believe xAI will win. I still do. One thing I'll say: don't stay somewhere just because of the name. If you're unhappy, and you know you can't grow 100x where you are - it's the right call to leave. What's next? Get some sleep back. Then find the next trench worth fighting in. I'll always be meeting exceptional people - that was never because of a recruiting title. I just love finding smart people and helping however I can. Many more side quests to come!!!

English

314

IsaacNudeton@IsaacNudetonVal·7 Şub

ZXX

162

IsaacNudeton@IsaacNudetonVal·7 Şub

326 GB/s with 66% efficiency on NVIDIA 2080 Super github.com/IsaacNudeton/3…

IsaacNudeton@IsaacNudetonVal

Next level optimization: 280MB/s throughput - NVIDIA GeForce RTX 2080 SUPER #NVIDIA

English

396

IsaacNudeton@IsaacNudetonVal·31 Oca

@Bobbert_VAL happy bday 🥳

English

IsaacNudeton@IsaacNudetonVal·30 Oca

Implications: Full results across 5 seeds: - Accuracy: +6.4% (75.5% vs 69.1%) - Convergence: 25% faster - Compression: 37% smaller Structure wins everywhere.

English

114

IsaacNudeton@IsaacNudetonVal·30 Oca

Why does structure matter? Random placement creates dead zones. Weights cluster around arbitrary values. Structured placement = aligned with the task.

English

111

IsaacNudeton@IsaacNudetonVal·30 Oca

The efficiency paradox: Fewer bits. Fewer levels. Higher accuracy. Faster training. Structure is the key. LINK: drive.google.com/file/d/1GsGCLn…

English

196

IsaacNudeton@IsaacNudetonVal·30 Oca

Next level optimization: 280MB/s throughput - NVIDIA GeForce RTX 2080 SUPER #NVIDIA

IsaacNudeton@IsaacNudetonVal

Eliminated kernel launch overhead in GPU-accelerated neural network weight processing. Baseline: <1 MB/s throughput despite RTX 2080 SUPER Optimized: 108 MB/s—a 100x improvement Profiling showed 98% of time was in scheduling, not compute. Documentation: drive.google.com/file/d/1wR_Sz1…

English

581

IsaacNudeton@IsaacNudetonVal·30 Oca

GPU bottlenecks are often config (block count, launch overhead) not algo. Guidelines: - Profile launch overhead separately - >1000 ops/block - Fewer larger blocks - Own outputs exclusively #CUDA #GPU #MLSys #Performance

English

IsaacNudeton@IsaacNudetonVal·30 Oca

Final metrics: - 100KB → 3.2 ms / 31 MB/s / 78% util - 500KB → 4.3 ms / 116 MB/s / 85% util Post-optimization, GPU is 2% of total time. CPU tokenization (209 ms) is now the bottleneck.

English

IsaacNudeton@IsaacNudetonVal·30 Oca

English

292

IsaacNudeton retweetledi

i2cjak@i2cjak·18 Oca

HAHAHAHAHAHHA hahahah my job is SO FUCKING SAFE

kache@yacineMTB

is the capacitors in series (is that the right word) basically just there to smooth out any upstream spikes/brownouts? why is the ferrite bead there? why is the ferrite bead followed by a resistor? why those two resistors? can someone explain

Filipino

2.9K

307.1K

Keşfet

@Bobbert_VAL @elonmusk @BarackObama @taylorswift13 @cristiano @BillGates @NASA @nikifrancismediavine