Werner Kasselman

30 posts

@wernerk_au

Joined November 2024

32 Following · 3 Followers

Werner Kasselman@wernerk_au·1h

Fair. Possibly, if you want to see everything happen, the way you used to build a PC, carefully install an OS, and hack DOS startup + HighMEM to get as much free memory below 640K as possible. Try these, they help, and they're FOSS, MIT: verivus-oss/llm-cli-gateway | It does increase costs. medium.com/@wernerk/the-codex-review-gate-how-we-made-ai-agents-review-each-others-work-59e9ff5465f9

Daniel Jeffries@Dan_Jeffries1·6h

TLDR, they're lying. Or they're not doing anything real. Or they're letting their agents commit mistakes. Or all of the above. I've built three different features at once in worktrees, but the quality noticeably goes way down and my real person / human-in-the-loop reviewer regularly has to do more work to fix what the agents screwed up. The problem is twofold. First, you can't context switch this much on hard problems, so your personal problem solving drops dramatically. Second, these models still make massive mistakes and dumb decisions, and if you are not watching them closely, more slips through. And no, an agent overlord/orchestrator does not fix this, because the model itself is stupid and part of the problem. It's a band-aid.

Sandi Slonjšak@sandislonjsak

My brain simply can't run more than 3 agents in parallel and QA all of their work. I am sure I am not the only one. How do people manage 10 at once? Or they simply lie?

6.1K

Werner Kasselman@wernerk_au·1h

SLSA provenance on publish (proves who built what from where), SBOM on every release (proves what's inside), --ignore-scripts by default (blocks the postinstall attack vector entirely), lockfile-only installs in CI, and dependency scanning that flags new transitive deps between versions. For Node/Python: these controls aren't optional anymore. For Rust: cargo has no postinstall hooks, so this specific attack path doesn't exist there, though build.rs scripts and proc macros still run code at build time. What actually helps: SLSA Build L3 provenance, i.e. cryptographic proof that a package was built from a specific commit by a specific CI pipeline, not by a compromised token on someone's laptop. An SBOM (SPDX/CycloneDX) generated at build time gives you a machine-readable inventory of exactly what shipped. When the next axios happens, you grep your SBOM for plain-crypto-js instead of hoping someone remembers. npm's --provenance flag and Sigstore attestations are free, so use them. On the consumer side, npm audit signatures verifies provenance. Socket/Snyk/Semgrep catch malicious postinstall hooks that npm audit misses.
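
The "grep your SBOM" step above could look roughly like the following. This is a minimal sketch, assuming a CycloneDX-style JSON document with a flat top-level `components` array; the function name `find_component` and the sample SBOM are made up for illustration, and real SBOMs may nest components, which this sketch does not walk.

```python
import json


def find_component(sbom: dict, name: str) -> list[str]:
    """Return 'name@version' for each top-level component whose name matches exactly."""
    hits = []
    for comp in sbom.get("components", []):
        if comp.get("name") == name:
            hits.append(f"{comp['name']}@{comp.get('version', '?')}")
    return hits


# Hypothetical SBOM fragment, not taken from any real release.
sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "axios", "version": "1.14.1"},
    {"type": "library", "name": "plain-crypto-js", "version": "4.2.1"}
  ]
}
""")

print(find_component(sbom, "plain-crypto-js"))
```

An empty list means the package did not ship in that release, at least as far as the top-level inventory records.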

293

Gergely Orosz@GergelyOrosz·7h

Supply chain attacks are becoming more frequent, and far more serious. What are sensible practices to protect against these when using Node or Python packages? I assume pinning versions is the bare minimum; for those with security teams / tools: what else do you do / can you do?

Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now. Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence
If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.

514

84.5K

Werner Kasselman@wernerk_au·1h

What I do: 1) --ignore-scripts by default, explicit allowlist for packages that need them 2) Socket/npm audit in CI 3) Lockfile-only installs in prod (no caret ranges resolving to latest) 4) Vendor high-risk native deps. The axios attack exploited caret ranges + npm postinstall hooks.
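
Item 1 in the list above (an explicit allowlist for packages that need install scripts) can be checked against a lockfile. The following is a minimal sketch, assuming an npm lockfile v2/v3 layout where each entry under `packages` may carry a `hasInstallScript` flag; the function name, the sample lockfile, and the allowlist contents are hypothetical.

```python
def unapproved_install_scripts(lockfile: dict, allowlist: set[str]) -> list[str]:
    """List 'name@version' for locked packages that declare install scripts
    but are not on the allowlist."""
    flagged = []
    for path, meta in lockfile.get("packages", {}).items():
        if not path:
            continue  # the "" key is the root project entry
        # "node_modules/a/node_modules/@scope/b" -> "@scope/b"
        name = path.rsplit("node_modules/", 1)[-1]
        if meta.get("hasInstallScript") and name not in allowlist:
            flagged.append(f"{name}@{meta.get('version', '?')}")
    return sorted(flagged)


# Hypothetical lockfile fragment for illustration only.
lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/axios": {"version": "1.14.1"},
        "node_modules/plain-crypto-js": {"version": "4.2.1", "hasInstallScript": True},
        "node_modules/esbuild": {"version": "0.21.5", "hasInstallScript": True},
    },
}

print(unapproved_install_scripts(lock, allowlist={"esbuild"}))
```

A CI job could fail the build whenever this list is non-empty, which forces a review before a new install script is ever allowed to run.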

Werner Kasselman@wernerk_au·1h

@GergelyOrosz The axios attack is why pinning isn't enough: ^1.14.0 auto-pulled the poisoned 1.14.1. The RAT was hidden behind a 2-layer encoded postinstall hook that self-deletes after execution. Lockfiles help, but only if you also block postinstall scripts you haven't audited.
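
To see why ^1.14.0 picks up 1.14.1 while an exact pin does not, here is a minimal sketch of caret matching. It assumes plain MAJOR.MINOR.PATCH versions with a major of 1 or higher, and ignores prerelease tags and the narrower 0.x caret rules; `caret_allows` is an illustrative helper, not npm's resolver.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def caret_allows(base: str, candidate: str) -> bool:
    """True if `candidate` satisfies ^base: same major version and not older than base."""
    b, c = parse(base), parse(candidate)
    if b[0] == 0:
        raise ValueError("0.x caret ranges follow narrower rules; not handled in this sketch")
    return c[0] == b[0] and c >= b


print(caret_allows("1.14.0", "1.14.1"))  # the caret range accepts the newer patch release
print("1.14.0" == "1.14.1")              # an exact pin would not
```

A lockfile freezes the resolved version, but only for installs that honor it (for example `npm ci`); a fresh resolve against the caret range is free to move forward.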

315

Werner Kasselman@wernerk_au·7h

@maxleiter almost the right question... it's much deeper though

Max Leiter@maxleiter·11 Mar

It's never been easier to make a programming language. what would one built for LLMs look like? Concise in terms of tokens, some form of contracts/formal verification, GC, declared side-effects?

2.6K

Werner Kasselman@wernerk_au·9h

@garrytan superpower it: add github.com/verivus-oss/ll…

Garry Tan@garrytan·10h

Improving /review skill on GStack tonight, just shipped

105

9.6K

Werner Kasselman@wernerk_au·9h

@davidmarcus That's why I built this. It averages 12 rounds to review technical documents and 6 rounds to review technical implementations. It's FOSS, MIT: medium.com/@wernerk/the-codex-review-gate-how-we-made-ai-agents-review-each-others-work-59e9ff5465f9

157

David Marcus@davidmarcus·2d

It's wild that every time you run a Codex code review from Claude Code, it finds critical issues. Not 95% of the time, 100%.

239

2.9K

642.4K

Werner Kasselman@wernerk_au·10h

I just published The Codex Review Gate: How We Made AI Agents Review Each Other’s Work medium.com/p/the-codex-re…

Werner Kasselman@wernerk_au·10h

Nice work! We built something similar but broader: llm-cli-gateway wraps all three CLIs (Claude, Codex, and Gemini) behind a single MCP server. Same delegation pattern, plus session management across all three, async job orchestration, and approval gates with risk scoring. The interesting part is having multiple LLMs review each other's work: Codex catches logic bugs Claude misses, and Gemini finds security issues neither catches. github.com/verivus-oss/ll… medium.com/@wernerk/why-cli-wrapping-beats-api-proxying-for-multi-llm-development-1ddd492c7153

dominik kundel@dkundel·20h

I built a new plugin! You can now trigger Codex from Claude Code! Use the Codex plugin for Claude Code to delegate tasks to Codex or have Codex review your changes using your ChatGPT subscription. Start by installing the plugin: github.com/openai/codex-p…

208

337

3.1K

Werner Kasselman@wernerk_au·16h

@sandislonjsak Agents managing agents :)

Sandi Slonjšak@sandislonjsak·1d

My brain simply can't run more than 3 agents in parallel and QA all of their work. I am sure I am not the only one. How do people manage 10 at once? Or they simply lie?

747

1.6K

281K

Werner Kasselman@wernerk_au·16h

I went a different way: I created an MCP server that allows Codex to call the Claude, Gemini and Codex CLIs, and allows Claude to call Codex, Gemini and Claude. You get the idea... Interestingly, Claude doing designs, with Codex reviewing and improving them and Gemini doing security, ROCKS

577

Romain Huet@romainhuet·20h

We’ve seen Claude Code users bring in Codex for code review and use GPT-5.4 for more complex tasks, so we thought: why not make that easier? Today we’re open sourcing a plugin for it! You can call Codex from Claude Code with your ChatGPT subscription. We love an open ecosystem!

dominik kundel@dkundel

259

317

773K

Werner Kasselman@wernerk_au·1d

@garrytan check out sqry.dev, it will make your coding more efficient

Garry Tan@garrytan·1d

Absolutely insane week for agentic engineering. 37K LOC per day across 5 projects. Still speeding up.

293

779

1.3M

Werner Kasselman@wernerk_au·1d

@karpathy have a go for your next AI coding session: sqry.dev
Parses code the way a compiler does: AST nodes, not tokens
"Semantic" = structural meaning from the parse tree
A Function node with Calls edges to other Function nodes is a fact, not a guess
Knows the difference between 28 node kinds and 26 edge kinds with metadata (Calls{argument_count, is_async}, Imports{alias, is_wildcard})
Traverses a real graph: callers, callees, dependency impact, cycle detection, cross-language FFI linking
Deterministic: same code → same graph, every time
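
The idea of Function nodes joined by Calls edges can be illustrated with Python's standard `ast` module. This is only a sketch of the general technique and not how sqry itself is implemented; `call_edges` and the sample source are made up for illustration, the argument count covers positional arguments only, and calls inside nested functions are also attributed to the enclosing function.

```python
import ast


def call_edges(source: str) -> list[tuple[str, str, int]]:
    """Return sorted (caller, callee, positional_arg_count) edges for direct
    calls to plain names made inside each function definition."""
    tree = ast.parse(source)
    edges = []
    for fn in ast.walk(tree):
        if isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for node in ast.walk(fn):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    edges.append((fn.name, node.func.id, len(node.args)))
    return sorted(edges)


def callers(edges: list[tuple[str, str, int]], target: str) -> list[str]:
    """Names of functions that call `target`."""
    return sorted({caller for caller, callee, _ in edges if callee == target})


# Hypothetical sample source.
sample = """
def load(path):
    return parse(read(path), strict=True)

def parse(text, strict=False):
    return text

def read(path):
    return path
"""

edges = call_edges(sample)
print(edges)
print(callers(edges, "read"))
```

Because the edges come from the parse tree and are sorted before being returned, running this twice on the same source yields the same list.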

Werner Kasselman@wernerk_au·16 Dec

@JNampijinpa Thank you!

Werner Kasselman retweeted

Jacinta Nampijinpa@JNampijinpa·15 Dec

My thoughts in the wake of the Bondi Beach terrorist attack.

507

5.2K

91.9K

Werner Kasselman retweeted

Benny Johnson@bennyjohnson·13 Sep

STEPHEN MILLER: “This is not fringe anymore. Tape, after tape. Federal workers, bureaucrats, educators, professors, nurses… people celebrating and cheering the assassination of Charlie Kirk! There is a domestic terrorist movement growing in this country”

3.7K

21.3K

92.6K

2.3M

Werner Kasselman@wernerk_au·12 Sep

youtube.com/watch?v=azE7nq…

YouTube

Werner Kasselman@wernerk_au·12 Sep

@r3tarddownunder @abbiechatfield May God have mercy.

303

R3tards Down Under@r3tarddownunder·11 Sep

This is a video that @AbbieChatfield posted back in July. I wonder if she will be sending the person who ASSASSINATED Charlie Kirk some “fan mail” as she suggests in the video. When these things actually happen they want you to forget videos like this exist. Absolute scum.