Werner Kasselman

30 posts

@wernerk_au

Joined November 2024
32 Following 3 Followers
Werner Kasselman
Werner Kasselman@wernerk_au·
Fair, possibly. If you want to see everything happen, the way you used to build a PC, carefully install an OS, and hack the DOS startup + HighMEM to get as much free below 640K as possible, try these. They help, and they're FOSS, MIT: verivus-oss/llm-cli-gateway | It does increase costs. medium.com/@wernerk/the-codex-review-gate-how-we-made-ai-agents-review-each-others-work-59e9ff5465f9
0
0
0
18
Daniel Jeffries
Daniel Jeffries@Dan_Jeffries1·
TLDR, they're lying. Or they're not doing anything real. Or they're letting their agents commit mistakes. Or all of the above. I've built three different features at once in worktrees, but the quality noticeably goes way down and my real person / human-in-the-loop reviewer regularly has to do more work to fix what the agents screwed up. The problem is twofold. First, you can't context switch this much on hard problems, so your personal problem solving drops dramatically. Second, these models still make massive mistakes and dumb decisions, and if you are not watching them closely, more slips through. And no, an agent overlord/orchestrator does not fix this, because the model itself is stupid and part of the problem. It's a band-aid.
Sandi Slonjšak@sandislonjsak

My brain simply can't run more than 3 agents in parallel and QA all of their work. I am sure I am not the only one. How do people manage 10 at once? Or they simply lie?

13
2
70
6.1K
Werner Kasselman
Werner Kasselman@wernerk_au·
SLSA provenance on publish (proves who built what from where), SBOM on every release (proves what's inside), --ignore-scripts by default (blocks the postinstall attack vector entirely), lockfile-only installs in CI, and dependency scanning that flags new transitive deps between versions. For Node/Python, these controls aren't optional anymore. For Rust: cargo has no install hooks, so this specific postinstall vector doesn't apply, although build.rs scripts and proc macros still run code at build time.

What actually helps: SLSA Build L3 provenance, i.e. cryptographic proof that a package was built from a specific commit by a specific CI pipeline, not by a compromised token on someone's laptop. An SBOM (SPDX/CycloneDX) generated at build time gives you a machine-readable inventory of exactly what shipped. When the next axios happens, you grep your SBOM for plain-crypto-js instead of hoping someone remembers. npm's --provenance flag and Sigstore attestations are free; use them.

On the consumer side: npm audit signatures verifies provenance. Socket/Snyk/Semgrep catch malicious postinstall hooks that npm audit misses.
0
0
1
293
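The "grep your SBOM" step from the post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a CycloneDX JSON SBOM with a top-level `components` array, and both the function name `find_components` and the SBOM fragment are made up for the example.

```python
import json
from typing import Iterator


def find_components(sbom: dict, name: str) -> Iterator[tuple[str, str]]:
    """Yield (name, version) for every CycloneDX component called `name`."""
    stack = list(sbom.get("components", []))
    while stack:
        component = stack.pop()
        if component.get("name") == name:
            yield component["name"], component.get("version", "unknown")
        # CycloneDX allows components to nest further components.
        stack.extend(component.get("components", []))


# Hypothetical, hand-written SBOM fragment for illustration only.
EXAMPLE_SBOM = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "axios", "version": "1.14.1"},
    {"type": "library", "name": "plain-crypto-js", "version": "4.2.1"}
  ]
}
""")

hits = list(find_components(EXAMPLE_SBOM, "plain-crypto-js"))
print(hits)  # [('plain-crypto-js', '4.2.1')]
```

In practice the SBOM would be loaded from the release artifact with `json.load` rather than embedded in the script.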
Gergely Orosz
Gergely Orosz@GergelyOrosz·
Supply chain attacks are becoming more frequent, and far more serious. What are sensible practices to protect against these when using Node or Python packages? I assume pinning versions is the bare minimum; for those with security teams / tools: what else do you do / can you do?
Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.
The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise.
This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now.
Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:
• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence
If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.

95
38
514
84.5K
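One way to "audit your lockfiles" for a dependency that did not exist before is to diff the package sets of two lockfiles. The sketch below assumes npm's `package-lock.json` layout (lockfileVersion 2 or 3), where a `packages` object is keyed by install path. The helper names and the two lockfile fragments are hypothetical.

```python
def package_names(lockfile: dict) -> set[str]:
    """Collect package names from the `packages` map of a package-lock.json."""
    names = set()
    for path in lockfile.get("packages", {}):
        if not path:  # the "" key is the root project itself
            continue
        # "node_modules/a/node_modules/@scope/b" -> "@scope/b"
        names.add(path.rsplit("node_modules/", 1)[-1])
    return names


def new_packages(old_lock: dict, new_lock: dict) -> list[str]:
    """Return names present in the new lockfile but absent from the old one."""
    return sorted(package_names(new_lock) - package_names(old_lock))


# Hypothetical lockfile fragments for illustration only.
OLD_LOCK = {"packages": {"": {}, "node_modules/axios": {"version": "1.14.0"}}}
NEW_LOCK = {
    "packages": {
        "": {},
        "node_modules/axios": {"version": "1.14.1"},
        "node_modules/plain-crypto-js": {"version": "4.2.1"},
    }
}

added = new_packages(OLD_LOCK, NEW_LOCK)
print(added)  # ['plain-crypto-js']
```

Run in CI against the base branch's lockfile, a non-empty result is a prompt for a human to look at the new dependency before merging.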
Werner Kasselman
Werner Kasselman@wernerk_au·
What I do:
1) --ignore-scripts by default, explicit allowlist for packages that need them
2) Socket/npm audit in CI
3) Lockfile-only installs in prod (no caret ranges resolving to latest)
4) Vendor high-risk native deps.
The axios attack exploited caret ranges + npm postinstall hooks.
0
0
0
32
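The allowlist in point 1 can be checked mechanically. This sketch assumes an npm `package-lock.json` (lockfileVersion 2 or 3), which marks packages that declare preinstall/install/postinstall scripts with `hasInstallScript: true`. The function name, the allowlist contents, and the lockfile fragment are hypothetical.

```python
def unapproved_install_scripts(lockfile: dict, allowlist: set[str]) -> list[str]:
    """List packages that declare install scripts but are not on the allowlist."""
    flagged = []
    for path, meta in lockfile.get("packages", {}).items():
        if not path:  # skip the root project entry
            continue
        name = path.rsplit("node_modules/", 1)[-1]
        if meta.get("hasInstallScript") and name not in allowlist:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical lockfile fragment for illustration only.
LOCKFILE = {
    "packages": {
        "": {"name": "my-app"},
        "node_modules/esbuild": {"version": "0.21.5", "hasInstallScript": True},
        "node_modules/plain-crypto-js": {"version": "4.2.1", "hasInstallScript": True},
        "node_modules/axios": {"version": "1.14.1"},
    }
}

flagged = unapproved_install_scripts(LOCKFILE, allowlist={"esbuild"})
print(flagged)  # ['plain-crypto-js']
```

A CI job could fail the build when the list is non-empty, which forces an explicit decision before any new install script runs.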
Werner Kasselman
Werner Kasselman@wernerk_au·
@GergelyOrosz The axios attack is why pinning isn't enough: ^1.14.0 auto-pulled the poisoned 1.14.1. The RAT was hidden behind a 2-layer encoded postinstall hook that self-deletes after execution. Lockfiles help, but only if you also block postinstall scripts you haven't audited.
1
0
1
315
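Why ^1.14.0 picks up 1.14.1 follows from how caret ranges work: the range admits any version at or above the base that does not change the leftmost non-zero component. A minimal sketch of that rule, ignoring prerelease and build tags (the function names are hypothetical, and this is not npm's own implementation):

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a plain MAJOR.MINOR.PATCH string; prerelease tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def caret_allows(base: str, candidate: str) -> bool:
    """Return True if `candidate` satisfies the caret range ^base."""
    low, cand = parse(base), parse(candidate)
    if cand < low:
        return False
    # The upper bound bumps the leftmost non-zero component of the base.
    if low[0] > 0:
        high = (low[0] + 1, 0, 0)
    elif low[1] > 0:
        high = (0, low[1] + 1, 0)
    else:
        high = (0, 0, low[2] + 1)
    return cand < high


print(caret_allows("1.14.0", "1.14.1"))  # True: the poisoned release is in range
print(caret_allows("1.14.0", "2.0.0"))   # False: a new major is out of range
```

An exact version in package.json, or an install that only reads the lockfile, removes this resolution step.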
Max Leiter
Max Leiter@maxleiter·
It's never been easier to make a programming language. what would one built for LLMs look like? Concise in terms of tokens, some form of contracts/formal verification, GC, declared side-effects?
8
1
16
2.6K
Garry Tan
Garry Tan@garrytan·
Improving /review skill on GStack tonight, just shipped
Garry Tan tweet media
23
4
105
9.6K
Werner Kasselman
Werner Kasselman@wernerk_au·
@davidmarcus that's why I built this - average 12 rounds to review technical documents, 6 rounds to review technical implementations. It's FOSS, MIT. medium.com/@wernerk/the-codex-review-gate-how-we-made-ai-agents-review-each-others-work-59e9ff5465f9
0
0
0
157
David Marcus
David Marcus@davidmarcus·
It's wild that every time you run a Codex code review from Claude Code, it finds critical issues. Not 95% of the time, 100%.
239
88
2.9K
642.4K
Werner Kasselman
Werner Kasselman@wernerk_au·
Nice work! We built something similar but broader: llm-cli-gateway wraps all three CLIs (Claude, Codex, and Gemini) behind a single MCP server. Same delegation pattern, plus session management across all three, async job orchestration, and approval gates with risk scoring. The interesting part: having multiple LLMs review each other's work. Codex catches logic bugs Claude misses, Gemini finds security issues neither catches. github.com/verivus-oss/ll… medium.com/@wernerk/why-cli-wrapping-beats-api-proxying-for-multi-llm-development-1ddd492c7153
0
0
2
76
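The idea of several models reviewing each other's work in rounds can be sketched as a simple loop. This is not llm-cli-gateway's code or API. The `review_gate` function, the reviewer stubs, and the round limit are all hypothetical stand-ins, with plain string checks in place of real Claude, Codex, or Gemini calls.

```python
from typing import Callable

Reviewer = Callable[[str], list[str]]        # returns a list of findings
Reviser = Callable[[str, list[str]], str]    # returns a revised draft


def review_gate(
    draft: str,
    reviewers: dict[str, Reviewer],
    revise: Reviser,
    max_rounds: int = 10,  # arbitrary cap chosen for the example
) -> tuple[str, int, bool]:
    """Loop until every reviewer reports no findings or the round cap is hit."""
    for round_no in range(1, max_rounds + 1):
        findings = [
            f"{name}: {issue}"
            for name, review in reviewers.items()
            for issue in review(draft)
        ]
        if not findings:
            return draft, round_no, True
        draft = revise(draft, findings)
    return draft, max_rounds, False


# Stub reviewers standing in for real model calls.
def logic_reviewer(text: str) -> list[str]:
    return ["unfinished TODO"] if "TODO" in text else []


def security_reviewer(text: str) -> list[str]:
    return ["shell=True is risky"] if "shell=True" in text else []


def fix_everything(text: str, findings: list[str]) -> str:
    return text.replace("TODO", "DONE").replace("shell=True", "shell=False")


final, rounds, approved = review_gate(
    "run(cmd, shell=True)  # TODO: validate cmd",
    {"logic": logic_reviewer, "security": security_reviewer},
    fix_everything,
)
print(rounds, approved)  # 2 True
```

The second round is the clean pass: round one finds two issues, the reviser fixes both, and round two returns no findings.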
dominik kundel
dominik kundel@dkundel·
I built a new plugin! You can now trigger Codex from Claude Code! Use the Codex plugin for Claude Code to delegate tasks to Codex or have Codex review your changes using your ChatGPT subscription. Start by installing the plugin: github.com/openai/codex-p…
208
337
3.1K
1M
Sandi Slonjšak
Sandi Slonjšak@sandislonjsak·
My brain simply can't run more than 3 agents in parallel and QA all of their work. I am sure I am not the only one. How do people manage 10 at once? Or they simply lie?
747
39
1.6K
281K
Werner Kasselman
Werner Kasselman@wernerk_au·
I went a different way: I created an MCP server that allows Codex to call the Claude, Gemini and Codex CLIs, and allows Claude to call Codex, Gemini and Claude. You get the idea... Interestingly, Claude doing designs, with Codex reviewing and improving them and Gemini doing security, ROCKS
0
0
0
577
Romain Huet
Romain Huet@romainhuet·
We’ve seen Claude Code users bring in Codex for code review and use GPT-5.4 for more complex tasks, so we thought: why not make that easier? Today we’re open sourcing a plugin for it! You can call Codex from Claude Code with your ChatGPT subscription. We love an open ecosystem!
dominik kundel@dkundel

I built a new plugin! You can now trigger Codex from Claude Code! Use the Codex plugin for Claude Code to delegate tasks to Codex or have Codex review your changes using your ChatGPT subscription. Start by installing the plugin: github.com/openai/codex-p…

259
317
5K
773K
Garry Tan
Garry Tan@garrytan·
Absolutely insane week for agentic engineering. 37K LOC per day across 5 projects. Still speeding up.
Garry Tan tweet media
293
28
779
1.3M
Werner Kasselman
Werner Kasselman@wernerk_au·
@karpathy have a go for your next AI coding session. sqry.dev
Parses code the way a compiler does: AST nodes, not tokens
"Semantic" = structural meaning from the parse tree
A Function node with Calls edges to other Function nodes is a fact, not a guess
Knows the difference between 28 node kinds and 26 edge kinds with metadata (Calls{argument_count, is_async}, Imports{alias, is_wildcard})
Traverses a real graph: callers, callees, dependency impact, cycle detection, cross-language FFI linking
Deterministic: same code → same graph, every time
0
0
0
1
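The notion of Function nodes joined by Calls edges can be illustrated with Python's standard `ast` module. This is not sqry's implementation, only a small sketch of the same idea. The edge tuple layout is made up for the example, and `is_async` here simply means the calling function is an `async def`, which is an assumption about what that field might represent.

```python
import ast


def call_edges(source: str) -> list[tuple[str, str, int, bool]]:
    """Return sorted (caller, callee, argument_count, is_async) edges.

    Only direct calls to plain names are recorded. Calls inside nested
    functions are also attributed to the enclosing function.
    """
    tree = ast.parse(source)
    edges = []
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        is_async = isinstance(func, ast.AsyncFunctionDef)
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                arg_count = len(node.args) + len(node.keywords)
                edges.append((func.name, node.func.id, arg_count, is_async))
    return sorted(edges)


def callers_of(edges: list[tuple[str, str, int, bool]], target: str) -> list[str]:
    """Names of functions that call `target`."""
    return sorted({caller for caller, callee, _, _ in edges if callee == target})


EXAMPLE = '''
def load(path):
    return parse(read(path), strict=True)

def read(path):
    return path

def parse(text, strict):
    return text
'''

edges = call_edges(EXAMPLE)
print(edges)  # [('load', 'parse', 2, False), ('load', 'read', 1, False)]
print(callers_of(edges, "read"))  # ['load']
```

Because the edges come from the parse tree and are sorted, running the function twice on the same source gives the same list.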
Werner Kasselman reposted
Jacinta Nampijinpa
Jacinta Nampijinpa@JNampijinpa·
My thoughts in the wake of the Bondi Beach terrorist attack.
507
1K
5.2K
91.9K
Werner Kasselman reposted
Benny Johnson
Benny Johnson@bennyjohnson·
STEPHEN MILLER: “This is not fringe anymore. Tape, after tape. Federal workers, bureaucrats, educators, professors, nurses… people celebrating and cheering the assassination of Charlie Kirk! There is a domestic terrorist movement growing in this country”
3.7K
21.3K
92.6K
2.3M
R3tards Down Under
R3tards Down Under@r3tarddownunder·
This is a video that @AbbieChatfield posted back in July. I wonder if she will be sending the person who ASSASSINATED Charlie Kirk some “fan mail” as she suggests in the video. When these things actually happen they want you to forget videos like this exist. Absolute scum.
356
245
1.7K
68.6K
Werner Kasselman reposted
George Christensen
George Christensen@NationFirstAust·
I stand with Senator @JNampijinpa Price. How about you?
George Christensen tweet media
283
1.3K
8.7K
76.7K