Greg Pazo

417 posts

@Gregpazo

If HR is reading this my opinions are my own and do not reflect those of my employer. Staff Engineer @lululemon solo founder @indubitably_ai

PNW Joined December 2025

184 Following, 26 Followers

Greg Pazo retweeted

Ben Sigman@bensig·1d

My friend Milla Jovovich and I spent months creating an AI memory system with Claude. It just posted a perfect score on the standard benchmark - beating every product in the space, free or paid.

It's called MemPalace, and it works nothing like anything else out there. Instead of sending your data to a background agent in the cloud, it mines your conversations locally and organizes them into a palace - a structured architecture with wings, halls, and rooms that mirrors how human memory actually works.

Here is what that gets you:

→ Your AI knows who you are before you type a single word - family, projects, preferences, loaded in ~120 tokens
→ Palace architecture organizes memories by domain and type - not a flat list of facts, a navigable structure
→ Semantic search across months of conversations finds the answer in position 1 or 2
→ AAAK compression fits your entire life context into 120 tokens - 30x lossless compression any LLM reads natively
→ Contradiction detection catches wrong names, wrong pronouns, wrong ages before you ever see them

The benchmarks:

100% recall on LongMemEval: first perfect score ever recorded. 500/500 questions. Every question type at 100%.
92.9% on ConvoMem, more than 2x Mem0's score.
100% on LoCoMo: every multi-hop reasoning category, including temporal inference which stumps most systems.

No API key. No cloud. No subscription. One dependency. Runs on your machine. Your memories never leave.

MIT License. 100% Open Source.

github.com/milla-jovovich…

412

747

7.5K

2.5M

Greg Pazo@Gregpazo·1d

@blended_jpeg No sound?

GIF

29.6K

yaml@blended_jpeg·1d

bad claude..

550

1.9K

20.4K

4.5M

Greg Pazo@Gregpazo·2d

@dexhorthy @steipete Agreed. Why even offer an SDK? It’s not going to help you right now when the underlying company constantly changes its usage policy

518

dex@dexhorthy·2d

Concerning re: anthropic. The previous narrative just went out the window.

Reports of openclaw usage with the plain sdk (as was supposedly permitted) now being blocked based on system prompt, even if using the claude agent sdk.

I was previously a little to the anthropic side of the spectrum on this because of EXACTLY one argument - “third party harnesses don’t use caching properly, can’t be controlled with feature flags, etc”

If they are blocking use of the claude agent sdk wholesale in openclaw, then this completely invalidates that argument and I desire an answer as to what is allowed and why.

I am disappointed that the communications thus far have failed to articulate the reasons here, and it does make it harder to trust whatever they say next. However I will maintain cautious optimism that there is a good explanation for all this beyond the cheap “rug pull” “evil” “kill all the startups” jeers

dex@dexhorthy

like I’ve said a few times, well within TOS to do this, they built the model, if they wanna give you inference at pennies on the dollar on the condition that you use their harness, great, they have the right to do this.

On this topic in particular, I don’t understand the “evil” or “rugpull” jeers. There was never any promise to give people cheap inference. Before the claude code max plan we were all paying per token to use this stuff. And we’re more or less happy to do it (sure the VC funding helps).

Every enterprise I know pays per token because when you use subsidized inference, YOU are the product. “Have some cheap code, in exchange for helping to train the next gen of models”. You can hate on that particular behavior if you want but nobody is making you take part in that particular market dynamic.

Do I wanna see a world where model companies take some of their massive financial gains and use that to pull everybody up? Of course. I hope it happens some day.

An allegory perhaps: If a public e-bike company gave you a subscription on rides and you proceeded to go around ripping out batteries, sticking them in your own bike, and riding around town, you’d get banned for that too. Especially if your bike was poorly wired and overloaded the batteries/caused them to flame up etc. Banning that behavior would deliver far better results for the people who were using the system as designed

443

142K

Greg Pazo@Gregpazo·3d

Saw this coming. Anyone building CC with Codex subscription? 👀

Boris Cherny@bcherny

Starting tomorrow at 12pm PT, Claude subscriptions will no longer cover usage on third-party tools like OpenClaw. You can still use these tools with your Claude login via extra usage bundles (now available at a discount), or with a Claude API key.

Greg Pazo@Gregpazo·4d

@ScriptedAlchemy 😈 I knew someone would be interested in this too

2.2K

Supreme Leader Wiggum@ScriptedAlchemy·4d

Gpt 5.4 with the leaked Claude code is one of the best experiences ever.

281

33K

Greg Pazo@Gregpazo·5d

@om_patel5 Well played. It was real but you definitely fooled a few people with this fake post 😂

107

9.9K

Om Patel@om_patel5·5d

ANTHROPIC JUST ADMITTED THE ENTIRE CLAUDE CODE LEAK WAS FAKE

IT WAS AN APRIL FOOLS PRANK

> the "leaked" source code was fabricated
> the Mythos model benchmarks were made up
> the 3,000 internal documents were planted on purpose

anthropic deliberately seeded fake assets in a staging environment they intentionally left unsecured. the npm source map pointed to a completely fabricated codebase

44 fictional feature flags. invented codenames. just enough sloppy details to make it irresistible to post about

the tamagotchi pet system. the undercover mode. the engineer named ollie. all fake.

they called the project "Capybara" internally because capybaras sit calmly while everything around them escalates

they even apologized to cybersecurity researchers at Cambridge and LayerX who spent their entire weekend analyzing documents written on a Thursday afternoon

anthropic just pulled off the greatest april fools in tech history. well played

351

342

3.5K

811.6K

Greg Pazo retweeted

Clément Dumas@Butanium_·6d

⚠️ Supply chain attack in progress: someone is squatting Anthropic-internal npm package names targeting people trying to compile the leaked Claude Code source. `color-diff-napi` and `modifiers-napi` — both registered today, same person, disposable email. Do NOT install them. 🧵

384

2.2K

302.7K

Greg Pazo@Gregpazo·31 Mar

@DavidKPiano

GIF

399

David K 🎹@DavidKPiano·31 Mar

Ironically this is probably the first time that actual humans are carefully & thoroughly reviewing the Claude Code codebase

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip

253

5.1K

201.5K

Greg Pazo@Gregpazo·31 Mar

First thing I was curious about is their a/b testing and oooh boy. Here goes:

- 900+ flags
- Uses GrowthBook open-source a/b testing tool
- Flags are prefixed “tengu_”

Flags being tested:

- Context window size
- Ultrathink budget and behavior
- Opus medium effort default
- Compaction thresholds
- Evidence injection into prompts
- Thinking token budget caps
- Brevity of response controls

What did you find interesting in the flags constantly under test? You can also use Claude code with the source code below to examine which cohort you are in: control or test.

Chaofan Shou@Fried_rice

Claude code source code has been leaked via a map file in their npm registry! Code: …a8527898604c1bbb12468b1581d95e.r2.dev/src.zip

Greg Pazo retweeted

Andrej Karpathy@karpathy·31 Mar

New supply chain attack this time for npm axios, the most popular HTTP client library with 300M weekly downloads. Scanning my system I found a use imported from googleworkspace/cli from a few days ago when I was experimenting with gmail/gcal cli. The installed version (luckily) resolved to an unaffected 1.13.5, but the project dependency is not pinned, meaning that if I did this earlier today the code would have resolved to latest and I'd be pwned. It's possible to personally defend against these to some extent with local settings e.g. release-age constraints, or containers or etc, but I think ultimately the defaults of package management projects (pip, npm etc) have to change so that a single infection (usually luckily fairly temporary in nature due to security scanning) does not spread through users at random and at scale via unpinned dependencies. More comprehensive article: stepsecurity.io/blog/axios-com…

Feross@feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages.

The latest axios@1.14.1 now pulls in plain-crypto-js@4.2.1, a package that did not exist before today. This is a live compromise.

This is textbook supply chain installer malware. axios has 100M+ weekly downloads. Every npm install pulling the latest version is potentially compromised right now.

Socket AI analysis confirms this is malware. plain-crypto-js is an obfuscated dropper/loader that:

• Deobfuscates embedded payloads and operational strings at runtime
• Dynamically loads fs, os, and execSync to evade static analysis
• Executes decoded shell commands
• Stages and copies payload files into OS temp and Windows ProgramData directories
• Deletes and renames artifacts post-execution to destroy forensic evidence

If you use axios, pin your version immediately and audit your lockfiles. Do not upgrade.
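The "audit your lockfiles" advice in the post above can be sketched as a small script. This is a minimal illustration, not an official tool: it assumes an npm `package-lock.json` in the v2/v3 layout (a top-level `packages` object keyed by install path), and the names `FLAGGED`, `find_flagged`, and `scan_file` are hypothetical helpers made up for this example. It only flags the two versions named in the post, which may not be the full set of affected releases.

```python
import json
from pathlib import Path

# Versions named in the post above (assumption: not an exhaustive advisory list).
FLAGGED = {
    "axios": {"1.14.1"},
    "plain-crypto-js": {"4.2.1"},
}


def find_flagged(lock: dict, flagged: dict = FLAGGED) -> list:
    """Return sorted (install_path, name, version) tuples for flagged installs.

    Assumes the npm lockfile v2/v3 layout, where the top-level "packages"
    object is keyed by install path, e.g. "node_modules/axios".
    """
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Nested installs look like "node_modules/a/node_modules/axios";
        # the package name is whatever follows the last "node_modules/".
        name = path.rpartition("node_modules/")[2]
        version = meta.get("version")
        if version in flagged.get(name, ()):
            hits.append((path, name, version))
    return sorted(hits)


def scan_file(lockfile: str) -> list:
    """Load a lockfile from disk and report flagged installs."""
    lock = json.loads(Path(lockfile).read_text(encoding="utf-8"))
    return find_flagged(lock)
```

Calling `scan_file("package-lock.json")` in a project root returns an empty list when none of the flagged versions are installed. Checking the lockfile rather than `package.json` matters here because an unpinned range in `package.json` says nothing about which version actually resolved.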

553

1.1K

10.5K

1.4M

Greg Pazo@Gregpazo·30 Mar

@rambling_28 Dear X algorithm. You finally get me. Looks delicious!

はぐれリーマン28号🍺🍶@rambling_28·29 Mar

I heard that if you post photos of delicious-looking meat, you get replies from Americans lol

Japanese

7.4K

63K

1.8M

Greg Pazo@Gregpazo·30 Mar

@RhysSullivan I have an upstream sync skill that I use. It’s a set of instructions describing how I deviated in my fork and what code I want to keep. Other than that I pull in all upstream updates and use the agent to manually cherry-pick where upstream is modifying my custom fork

338

Rhys@RhysSullivan·29 Mar

there's an interesting new trend with how i interact with OSS

i'll clone the repos & run them locally, whenever it's missing a feature i'll just prompt it to add it

did this with t3 code this weekend to add ssh support, cmd + k, fast mode

now i have a set of patches i like and want to be able to share them, unapply them, and stay up to date with the remote version

the primitives of git / github do make this possible, but it's a very painful flow

would love to see someone solve it well

340

57.7K

Greg Pazo@Gregpazo·29 Mar

An interesting anecdote here is that review agents work incredibly well in legacy codebases. They can identify and help you prioritize longstanding technical debt that has been holding your software back. They can also find security vulnerabilities and other “TODO: fix this” items that the team before you left behind for the next engineer to deal with. Software starts to rot the moment it is deployed. The landscape shifts, and the product you built is behind within days, sometimes hours.

206

Mario Zechner@badlogicgames·29 Mar

949

41.8K

Greg Pazo retweeted

Andrej Karpathy@karpathy·28 Mar

- Drafted a blog post
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol

The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions, just make sure to ask different directions and be careful with the sycophancy.

1.7K

2.4K

31.1K

3.3M

Greg Pazo@Gregpazo·29 Mar

@doodlestein thoughts?

Greg Pazo@Gregpazo·29 Mar

What if rate limits, throttling, usage limits, etc. are not actually the labs reining in their coding agents for profit, but a way to free up compute for a new model on the way? We’re going to see an exponentially better model from both labs within a month. Mark it.

Greg Pazo@Gregpazo·29 Mar

😳

chiefofautism@chiefofautism

someone at ANTHROPIC just showed CLAUDE finding ZERO DAY vulnerabilities in a live conference demo. claude has found a zero day in Ghost, 50,000 stars on github, never had a critical security vulnerability in its entire history... it found the blind SQL injection in 90 minutes, stole the admin api key, then did the exact same thing to the linux kernel

Greg Pazo@Gregpazo·28 Mar

@miguelbetegon @madbyk That is actually better than I expected it to look 👀

bete@miguelbetegon·27 Mar

your agents should be monitoring the situation

Burak Yigit Kaya@madbyk

So I'm starting this friendly rivalry between the CLI team and the Web UI team for @sentry dashboards. Which one looks better?

3.1K

Greg Pazo retweeted

Cheng Lou@_chenglou·28 Mar

My dear front-end developers (and anyone who’s interested in the future of interfaces): I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept): Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow

1.3K

8.3K

65.1K

23.5M

Greg Pazo@Gregpazo·28 Mar

@doodlestein @badlogicgames @willmcgugan 😂

Jeffrey Emanuel@doodlestein·28 Mar

@badlogicgames @willmcgugan I would contribute my traces, but I use unspeakable language with my clankers that I’d be very embarrassed to have anyone else see…

2.9K

Mario Zechner@badlogicgames·28 Mar

we as software engineers are becoming beholden to a handful of well funded corporations. while they are our "friends" now, that may change due to incentives. i'm very uncomfortable with that. i believe we need to band together as a community and create a public, free to use repository of real-world (coding) agent sessions/traces. I want small labs, startups, and tinkerers to have access to the same data the big folks currently gobble up from all of us. So we, as a community, can do what e.g. Cursor does below, and take back a little bit of control again. Who's with me? cursor.com/blog/real-time…

178

321

2.5K

261.2K
