cat

144 posts

@Ryan_Jarv

Seattle, WA Joined October 2011

458 Following, 423 Followers

cat@Ryan_Jarv·1d

@halvarflake (which reminds me I gotta look into that still, almost forgot)

cat@Ryan_Jarv·1d

@halvarflake For me it was when it told me I was wrong and I got mad, but then I slept on it and realized I was wrong, though by sheer coincidence I ended up being right in the end

Halvar Flake@halvarflake·1d

Example scenarios where Claude was extremely stupid in the last days: 1) Arguing that a change that moved work into multiple Python processes made GIL contention worse because the total number of CPU-seconds spent waiting for the GIL had gone up.

cat@Ryan_Jarv·1d

@nmatt0 @h4x1n_dev Be careful it’s an expensive addiction

Matt Brown@nmatt0·1d

@h4x1n_dev 5x. Might have to upgrade 😅

Matt Brown@nmatt0·1d

Cutting it close today...

cat@Ryan_Jarv·2d

@glcst Granted, I suppose I can see myself in the mood for one and not the other. So maybe not exactly comparable.

cat@Ryan_Jarv·2d

@glcst I think yakiniku is kinda already that tbh. Not the same, but, I have a hard time imagining anything better.

Glauber Costa@glcst·2d

Is there anything that is good in this world that the Japanese cannot perfect? Just bring them to Texas and get them BBQ as they all desire. Then just wait 5 years. And the world will know BBQ to a level unimaginable today.

cat@Ryan_Jarv·2d

@Mh56199612 @brunoborges Reads more like a spec I imagine, one can be interpreted differently than the other.

Majd Haidar@Mh56199612·3d

@brunoborges Why NEVER and not Never, interpreted as a more strict order ?🤔

Bruno Borges@brunoborges·3d

Best part of Claude Code's source is this system prompt. github.com/alex000kim/cla… (#L43-L54)

cat@Ryan_Jarv·2d

@michael_timbs Hate how it’s so hard to know, but tbh likely also A/B testing

cat@Ryan_Jarv·2d

@michael_timbs Actually probably was me now that I think about it.

Michael Timbs@michael_timbs·2d

Convinced Anthropic diverted all their compute towards agents building DMCA Takedown Notices for hosting leaked source code. Explains poor model perf

cat@Ryan_Jarv·2d

Ahg I’ve forgotten how to be patient enough to do shit myself instead of delegating to Claude

cat@Ryan_Jarv·3d

@Noahpinion But I guess that’s just my opinion

cat@Ryan_Jarv·3d

@Noahpinion Either way it will be fine

Noah Smith 🐇🇺🇸🇺🇦🇹🇼@Noahpinion·4d

Let's hope AI cyber defense beats AI cyber offense, or the internet age is over

Lukasz Olejnik@lukOlejnik

LLMs can now autonomously find serious zero-day security vulnerabilities, create exploits, and use them to attack (hack) a target. This will only improve, and AI will be better at this than any human. youtube.com/watch?v=1sd26p…

cat@Ryan_Jarv·4d

Tbh it’s really not that hard, but might need to design things around the concept for it to be useful I think

cat@Ryan_Jarv·4d

Anyway, hard to see how that could work outside of specific changes and approval + review, and if it would even be useful, but still.. super interested if anyone has tried

cat@Ryan_Jarv·4d

Imagine if you could request changes to a platform as a user and have them implemented and rolled out in less than an hour.

cat@Ryan_Jarv·5d

@nmatt0 @HackerOn2Wheels @rez0__ @ctbbpodcast massive if true

Matt Brown@nmatt0·6d

@HackerOn2Wheels @rez0__ advice from the @ctbbpodcast episode helps with that. Put stuff in your Claude.md like "POC||GTFO", etc. I've also been starting to document a standard threat model via markdown I can give it to sometimes prevent it from overhyping something.

HackerOnTwoWheels@HackerOn2Wheels·6d

My biggest problem with Claude and hacking so far is it tends to be super hyperbolic. Everything is CRITICAL, makes up risks for trivial stuff. I keep having to explain to it real world risks as it wants to “finish” the job with shitty findings.

cat@Ryan_Jarv·27 Mar

@hankein95 @thegrugq Cool will look into this soon. Just curious, have you seen many due to silently failing or falling back to old behavior? Idk if that would even be tracked… but I feel like that might be a thing in the near future.

Hanqing Zhao@hankein95·25 Mar

We've been tracking public CVEs where AI-generated code introduced the vulnerability. vibe-radar-ten.vercel.app 50k+ advisories scanned. Dozens of confirmed cases so far. Claude Code, Copilot, Cursor, and others all show up. Common bug classes include XSS, command injection, SSRF, and path traversal. And these are just the cases that leave metadata traces. The real number is almost certainly higher. Open source, from Georgia Tech SSLab: github.com/HQ1995/vibe-se…

cat@Ryan_Jarv·27 Mar

@nmatt0 Yeah. It always comes off as more of a long standing personal excuse to themselves to not try.

Matt Brown@nmatt0·26 Mar

I kinda pride myself on being able to explain technical stuff to non-technical people. There's nothing that pisses me off quite like a non-technical person assuming they won't understand a word you say, throwing up their hands and exclaiming that they were immutably born lacking the ability to understand tech.

Jack Rhysider 🏴‍☠️@JackRhysider

Thanks for mentioning me Wired! But something I want to tell you is nobody has ever complained to me the show is too technical (except you somehow?) People are a lot more tech savvy than you realize. 9 year olds and grandparents listen and love it.
