harisec

3.9K posts

@har1sec

Interested in web security, bug bounties, machine learning and investing. SolidGoldMagikarp. Orson Kovacs.

SolidGoldMagikarp · Joined September 2010
2.8K Following · 8.4K Followers
harisec retweeted
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly well manually tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger findings:
- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course: you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has a more efficient proxy metric, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
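The QKnorm oversight Karpathy mentions can be illustrated with a minimal, self-contained sketch (pure Python with illustrative names; this is not nanochat's actual code). Normalizing queries and keys to unit length bounds every attention logit to [-1, 1], so without a learned scale multiplier the softmax over the keys stays close to uniform ("too diffuse"); multiplying the cosine logits by a larger scale sharpens the distribution:

```python
import math

def unit(v):
    """Normalize a vector to unit length (the QK-norm step)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_weights(q, keys, scale=1.0):
    """Attention weights with QK-norm: each logit is scale * cos(q, k),
    so without a scale multiplier it is confined to [-1, 1]."""
    logits = [scale * sum(a * b for a, b in zip(unit(q), unit(k)))
              for k in keys]
    return softmax(logits)

q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
diffuse = attention_weights(q, keys, scale=1.0)  # spread across all keys
sharp = attention_weights(q, keys, scale=8.0)    # concentrates on the match
```

Even though the first key matches `q` exactly, at `scale=1.0` it receives only about two thirds of the attention mass; at `scale=8.0` it receives nearly all of it, which is the sharpening effect the agent's multipliers provided.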
Andrej Karpathy tweet media
harisec@har1sec·
@caseyjohnellis In practice it doesn't really matter; there is more than enough public security material for LLMs on the public net. The latest models like Opus 4.6 are insanely great at security tasks.
harisec@har1sec·
@tqbf IMO it's only a matter of time before every security researcher who's being honest with themselves says the same.
harisec retweeted
Thomas H. Ptacek@tqbf·
Nicholas Carlini at [un]prompted. If you know Carlini, you know this is a startling claim.
Thomas H. Ptacek tweet media
harisec retweeted
Daniel Cuthbert@dcuthbert·
Everyone today is a hacker in a sense, but there are very few OG hackers on whose shoulders we stand. Oh dude, Felix “FX” Lindner, you were so much a hacker's hacker and you will be missed. RIP my friend, and thank you.
Daniel Cuthbert tweet media
harisec@har1sec·
@zseano @Mohnad IMO, in a few years it will be a battle royale of the AI bots: the person who has the cheapest/fastest/most intelligent AI bot will make most of the money. Humans will still find the most clever bugs (for a while) but won't make enough money for it to be a full-time job.
zseano@zseano·
@Mohnad We are a few years away in my opinion, but I honestly think in 5 years this industry is going to look very different. Still time to make a couple mill from bug bounties, but IMO this opportunity isn't going to be around forever.
zseano@zseano·
one day we will look back at bug bounty days and think “damn… we had it good”
harisec@har1sec·
@senorarroz Maybe not today, but if you have that wording in your TOS, it's just a matter of time until it happens.
Alex Rice@senorarroz·
Not all AI is created equal! ❌Training GenAI on researcher submissions: No. docs.hackerone.com/en/articles/10… ❎Good ol' ML models: Yes, for 10+ years under our terms. We hear y'all -- making our terms clearer on this new distinction is coming. Thanks for keeping us transparent.
zseano@zseano

@ahacker1_h1 @Radiowebcc Wow… I thought h1 said they were not using our data to train their AI model. I’m going to ask h1 to clarify 🧐

harisec retweeted
Kling AI@Kling_ai·
Kling 3.0 is truly "one giant leap for AI video generation"! Check out this amazing mockumentary from Kling AI Creative Partner Simon Meyer!
harisec retweeted
Boris Cherny@bcherny·
When I created Claude Code as a side project back in September 2024, I had no idea it would grow to be what it is today. It is humbling to see how Claude Code has become a core dev tool for so many engineers, how enthusiastic the community is, and how people are using it for all sorts of things, from coding to devops to research to non-technical use cases. This technology is alien and magical, and it makes it so much easier for people to build and create. Increasingly, code is no longer the bottleneck.

A year ago, Claude struggled to generate bash commands without escaping issues. It worked for seconds or minutes at a time. We saw early signs that it might become broadly useful for coding one day. Fast forward to today: in the last thirty days, I landed 259 PRs -- 497 commits, 40k lines added, 38k lines removed. Every single line was written by Claude Code + Opus 4.5. Claude consistently runs for minutes, hours, and days at a time (using Stop hooks).

Software engineering is changing, and we are entering a new period in coding history. And we're still just getting started.
Boris Cherny tweet media
harisec@har1sec·
@karpathy This is very surprising. I just asked Claude Opus to explain LSP and pasted Karpathy's tweet without any mention of Karpathy, and it inferred it was a tweet from him. What is happening here? claude.ai/share/50e3a1f7…
harisec@har1sec·
@wunderwuzzi23 Good luck with your talk, I'm sure it will be great. I'm in Hamburg but didn't manage to get a 39c3 ticket :(
Johann Rehberger@wunderwuzzi23·
Creating some new last-minute artwork for my CCC talk tomorrow, going Goethe's sorcerer's-apprentice style
Johann Rehberger tweet media
Taelin@VictorTaelin·
TBH, every time the AI fails, I mentally blame it on you. Right now GPT-5.2 noticed that the parser was counting variables incorrectly, causing a linearity bug. The solution? "Ignore the parser counter and implement a separate counter."

At this point, this isn't about being dumb. This is about making a bad decision that under no circumstances would be good. Either we remove the parser counter and use a separate function as the source of truth, or we keep it and fix it. But such insane duct-taping has no place in a serious codebase, and that idea would never have occurred to an intelligence that evolved to learn coding from a pure blank state. It must have been corrupted by evil forces that only humans can produce.

So I can't help but wonder... Who did it learn that from? I blame it on you
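The anti-pattern being described, bolting a second counter onto a parser instead of fixing the one it already has, can be sketched in a few lines. The `Parser` below is entirely hypothetical (not GPT-5.2's actual output or Taelin's codebase); it just makes the "two sources of truth" problem concrete:

```python
class Parser:
    """Hypothetical parser whose variable counter is buggy: it counts
    every occurrence, so a variable seen twice is counted twice."""

    def __init__(self):
        self.var_count = 0

    def parse_var(self, name):
        self.var_count += 1  # bug: double-counts repeated names
        return name


# Duct-tape "fix": leave the broken counter in place and keep a second,
# separate tally. Now two sources of truth can silently disagree.
def count_vars_separately(names):
    return len(set(names))


# Proper fix: repair the parser's own counter so it stays the single
# source of truth (count each distinct name once as it is registered).
class FixedParser(Parser):
    def __init__(self):
        super().__init__()
        self._seen = set()

    def parse_var(self, name):
        if name not in self._seen:
            self._seen.add(name)
            self.var_count += 1
        return name
```

With input `["x", "y", "x"]`, the broken parser reports 3 variables while the separate tally reports 2; the fixed parser makes `var_count` itself report 2, so no shadow counter is needed.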
harisec@har1sec·
@moyix lol, I know the feeling. It's "production ready" all over again
Brendan Dolan-Gavitt@moyix·
Claude has now claimed it found the "smoking gun" in these log files about half a dozen times. We need better gun control, or at least an anti-smoking campaign for the guns so they don't get lung cancer
harisec retweeted
Ivan at Wallarm / API security solution
Looking for a security researcher with a great public profile. Remote. API/AI exploits, with a focus on novel techniques. No XSSers please ;) Reply here or DM. Please repost.
Damian Strobel@damian_89_·
So what do you bug bounty guys say? I found a complex chain (auth bypass, code injection -> code exec in GitLab CI -> exfil of prod secrets via HTTP) - got just 50% of the full P1 payout because I am told that some internal system reported it immediately... Fair? Not fair?
Damian Strobel tweet media
harisec retweeted
Marius Avram@securityshell·
Holy shit… the exploitation of CVE-2025-55182 has reached a new level. There’s now a publicly available Chrome extension on GitHub that automatically scans for and exploits vulnerable sites as you browse. Absolutely wild. 🤦‍♂️
Marius Avram tweet media
harisec@har1sec·
@stdoutput Thank you for publishing your analysis; finally some real information, not just AI slop.
harisec retweeted
Moritz Sanft@stdoutput·
Since I started to analyze CVE-2025-55182 (React/Next.js RCE) at work today, I decided to publish my analysis findings so far, given all the fuss about the vulnerability: github.com/msanft/CVE-202… Feel free to contribute to the search for a proper RCE sink!
harisec retweeted
shubs@infosec_au·
Our Security Research team at @SLCyberSec just published a high-fidelity detection mechanism for the Next.js/RSC RCE (CVE-2025-55182 & CVE-2025-66478) - slcyber.io/research-cente…. There are a lot of PoCs on GitHub that are adding noise to the problem; I hope this helps people!