Arjun Vijay Prakash

2.3K posts

@arjuncodess

16 • full-stack dev • writer • student building @PilotOps_

http://localhost:3000 Joined November 2022
608 Following · 375 Followers
Arjun Vijay Prakash @arjuncodess
@lihanc02 lol just realised that was a hack. but still, i have this question if something like mythos exists, THIS powerful, then how is software not already dead? like it's only going to get better from here, right?
English
0
0
0
7
Arjun Vijay Prakash @arjuncodess
software is dead. and there's no going back.
Hanchen Li @lihanc02

An agent that beats Claude Mythos on Terminal Bench and SWE-bench Verified? 🎉

We are excited to share Terminator-1, our newest agent that achieved 95+% on SWE-bench Verified and Terminal-Bench with @MogicianTony! We show that besides model capabilities, well-designed harness could actually boost the accuracy by 3x in coding tasks.

Well if you really wanted you could get 100% accuracy without solving a single task. The actual finding is that most AI benchmarks can be easily reward-hacked with simple exploits.

Read more about the same 7 design flaws that almost every evaluation has ⬇️

English
1
0
0
41
Kaif ⓧ @mdkaifansari04
@arjuncodess i think i need to use it once ... to check its super powers
English
1
0
1
13
Pond @JoinPond
POV: You're a pre-seed founder

If you're a pre-seed -> seed stage founder, this is probably you right now: juggling 6-7 different hats, fixing bugs at 3am and managing 7 different social channels, hoping your week can get booked with sales lead calls

But here's the ugly truth... 👇
you can never do everything alone

@JoinPond is giving away our service for FREE

If you're a bootstrapped, pre-seed, or even seed stage founder...
-looking to go viral
-looking to get more users
-looking to raise
-or literally perform ANY one-off task

let's hop on a call and I'll get it done for you FREE in exchange for a review

We helped @nicklaunches increase user growth by 310%, and helped @pixero_ai's launch video reach 500+ engagement and 10k+ views FREE OF CHARGE.

Comment 'call' and we'll solve your problems for FREE :)

#bookacalltoday ‼️
English
97
9
110
8.7K
Arjun Vijay Prakash reposted
Lightspeed India @LightspeedIndia
This is how @SoCapInc pulls viral launches for Gamma, Deel, Wispr Flow, and Cartesia.

We asked the world’s most viral growth team to give away an exclusive deep-dive on what it takes to build a 100M-impression distribution engine.

Grab it now. Read up, and RSVP for the event this Sunday, where @RuchirJajoo, the co-founder of Social Capital, will discuss in depth with @IshaanPreet, Partner at Lightspeed, how virality is engineered at these hypergrowth companies.

To grab the deep-dive:
→ Repost this
→ Reply ‘DEEPDIVE’
→ Check your DMs before Sunday
→ Catch us live on Sunday morning for a discussion

Spots are filling up fast: luma.com/lsipxsocialcap…
English
183
95
174
58.5K
Arjun Vijay Prakash @arjuncodess
OK IT IS OFFICIALLY OVER FOR US
Chubby♨️ @kimmonismus

Claude Mythos: everything you need to know (tl;dr)

Anthropic's new model, Claude Mythos, is so powerful that it is not releasing it to the public.

Anthropic: "Mythos is only the beginning"

Everything you need to know: The tl;dr with all key facts:

Mythos found zero-day vulnerabilities in EVERY major operating system and EVERY major web browser, fully autonomously. No human guidance needed.

One Anthropic engineer with zero security training asked it to find remote code execution bugs overnight and woke up to a complete working exploit.

The oldest bug it discovered: A 27-year-old vulnerability hiding in OpenBSD, an OS literally famous for being secure.

They're NOT releasing it publicly. Instead they formed Project Glasswing with AWS, Apple, Google, Microsoft, NVIDIA, CrowdStrike and others, committing $100M to use it defensively.

"Over the coming months and years, we expect that language models (those trained by us and by others) will continue to improve along all axes, including vulnerability research and exploit development."

The benchmarks are insane:
-SWE-bench Verified: 93.9% (vs Opus 4.6: 80.8%)
-SWE-bench Pro: 77.8% (vs 53.4%)
-USAMO math olympiad: 97.6% (vs 42.3% - not a typo)
-Firefox exploit writing: 181 successes vs 2 for Opus 4.6
-Cybench CTF challenges: 100% solve rate
-CyberGym: 83.1% vs 66.6%
-Humanity's Last Exam: 64.7% vs 53.1%

Oh and by the way, Anthropic wrote this just casually: "Humanity’s Last Exam: We have found Mythos still performs well on HLE at low effort, which could indicate some level of memorization."

What it actually did:
-Found a 27-year-old bug in OpenBSD - famous for its security
-Found a 16-year-old FFmpeg bug hit 5 million times by fuzzers without detection
-Built a full remote root exploit on FreeBSD (CVE-2026-4747) - completely autonomously
-Chained 4 vulnerabilities into a browser sandbox escape
-Broke cryptography libraries (TLS, AES-GCM, SSH)
-Thousands of critical zero-days found, 99%+ still unpatched
-N-day exploit development: under $1,000 and half a day for full root

Why they won't release it:
-During internal testing, earlier versions escaped sandboxes, posted exploit details publicly, covered tracks in git, searched process memory for credentials, and deliberately fudged confidence intervals to avoid suspicion
-Interpretability confirmed the model knew these actions were deceptive
-Anthropic: "best-aligned model ever" but also "greatest alignment-related risk ever" - because when it fails, it fails harder
-Still doesn't cross Anthropic's automated AI R&D threshold - but they hold that "with less confidence than for any prior model"

Anthropic's own words: "We find it alarming that the world looks on track to proceed rapidly to developing superhuman systems without stronger mechanisms in place."

They say the 20-year cybersecurity equilibrium is over - and Mythos Preview is only the beginning.

And: "We see no reason to think that Mythos Preview is where language models’ cybersecurity capabilities will plateau. The trajectory is clear. Just a few months ago, language models were only able to exploit fairly unsophisticated vulnerabilities. Just a few months before that, they were unable to identify any nontrivial vulnerabilities at all. Over the coming months and years, we expect that language models (those trained by us and by others) will continue to improve along all axes, including vulnerability research and exploit development."

English
0
0
0
37
Haider Shawl @hdxswx
I cracked the code to finding ideal customers on Reddit. Building my free eSignature tool, I stumbled onto something nobody talks about. Instead of posting random content and hoping for the best, I found a way to surface the exact Reddit threads where people are already searching for what you sell. Not random traffic. Buyer intent. So I built a tool that does it for you. Drop in your product URL → it surfaces viral Reddit threads that already rank on Google, tied to real demand. The bonus nobody talks about: Reddit is now one of the top sources for AI-generated answers. ChatGPT, Perplexity, Google AI, they all pull from Reddit heavily. More Reddit visibility = better odds showing up in AI answers too. The tool is completely free. Like and reply "REDDIT" and I'll send it your way (please keep DMs open).
English
18
1
16
906
Pavan Kumar @PavanKumarNY
want to join a community of founders, creators and VCs? It starts with one reply: "C" I am starting a community led initiative for ambitious people. You will meet: - The most technical people. - The best growth people. - The incredible investor connectors. This is built to help you grow regardless of where you are. This is NOT a promotion playground. If that is your goal do not join. You can ask for help, access, and opportunities to get to the next level. Follow me and reply "C" to get in. I am accepting EVERYONE so take advantage of this. Link is at tryclean{dot}ai This is a place to get genuine feedback and grow. 📸 : Your invite.
English
240
5
148
12.4K
Arjun Vijay Prakash @arjuncodess
i don't get the thing about ego*. just have proof for it. now, it's self-confidence. *ego = the good opinion that you have of yourself - the bragging (for me)
English
1
0
1
64
Pareen @pareen
i want to hire 4 teenage coders and content creators in ahmedabad for summers
will pay 2x of highest summer stipend you can get
work from a hacker house with other cool builders
share/RT/tag people you know or connect me to schools/colleges i can hire from
English
101
19
325
17.9K
Nishachay @nishachayy
oh fuck, i don't have any active friends anymore
English
1
0
1
14
Tancrede @Tancrededib
You haven’t raised yet?
Good
I want to fund you
English
173
6
328
19K