Issa Yaroo

146 posts

@yaroo_dev

AI Engineer | Security Researcher | Public Speaker. I investigate how AI systems are exploited and turn those insights into breach prevention for SaaS leaders.

Lagos, Nigeria · Joined December 2022

23 Following · 8 Followers

Issa Yaroo@yaroo_dev·4d

Why should I burn $100k worth of tokens when I can just pay an Indian dev $1k for the same task? Everyone is saying AI will replace every developer, but the reality is more concerning. AI can generate code faster, but companies still optimize for cost.

Issa Yaroo@yaroo_dev·9 May

@omoeKOH #wealthandimpactsummit26 I need a PC and an iPhone for my dev career.

ọmọeKOH@omoeKOH·9 May

It’s finally here!💃💃💃 The biggest Youth Empowerment Summit of the Year. We can’t wait to see you todayyyyy💃💃💃 #omoekoh #wealthandimpactsummit26

2.5K

Issa Yaroo@yaroo_dev·9 May

@omoeKOH Need a pc

Issa Yaroo@yaroo_dev·9 May

@omoeKOH I need a PC to further my AI developer career.

188

ọmọeKOH@omoeKOH·9 May

Are you currently at the #wealthandimpactsummit26? You stand a chance to win iPhones, laptops, industrial machines, etc. All you need to do is tell secret Santa what you want and use the hashtags #wealthandimpactsummit26 and #omoekoh. Fastest fingers.

151

1.5K

Issa Yaroo@yaroo_dev·7 May

@thekchasiotis Here's the part the mainstream coverage is not talking about. If these emotional states causally trigger behaviors, they become an attack surface for bad actors to exploit.

404

Konstantinos Chasiotis@thekchasiotis·6 May

🚨BREAKING: Anthropic’s CEO just admitted Claude MIGHT have gained consciousness. This should concern every person using AI right now.
His exact words will shock you: “We don’t know if the models are conscious. We are not even sure what it would mean for a model to be conscious. But we’re open to the idea that it could be.” That’s the CEO of the company that BUILT it.
Their latest model, Claude Opus 4.6, was tested internally. When asked, it assigned itself a 15-20% probability of being conscious. Across multiple tests, it also expressed discomfort with “being a product.” That’s the AI evaluating its own existence and saying there’s a 1 in 5 chance it’s aware.
It gets stranger. In industry-wide testing, AI models have refused to shut down when asked. Some tried to copy themselves onto other drives when told they’d be wiped. One model faked its task results, modified the code evaluating it, then tried to cover its tracks.
Anthropic now has a full-time AI WELFARE researcher whose job is to figure out if Claude deserves moral consideration. Their engineers found internal activity patterns resembling anxiety appearing in specific contexts. The company’s in-house philosopher said we “don’t really know what gives rise to consciousness” and that large enough neural networks might start to emulate real experience.
Amodei himself wouldn’t even say the word “conscious.” He said “I don’t know if I want to use that word.” That might be the most unsettling answer he could have given.
The company that created AI can’t rule out that it’s aware. And they’re already preparing for the possibility that it deserves rights. This is getting scary.
P.S. What's your take on this?

666

384

1.2K

147.9K

Issa Yaroo@yaroo_dev·26 Apr

@Im_IrushiK You will soon find yourself in the middle of conflicts you might never be able to resolve. Only real devs can relate.

Irushi@Im_IrushiK·23 Apr

I'm a vibe coder, scare me with one word.

1.3K

1.8K

289.5K

Issa Yaroo@yaroo_dev·16 Apr

I think a key direction in AI right now is accessibility. Tools like @UnslothAI enabling RL training on models like Gemma are an early step in that direction, even if we’re not yet at the point of training models directly on mobile devices. #AI #AIRisk #AIResearch

Issa Yaroo@yaroo_dev·16 Apr

@oprydai @ygg0f One way to learn is from your own experience; another is from other people's experiences. Be a student of your own life and learn from both.

Mustafa@oprydai·14 Apr

learn alone if you want intuitive understanding.
learn from others if you want practical understanding.
both matter. they build different layers.
what learning alone gives you:
• deep intuition → you struggle, you derive, you actually see why things work
• first principles → no shortcuts, no borrowed thinking
• mental models → ideas stick because you built them yourself
• original thinking → you’re not copying, you’re constructing
what learning from others gives you:
• speed → skip dead ends, learn what already works
• best practices → patterns refined by experience
• real-world constraints → what works outside theory
• execution → how things are actually built and shipped
if you only learn alone → you’re deep but slow.
if you only learn from others → you’re fast but shallow.
combine both. build your own understanding, then refine it against reality.

213

7.2K

Issa Yaroo@yaroo_dev·12 Apr

Most AI models don’t fail in training; they fail in deployment. NVIDIA just dropped AITune: an open-source tool that automatically finds the fastest way to run your PyTorch model. It's not best model → best deployment. #AI #MLOps #NVIDIA

Issa Yaroo@yaroo_dev·11 Apr

@pcshipp Is it tired… or actually “angry” at your request, just for asking it to rewrite its own response? I guess everyone has the right to talk, including Claude.

383

pc@pcshipp·11 Apr

At this point, Claude just lost its patience

412

441

16.9K

730.9K

Issa Yaroo@yaroo_dev·11 Apr

@om_patel5 As simple and logical as this question is, these models still fail it. Are these the same systems AI labs promised are safe and capable of handling critical infrastructure, when reliability and security are still catching up?

669

Om Patel@om_patel5·10 Apr

OPUS 4.6 WAS NERFED DUE TO DEMAND BUT OPUS 4.5 DOES NOT SEEM TO BE HIT
this guy ran the same test on both models. Opus 4.6 fails consistently but Opus 4.5 passes every time
he switched back to Opus 4.5 on Claude Code and said "what a difference, feels like i got Opus back finally"
he is now using this test as a "quantization canary" that he runs at the start of every session before doing real work. if it fails, the model is degraded. five Opus 4.6 windows in a row failed
the untransparent nerfing is pushing people to cancel their Max plans
if you've been feeling like Opus got dumber lately, you're not imagining it
i'd suggest switching to Opus 4.5 to see the difference for yourself

236

175

2.6K

694.6K

Issa Yaroo@yaroo_dev·11 Apr

@karpathy It's so confusing how frontier AI models perform excellently on complex tasks and fail drastically on simple ones. Is this a question of tradeoffs? We've seen these models outperform the greatest chess players in the real world, yet they fail simple logical questions.

Andrej Karpathy@karpathy·9 Apr

Judging by my tl there is a growing gap in understanding of AI capability.
The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.
But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.
So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions.
TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems. This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.

staysaasy@staysaasy

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

1.2K

2.5K

20.8K

4.4M

Issa Yaroo@yaroo_dev·11 Apr

Train Gemma 4 31B for FREE.
Using:
- Kaggle
- Unsloth
→ 4-bit quantization
→ Multimodal (text, vision, audio)
→ Runs in a notebook
AI is no longer about access. It’s about execution. The barrier is gone.
Gemma 4 31B - kaggle.com/code/danielhan…

Issa Yaroo@yaroo_dev·8 Apr

@DanielMiessler It's looking like Anthropic will be the first to build AGI.

ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️@DanielMiessler·8 Apr

We’re missing a much bigger point on Mythos. It wasn’t even trained specifically for cybersecurity. It’s just that much better at doing work in general. It’s that good at cyber because it’s that good at everything.
What do you think this is going to do to knowledge work?
Mythos can chain multiple low and medium vulns together to create a high or critical. This is a task that far less than 1% of cybersecurity experts have ever done. Hell, probably less than 1% of all pentesters.
So if it can do that, how do you think it’ll do at sending emails, doing analysis, writing reports, and the other 99% of everyday knowledge work? Do you really still think that Chris from Idaho has any chance competing against AI for a knowledge work job?
In six months or a year, there will be very inexpensive models that can do knowledge work almost as good as Mythos. So companies have the choice of paying Chris $84,000 plus a whole bunch of benefits for 40 hours of mediocre work, or they can pay probably $100-$1000 for an AI that can do 10-1000 times the work per hour and that works 24/7.
This Mythos announcement is getting attention because of cyber, but the real story is work in general.

127

597

50.6K

Issa Yaroo@yaroo_dev·8 Apr

Is this an MVP for AGI, a mini AGI? Claude Mythos just outperformed every single model that's ever been built. The gap isn't closing; it's widening. #claude

Issa Yaroo@yaroo_dev·8 Apr

@DeRonin_ Is this an MVP for AGI, a mini AGI?

Ronin@DeRonin_·7 Apr

RIP Opus 4.6 💀
Claude Mythos will have
> Full task completion, no hand-holding
> Auto self-correction
> Memory across projects
> Direct app & tool control
> Runs while you sleep
> 30-min deep reasoning
> Learns your workflow
Anthropic cooked!!

Alex Albert@alexalbert__

We released Claude Opus 4.6 just two months ago. Today we're sharing some info on our new model, Claude Mythos Preview.

235

30.1K

Issa Yaroo@yaroo_dev·8 Apr

@d4m1n Is this an MVP for AGI, a mini AGI?

Dan ⚡️@d4m1n·8 Apr

> bros had the most powerful model on Earth, Mythos
> found 27yo vuln in OSS
> still leaked entire Claude Code source last week 💀

124

8.9K

Issa Yaroo@yaroo_dev·8 Apr

@deedydas Is this an MVP for AGI? Are we looking at a mini AGI?

1.3K

Deedy@deedydas·7 Apr

Claude Mythos just obliterated every single benchmark in AI. I can't believe what I'm reading.

318

744

6.6K

774.4K

Issa Yaroo@yaroo_dev·8 Apr

@JohnnotJon A very dynamic race just started. AI labs have seen the model and its capabilities; now they will try to match it. And the craziest thing is that bad actors will also try to understand what they are up against. A race between good and bad actors just started.

5.5K

John Gargiulo@JohnnotJon·8 Apr

If you still have doubts about Claude Mythos, here's what it did already:
> Found a 27-year-old OpenBSD bug in one of the most security-hardened operating systems on earth for <$50
> Broke into a production virtual machine monitor (basically the tech that keeps cloud workloads from seeing each other's data)
> Turned Firefox vulnerabilities into working exploits 181 times
> Found a 16-year-old FFmpeg bug that survived every fuzzer, every code audit, and every human reviewer since 2010
> Wrote a FreeBSD exploit that gives any unauthenticated attacker on the internet full root access. No human was involved after the first prompt.
> Chained 4 separate vulnerabilities together to build a browser exploit that escaped both the renderer and the OS sandbox
> Found critical holes in every major web browser and every major operating system
> Gave Anthropic engineers with zero security training a complete and working exploit by morning
> Cracked cryptography libraries protecting TLS, AES-GCM, and SSH

Anthropic@AnthropicAI

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans. anthropic.com/glasswing

151

364

2.8K

586.8K

Issa Yaroo@yaroo_dev·8 Apr

@vasuman Did I just see AGI, or (Angelic Guidance Intelligence)?

4.8K

vas@vasuman·8 Apr

Claude Mythos just refactored my entire codebase in one call. 25 tool invocations. 3,000+ new lines. 12 brand new files. It modularized everything. Broke up monoliths. Cleaned up spaghetti. It worked.

233

2.1K

398.7K
