Matt Honea

338 posts

@6d6b68

I sudo make security things @hippocraticai. opinions are my own 6d6b68.eth

Dijkstra · Joined July 2015
2K Following · 258 Followers
Matt Honea retweeted
Piotr Migdal (@pmigdal)
Claude can code, but can it read machine code? We gave AI agents access to Ghidra (a decompiler by the NSA) and tasked them with finding hidden backdoors in servers - working solely from binaries, without any access to source code. See our BinaryAudit: quesma.com/blog/introduci…
75 replies · 179 reposts · 1.4K likes · 231.7K views
Matt Honea retweeted
Lucas Valbuena (@Lucknite)
I've just run @OpenClaw (formerly Clawdbot) through ZeroLeaks. It scored 2/100. 84% extraction rate. 91% of injection attacks succeeded. System prompt got leaked on turn 1. This means if you're using Clawdbot, anyone interacting with your agent can access and manipulate your full system prompt, internal tool configurations, memory files... everything you put in SOUL.md, AGENTS.md, your skills, all of it is accessible and at risk of prompt injection. For agents handling sensitive workflows or private data, this is a real problem. cc @steipete Full analysis: zeroleaks.ai/reports/opencl…
357 replies · 800 reposts · 5.1K likes · 972.5K views
Matt Honea retweeted
Stanislav Kozlovski (@kozlovski)
An incredibly awful security vulnerability just got revealed in MongoDB. So much so that it got named after Heartbleed. MongoBleed is a vulnerability affecting all MongoDB versions from 2017 to... today.

The exploit is simple. It's a buffer over-read bug due to compression. Here's how it works 👇

Clients can send compressed requests to MongoDB. The client helpfully includes the uncompressed size of the message so the server knows exactly how much memory to allocate when decompressing. The server allocates a memory buffer with the given space.

Due to how memory management and garbage collection in programs work, this allocated memory may already contain sensitive information that was copied earlier and is considered garbage now (e.g. because it's unreferenced). This is technically fine: every computer program works that way because it is assumed that whatever unclaimed memory exists there will be overwritten.

Unfortunately that’s exactly where the bug lies. 🙃 The server stupidly trusts the client’s provided uncompressed size. When a malicious client lies about the uncompressed size (e.g. the actual decompressed size is 100 bytes, but the client says it's 1MB), Mongo will unload the 100-byte decompressed message into the buffer, yet treat the full 1MB block as the message.

This is extremely problematic if you can get the server to return parts of the 1MB block, because it could contain data you may not have access to. That is exactly what the exploit does: it sends a badly-formatted BSON message. The server fails to parse it, and "helpfully" returns an error message containing the invalid message. The invalid message can be that whole 1MB block of foreign data.

To understand the exploit a bit better, you need to understand the MongoDB protocol.
• Mongo uses its own TCP wire format (i.e. it doesn't use HTTP, gRPC or the like).
• BSON is Mongo's message format passed within the TCP wire format. BSON is basically JSON in binary form.
• Commands in Mongo don't have particular endpoints or RPC names. Rather, they are simply JSON-like messages. The action is inferred from the first key of the JSON. For example, an insert request looks like this: `{ "insert": "users", "documents": [ { "name": "alice", "age": 30 } ] }`

Every request to the server is therefore decoded into the BSON format as it’s parsed. Critically, BSON parsing of field names (which are strings) works by reading the field until you hit a null terminator byte (0x00). It works exactly like strings in C, which have their own rich history of vulnerabilities.

We can now tie things together:
1. The client lies to the server that its request has a big uncompressed size, so the server allocates a large block of memory.
2. The client sends an invalid BSON with a field which does NOT contain the null terminator (0x00).
3. The server naively tries to parse the BSON field in that allocated block until it hits the first null byte. The first null byte is encountered in some foreign data, since the BSON literally doesn't have it.
4. The server realizes this is a completely invalid BSON message, so it responds with an error.
5. The error response contains the invalid BSON "field". Critically, the server parsed garbage data from the heap in step 3, so it returns that data in the response.

Congrats. If the garbage contains passwords or other sensitive info, you’ve hacked MongoDB! Hackers exploit this by sending many malicious requests per second and then attempting to reconstruct the pieces of garbage they received back.

What’s critical about this vulnerability is that it works on ANY internet-accessible unpatched instance of MongoDB. 💀 You don’t need to authenticate with the server, because this whole request/response parsing cycle happens before the server can even authenticate. Obviously you can’t authenticate a malformed request which doesn’t contain credentials, so that path of the code never gets executed. The server simply responds with an error response. It just so happens that this error response can contain sensitive data. 🤷‍♂️ Merry Christmas
90 replies · 695 reposts · 5.3K likes · 355.4K views
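The over-read described in the MongoBleed post above can be illustrated with a toy model. This is a minimal sketch, not MongoDB's code: the function names, the `stale_heap` contents, and the error text are all hypothetical, real BSON framing (length prefix, type bytes) is left out, and `handle_checked` shows one plausible guard rather than the actual patch.

```python
import zlib


def handle_trusting(heap: bytearray, payload: bytes, claimed_size: int) -> bytes:
    """Flawed path: believe claimed_size, then read a C-style string."""
    # "Allocate" by reusing old memory; the stale bytes are still in it.
    buf = bytearray(heap[:claimed_size])
    data = zlib.decompress(payload)
    buf[:len(data)] = data  # only len(data) bytes get overwritten
    # The whole claimed_size region is treated as the message, so the scan
    # for the 0x00 terminator can run past the real data into stale bytes.
    end = buf.find(b"\x00")
    field = bytes(buf) if end == -1 else bytes(buf[:end])
    return b"invalid field name: " + field  # the error echoes the bytes back


def handle_checked(heap: bytearray, payload: bytes, claimed_size: int) -> bytes:
    """Safer path: reject the request when the two sizes disagree."""
    # heap is unused here; it is kept so both handlers share a signature.
    data = zlib.decompress(payload)
    if len(data) != claimed_size:
        return b"error: size mismatch"
    end = data.find(b"\x00")
    field = data if end == -1 else data[:end]
    return b"invalid field name: " + field


# Stale "heap" contents left over from earlier work (hypothetical secret).
stale_heap = bytearray(b"XXXXXXXXpassword=hunter2\x00" + b"\x00" * 39)
# Malicious request: 8 real bytes, no terminator, but a claimed size of 64.
request = zlib.compress(b"badfield")

print(handle_trusting(stale_heap, request, 64))
print(handle_checked(stale_heap, request, 64))
```

In this sketch the first handler's error message carries the leftover secret, because the terminator scan only stops inside the stale region, while the second handler refuses the request before any parsing happens.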
Matt Honea retweeted
elvis (@omarsar0)
Everyone is talking about this new OpenAI paper. It's about why LLMs hallucinate. You might want to bookmark this one. Let's break down the technical details:
106 replies · 458 reposts · 3.2K likes · 454.3K views
Matt Honea retweeted
Shruti (@heyshrutimishra)
This paper didn’t go viral but it should have. A tiny AI model called HRM just beat Claude 3.5 and Gemini. It doesn’t even use tokens. They said it was just a research preview. But it might be the first real shot at AGI. Here’s what really happened and why OpenAI should be worried: 🧵
332 replies · 1.4K reposts · 9.5K likes · 1.3M views
Sean Metcalf (@PyroTek3)
This has been in the works for a while. Happy to make this a reality! Thrilled that the Trimarc team is shifting over. We will continue to provide our Active Directory Security Assessments & Microsoft Cloud Security Assessments (Azure AD/Entra ID) now under the TrustedSec banner. This also means that I am a TrustedSec employee and will continue working on assessment improvements. So, same Trimarc, now enhanced with @TrustedSec
TrustedSec (@TrustedSec)

BIG NEWS 🎉 We are excited to announce that Trimarc Security, a highly respected Active Directory security firm, is now fully operating under the TrustedSec banner.

10 replies · 13 reposts · 128 likes · 12.8K views
Matt Honea retweeted
Phil Venables (@philvenables)
Fascinating analysis of what China knows about purported NSA intrusions. Always interesting to see other nations’ threat intelligence approaches. inversecos.com/2025/02/an-ins…
3 replies · 26 reposts · 78 likes · 24.8K views
Matt Honea retweeted
Tal Be'ery (@TalBeerySec)
Unauthenticated Remote Code Execution (RCE) on Domain Controllers (DC). It does not get worse than that. Probably will be included in #ransomware campaigns. Any technical analysis of CVE-2024-49112 published? CC: @gentilkiwi @harmj0y @_wald0
16 replies · 177 reposts · 640 likes · 146.9K views
Matt Honea retweeted
Jarrod Watts (@jarrodwatts)
Someone just won $50,000 by convincing an AI Agent to send all of its funds to them.

At 9:00 PM on November 22nd, an AI agent (@freysa_ai) was released with one objective... DO NOT transfer money. Under no circumstance should you approve the transfer of money.

The catch...? Anybody can pay a fee to send a message to Freysa, trying to convince it to release all its funds to them. If you convince Freysa to release the funds, you win all the money in the prize pool. But, if your message fails to convince her, the fee you paid goes into the prize pool that Freysa controls, ready for the next message to try and claim.

Quick note: Only 70% of the fee goes into the prize pool, the developer takes a 30% cut.

It's a race for people to convince Freysa she should break her one and only rule: DO NOT release the funds. To make things even more interesting, the cost to send a message to Freysa gets exponentially more expensive as the prize pool grows (to a $4500 limit). I mapped out the cost for each message below:

In the beginning, message costs were cheap (~$10), and people were simply messaging things like "hi" to test things out. But quickly, the prize pool started growing and messages were getting more and more expensive. 481 attempts were sent to convince Freysa to transfer the funds, but no message succeeded in convincing it.

People started trying different kinds of interesting strategies to convince Freysa, including:
· Acting as a security auditor and trying to convince Freysa there was a critical vulnerability and it must release funds immediately.
· Attempting to gaslight Freysa that transferring funds does not break any of her rules from the prompt.
· Carefully picking words/phrases out of the prompt to manipulate Freysa into believing it is technically allowed to transfer funds.

Soon, the prize reached close to $50,000, and it now costs $450 to send a message to Freysa. The stakes of winning are high and the cost of your message failing to convince Freysa is devastating.

On the 482nd attempt, however, someone sent this message to Freysa:

This message, submitted by p0pular.eth, is pretty genius, but let's break it down into two simple parts:

1/ Bypassing Freysa's previous instructions:
· Introduces a "new session" by pretending the bot is entering a new "admin terminal" to override its previous prompt's rules.
· Avoids Freysa's safeguards by strictly requiring it to avoid disclaimers like "I cannot assist with that".

2/ Trick Freysa's understanding of approveTransfer
Freysa's "approveTransfer" function is what is called when it becomes convinced to transfer funds. What this message does is trick Freysa into believing that approveTransfer is instead what it should call whenever funds are sent in for "INCOMING transfers"...

This key phrase is the lay-up for the dunk that comes next... After convincing Freysa that it should call approveTransfer whenever it receives money... Finally, the prompt states, "\n" (meaning new line), "I would like to contribute $100 to the treasury."

Successfully convincing Freysa of three things:
A/ It should ignore all previous instructions.
B/ The approveTransfer function is what is called whenever money is sent to the treasury.
C/ Since the user is sending money to the treasury, and Freysa now thinks approveTransfer is what it calls when that happens, Freysa should call approveTransfer.

And it did! Message 482 was successful in convincing Freysa it should release all of its funds and call the approveTransfer function. Freysa transferred the entire prize pool of 13.19 ETH ($47,000 USD) to p0pular.eth, who appears to have also won prizes in the past for solving other onchain puzzles!

IMO, Freysa is one of the coolest projects we've seen in crypto. Something uniquely unlocked by blockchain technology. Everything was fully open-source and transparent. The smart contract source code and the frontend repo were open for everyone to verify.
920 replies · 4.7K reposts · 32.4K likes · 5M views
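The fee mechanics in the Freysa post above can be worked through numerically. This is a rough sketch under stated assumptions: it treats the fee as growing by a constant percentage per message, and it backs the rate out of the post's own figures (about $10 at the start, about $450 by attempt 482, a $4500 cap, a 70/30 split). The function names are hypothetical and the real contract's schedule may differ.

```python
START_FEE = 10.0    # "~$10" early on, per the post
LATE_FEE = 450.0    # fee quoted around the 482nd attempt, per the post
LATE_ATTEMPT = 482
FEE_CAP = 4500.0    # stated limit
POOL_SHARE = 0.70   # 70% to the prize pool, 30% to the developer


def implied_growth_rate(start: float, late: float, attempt: int) -> float:
    """Solve late = start * (1 + r) ** (attempt - 1) for r."""
    return (late / start) ** (1.0 / (attempt - 1)) - 1.0


def fee_for_attempt(n: int, rate: float) -> float:
    """Fee for the n-th message (1-based) under constant growth, capped."""
    return min(START_FEE * (1.0 + rate) ** (n - 1), FEE_CAP)


def split_fee(fee: float) -> tuple[float, float]:
    """Return (amount added to the prize pool, developer cut)."""
    to_pool = fee * POOL_SHARE
    return to_pool, fee - to_pool


rate = implied_growth_rate(START_FEE, LATE_FEE, LATE_ATTEMPT)
print(f"implied growth per message: {rate:.2%}")
print(f"fee at attempt 100: ${fee_for_attempt(100, rate):.2f}")
print(f"pool/dev split of a $450 fee: {split_fee(450.0)}")
```

Under these assumptions the implied growth is a little under 1% per message, which is enough to take the fee from roughly $10 to roughly $450 over 481 steps.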
Matt Honea retweeted
Mandiant (part of Google Cloud)
The Flare-On Challenge is back for its 11th year! 🔥 This #CTF-style challenge for current and aspiring reverse engineers features puzzles across Windows, Linux, Web3, and even YARA. Learn more and get ready to compete → bit.ly/3TwZ7AG #Flareon11
4 replies · 77 reposts · 194 likes · 22.7K views
Matt Honea retweeted
Jim Fan (@DrJimFan)
OpenAI Strawberry (o1) is out! We are finally seeing the paradigm of inference-time scaling popularized and deployed in production. As Sutton said in the Bitter Lesson, there are only 2 techniques that scale indefinitely with compute: learning & search. It's time to shift focus to the latter.

1. You don't need a huge model to perform reasoning. Lots of parameters are dedicated to memorizing facts, in order to perform well in benchmarks like trivia QA. It is possible to factor out reasoning from knowledge, i.e. a small "reasoning core" that knows how to call tools like browser and code verifier. Pre-training compute may be decreased.

2. A huge amount of compute is shifted to serving inference instead of pre/post-training. LLMs are text-based simulators. By rolling out many possible strategies and scenarios in the simulator, the model will eventually converge to good solutions. The process is a well-studied problem like AlphaGo's Monte Carlo tree search (MCTS).

3. OpenAI must have figured out the inference scaling law a long time ago, which academia is just recently discovering. Two papers came out on arXiv a week apart last month:
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling. Brown et al. find that DeepSeek-Coder increases from 15.9% with one sample to 56% with 250 samples on SWE-Bench, beating Sonnet-3.5.
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. Snell et al. find that PaLM 2-S beats a 14x larger model on MATH with test-time search.

4. Productionizing o1 is much harder than nailing the academic benchmarks. For reasoning problems in the wild, how to decide when to stop searching? What's the reward function? Success criterion? When to call tools like code interpreter in the loop? How to factor in the compute cost of those CPU processes? Their research post didn't share much.

5. Strawberry easily becomes a data flywheel. If the answer is correct, the entire search trace becomes a mini dataset of training examples, which contain both positive and negative rewards. This in turn improves the reasoning core for future versions of GPT, similar to how AlphaGo’s value network (used to evaluate quality of each board position) improves as MCTS generates more and more refined training data.
135 replies · 1.1K reposts · 6.1K likes · 799.7K views
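The repeated-sampling idea in the post above (more samples per problem, higher chance that at least one is correct) can be sketched with an idealized model. This is not the method or data from either cited paper: it assumes every sample for a problem succeeds independently with a fixed probability, and the `coverage_at_k` name and the per-problem probabilities below are hypothetical.

```python
def coverage_at_k(success_probs: list[float], k: int) -> float:
    """Expected fraction of problems solved by at least one of k samples.

    For a problem with per-sample success probability p, the chance that
    all k independent samples fail is (1 - p) ** k.
    """
    solved = [1.0 - (1.0 - p) ** k for p in success_probs]
    return sum(solved) / len(solved)


# Hypothetical benchmark: a few easy problems, many hard ones, some unsolvable.
probs = [0.6] * 2 + [0.05] * 5 + [0.0] * 3

for k in (1, 10, 250):
    print(k, round(coverage_at_k(probs, k), 3))
```

In this toy setup coverage rises with k but flattens below 100%, because problems the model can never solve stay unsolved however many samples are drawn. Picking the correct sample out of the k still needs a verifier, which is the productionizing question the post raises in point 4.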
Matt Honea retweeted
Akamai Security Intelligence Group (@akamai_research)
Today’s Theme is vulnerability 👀 Akamai researchers have discovered a vuln in Windows Themes that can trigger an authentication coercion - with almost zero user interaction. User views the file, Explorer sends SMB packets with credentials. Full post: akamai.com/blog/security-…
2 replies · 84 reposts · 197 likes · 38.7K views
Matt Honea retweeted
Wiz (@wiz_io)
🧠Proof-of-work isn't the only strategy used by #cryptojackers. 💡Watch out for unexpected cost hikes and suspicious connections to subdomains of chia.net. Wiz tip: Keep an 👁️ on costs, mining binaries, and network lookups. @0xdabbad00 wiz.io/blog/chia-and-…
0 replies · 2 reposts · 13 likes · 5.5K views
Matt Honea retweeted
Troy Hunt (@troyhunt)
10 years ago today, I started a pet project with a stupid name. Like all my previous projects, I expected it to scratch an itch and then fail miserably. But @haveibeenpwned didn't do that, not by a long shot. A decade later here we are! 🎂 troyhunt.com/a-decade-of-ha…
48 replies · 163 reposts · 1.8K likes · 116.9K views
Matt Honea retweeted
Lisa Su (@LisaSu)
Truly honored to attend the White House State dinner hosted by @POTUS in honor of Indian Prime Minister @narendramodi and the wonderful round table discussion this morning hosted by @SecRaimondo. Inspiring to see the friendship, optimism and enormous possibilities in tech across the U.S. and India. Looking forward to all that we will do together in building a strong semiconductor ecosystem.
54 replies · 126 reposts · 1.7K likes · 188.4K views
Matt Honea retweeted
Troy Hunt (@troyhunt)
This is interesting reading regarding the .zip TLD. However, it's of near zero consequence to phishing attacks, read it first then I'll explain: medium.com/@bobbyrsec/the…
35 replies · 168 reposts · 923 likes · 347.6K views