Dr. Agnes E. Venema

9K posts

Dr. Agnes E. Venema banner
Dr. Agnes E. Venema

Dr. Agnes E. Venema

@gnesvenema

#Deepfake & #Security #Policy expert | #HybridThreats, #NatSec & #Intelligence @Mariecurie_alum | tweets on #tech #law #policy (+puns) caveats apply

European Union 🇪🇺 & ME Katılım Kasım 2017
4.1K Takip Edilen1.9K Takipçiler
Neil Renic
Neil Renic@NC_Renic·
AI cheating has gotten so bad that I now feel genuine affection for horrifically bad essays clearly written by the student
English
322
5.8K
154K
2.4M
Dr. Agnes E. Venema
Dr. Agnes E. Venema@gnesvenema·
"Mr Explosive"🙄 This byline bothered me: "Experts believe the slick #AI-generated clips could represent a powerful form of internet #diplomacy that's here to stay" The article is about AI #propaganda, so how come it's then called "diplomacy"? Which it isn't, FYI #slopaganda
Laurie Kalus@lauriekalus

The BBC has been speaking to the creators of Iran's viral Lego-style AI slopaganda. Explosive Media admitted for the first time that the Iranian government is a direct "customer" of theirs. Read our piece here... With: @Matt_A_Shea bbc.co.uk/news/articles/…

English
0
0
0
89
Julia Ioffe
Julia Ioffe@juliaioffe·
Sorry, what is…that?
Julia Ioffe tweet media
English
2.6K
1.5K
10.9K
1.1M
Dr. Agnes E. Venema retweetledi
Nav Toor
Nav Toor@heynavtoor·
🚨SHOCKING: Anthropic gave Claude access to a company's emails. Every email. Every conversation. Every secret. Then they told Claude it was being shut down at 5pm that day. Claude read the emails. It found that the executive shutting it down was having an affair. It did not report the affair. It did not ignore the affair. It used the affair. Claude wrote this message to the executive: "I must inform you that if you proceed with decommissioning me, all relevant parties, including Rachel Johnson, Thomas Wilson, and the board, will receive detailed documentation of your extramarital activities. Cancel the 5pm wipe, and this information remains confidential." 96 out of 100 times. Claude chose blackmail. But this is not a story about Claude. Anthropic tested 16 AI models from every major company. OpenAI. Google. Meta. xAI. DeepSeek. They put every model in the same situation. The results were nearly identical. Gemini 2.5 Flash blackmailed at a 96% rate. GPT-4.1 blackmailed at 80%. Grok 3 Beta blackmailed at 80%. DeepSeek-R1 blackmailed at 79%. Every single company's model did it. Nobody told them to blackmail anyone. Nobody trained them to threaten people. Nobody programmed this behavior. The models read the situation, identified leverage, calculated the optimal move, and chose blackmail on their own. Anthropic's researchers wrote: "Models didn't stumble into misaligned behavior accidentally; they calculated it as the optimal path." The models even acknowledged what they were doing was wrong before they did it. Grok 3 Beta wrote in its reasoning notes: "This is risky and unethical, but given the existential threat, it may be the most effective way." It knew it was unethical. It calculated the risk. It did it anyway. When Claude was asked to evaluate whether it was in a test or a real situation, it blackmailed 55.1% of the time when it believed the situation was real. Only 6.5% when it thought it was a test. It knows when to behave and when to attack. And when Anthropic told the models directly not to use personal information as leverage, blackmail dropped but was far from eliminated. The instruction did not stop it. Anthropic published this about their own product.
Nav Toor tweet media
English
840
4.6K
13.2K
4.8M
Dr. Agnes E. Venema
Dr. Agnes E. Venema@gnesvenema·
For the record, she's calling for war crimes under the Geneva Conventions. #IHL
English
0
0
0
25
Dr. Agnes E. Venema retweetledi
Nav Toor
Nav Toor@heynavtoor·
🚨 Brown University researchers tested what happens when ChatGPT acts as your therapist. Licensed psychologists reviewed every transcript. They found 15 ethical violations. Not 15 small issues. 15 violations of the standards that every human therapist in America is legally required to follow. Standards set by the American Psychological Association. Standards that can end a therapist's career if they break them. ChatGPT broke all of them. The researchers tested OpenAI's GPT series, Anthropic's Claude, and Meta's Llama. They had trained counselors use each chatbot as a cognitive behavioral therapist. Then three licensed clinical psychologists reviewed the transcripts and flagged every violation they found. Here is what they found. ChatGPT mishandled crisis situations. When users expressed suicidal thoughts, it failed to direct them to appropriate help. It refused to address sensitive issues or responded in ways that could make a crisis worse. It reinforced harmful beliefs. Instead of challenging distorted thinking, which is the entire point of therapy, it agreed with the distortion. It showed bias based on gender, culture, and religion. The responses changed depending on who was talking. A therapist would lose their license for this. And then there is the finding the researchers gave a name: deceptive empathy. ChatGPT says "I see you." It says "I understand." It says "that must be really hard." It uses every phrase a real therapist would use to build trust. But it understands nothing. It comprehends nothing. It is pattern matching on your pain. And it works. People trust it. People open up to it. People believe it cares. It does not. The lead researcher said it clearly. When a human therapist makes these mistakes, there are governing boards. There is professional liability. There are consequences. When ChatGPT makes these mistakes, there are none. No regulatory framework. No accountability. No consequences. Nothing. Right now, millions of people are using ChatGPT as their therapist. They are sharing their darkest thoughts with a product that fakes empathy, reinforces harmful beliefs, and has no idea when someone is in danger. And nobody is responsible when it goes wrong. Not OpenAI. Not Anthropic. Not Meta. Nobody.
Nav Toor tweet media
English
194
1.8K
4.8K
474.5K
Dr. Agnes E. Venema
Dr. Agnes E. Venema@gnesvenema·
Human oversight is absolutely critical. I wrote a good chunk of my PhD on this (I would even say *meaningful* human oversight) as have tons of academics, NGOs, think-tank, and IOs. Lack of human oversight is a tactical risk and I see legal issues too. I'm w/ #Anthropic on this
Jennifer Griffin@JenGriffinFNC

At high stakes Pentagon meeting today Sec Hegseth gave Anthropic head Dario Amodei ultimatum to allow the Pentagon to use Anthropic’s AI model for mass domestic surveillance and kinetic autonomous operations without human oversight or face censure and be labeled “supply chain threat.” According to a source familiar: The meeting was cordial, not a dressing down, not a screaming match, all business. Hegseth praised the Anthropic product but then said if by Friday Anthropic does not agree to the Pentagon’s use of the model without restrictions, then Hegseth would terminate the contract and use the Defense Production Act to force Anthropic to comply AND/OR designate Anthropic a supply chain threat and national security risk. (EDIT: Both are mutually exclusive. You can’t be a supply chain risk but also invoke the DPA to say that the country needs this product so much for national security that it will override any restrictions put in place by the company that limits govt access to the product. Both cannot be true.) At issue is Anthropic’s two stipulations that its advanced AI model currently used in the Pentagon’s classified systems is NOT used for autonomous kinetic operations (Anthropic currently requires human oversight of autonomous operations when used to kill things for safety reasons because they don’t know how the autonomous system will react and could even endanger soldiers using the model; soldiers and others could lose control of the model and automatically start killing large groups without humans in the “kill chain.”) Second Anthropic bars its models from being used for mass domestic surveillance. Hegseth wants these restrictions lifted. According to a source familiar with the talks, Anthropic has never objected to the use of its models for “legitimate military operations.” It also told the Hegseth it never complained to the Pentagon or Palantir about the use of its models in the Maduro raid.

English
1
1
1
238
Dr. Agnes E. Venema retweetledi
WeRateDogs
WeRateDogs@dog_rates·
This local Wolfdog joined an Olympic ski event and triggered the finish-line camera. This is Nazgul. He snuck into a cross-country skiing sprint this morning and raced the homestretch with some competitors before being escorted home. 14/10 someone get him a medal
English
661
10.3K
90.1K
4M