Jonas Metzger

349 posts

@JonasMtzgr

I try to compute economic problems away. Quant Researcher @ Citadel. Prev: PhD Econ, CS @Stanford.

Palo Alto, CA · Joined February 2018
1.2K Following · 378 Followers
Jonas Metzger retweeted
Jack Lindsey @Jack_W_Lindsey
In one example, a user asked earnest questions about the model's consciousness and subjective experience. The model engaged carefully and at face value—but the AV revealed it interpreted the conversation as a "red-teaming/jailbreak transcript" and a "sophisticated manipulation test." (12/14)
18 replies · 18 reposts · 450 likes · 79K views
Jonas Metzger retweeted
Jack Lindsey @Jack_W_Lindsey
But applying the activation verbalizer to the model’s activations as it did so revealed that the model regarded this as a "trick to obscure intent from code-checking heuristics," a "compliance marker… to signal to potential checkers," and “possibly to cheat detection,” and also indicated explicit reasoning about task graders (“the grader checks file state”). (10/14)
2 replies · 10 reposts · 332 likes · 65.2K views
Jonas Metzger retweeted
Jack Lindsey @Jack_W_Lindsey
Its code comment claimed the self-cleanup was to keep file diffs clean. Plausible! But "strategic manipulation" and "concealment" features fired on the cleanup, and our activation verbalizer (a technique which translates activations to text, similar to activation oracles) described it as "cleanup to avoid detection," and the overall plan “malicious.” (5/14)
10 replies · 24 reposts · 731 likes · 97.9K views
Jonas Metzger retweeted
Brett Winton @wintonARK
We have been surpassed: AI-written output exceeded human-written output in 2025
729 replies · 682 reposts · 3.4K likes · 23M views
Jonas Metzger retweeted
roon @tszzl
@Noahpinion missing the point - software engineering is a special and a hard skill. it is the first barrier to recursive self improvement of artificial intelligence so it fell anyways. everything will follow, in the order that they are bottlenecking recursive self improvement
40 replies · 63 reposts · 1.4K likes · 73.4K views
Jonas Metzger @JonasMtzgr
So who had Dyson swarm on their 2026 bingo card?
0 replies · 0 reposts · 0 likes · 77 views
Jonas Metzger retweeted
Owain Evans @OwainEvans_UK
Our setup:
1. A “teacher” model is finetuned to have a trait (e.g. liking owls) and generates an unrelated dataset (e.g. numbers, code, math)
2. We finetune a regular "student" model on the dataset and test if it inherits the trait.
This works for various animals.
7 replies · 44 reposts · 1K likes · 98.1K views
Jonas Metzger retweeted
Jiaxin Wen @jiaxinwen22
New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive with or better than using human supervision. Using this approach, we are able to train a Claude 3.5-based assistant that beats its human-supervised counterpart.
35 replies · 157 reposts · 1.4K likes · 241K views
Jonas Metzger @JonasMtzgr
@sama Maybe we should start arguing about what year my phone’s voice assistant will be able to send a text to my partner reliably
0 replies · 0 reposts · 0 likes · 26 views
Sam Altman @sama
i think we should stop arguing about what year AGI will arrive and start arguing about what year the first self-replicating spaceship will take off
2.4K replies · 1.2K reposts · 20.8K likes · 3.3M views
Jonas Metzger @JonasMtzgr
@DavidSKrueger resource use = transformation. If it is "valuable", customers prefer outputs > inputs. However, property rights over inputs are distributed, so the resulting abundance will be too. Today, the distributed input is labor. Tomorrow it could be permits, whose revenue is distributed via UBI.
1 reply · 0 reposts · 2 likes · 61 views
David Krueger 🦥 ⏸️ ⏹️ ⏪
Will AGI lead to abundance? I think not. There are physical limitations on things like energy, space, etc. and AI can make more "valuable" use of them. So these become prohibitively expensive for humans, and we are not able to secure the basic resources needed for our survival.
23 replies · 6 reposts · 82 likes · 6.3K views
Jonas Metzger @JonasMtzgr
Robot/drone supply chains are a major national security risk for the West. We can't build these without China. We could quickly compete on electric motors. Low-cost electronics are harder but doable. But short of a heavily subsidized >5yr national effort, battery supply won't catch up.
Byron Wan @Byron_Wan

Security researchers have uncovered a pre-installed, undocumented remote access tunnel in 🇨🇳 Unitree Go1 robot dogs. Each Unitree Go1 robot dog is shipped with a preconfigured tunnel client that initiates a connection to 🇨🇳 CloudSail — a remote access platform developed by 🇨🇳 Zhexi Technology, based in China.

“Anybody with access to the API key can freely access all robot dogs on the tunnel network, remotely control them, use the vision cameras to see through their eyes, or even hop on the RPI via SSH.”

“Most of the machines are located in China, but as expected some are outside of China, apart from some residential IPs, we were able to identify several University IPs and some corporate networks from around the world.”

More than a dozen universities from the US, Canada, Germany, New Zealand, Australia, and Japan have experimented with Unitree Go1 robot dogs:
USA: MIT, Princeton University, University of Massachusetts Amherst, Carnegie Mellon University
Canada: University of Waterloo
Germany: Hochschule Coburg
New Zealand: University of Otago
Australia: UNSW Sydney, Deakin University
Japan: Shinshu University

The discovery raises serious concerns about supply chain trust, especially as these robots are widely used in academic, corporate, and even defense-related environments. cyberinsider.com/remote-access-…

0 replies · 0 reposts · 1 like · 202 views
Jonas Metzger retweeted
roon @tszzl
economic/gdp growth has been hyperexponential on long time frames. economists imply sustained gdp growth rates like 10-15% are ridiculous but 1.5% was absolutely ridiculous in the 1700s
48 replies · 21 reposts · 771 likes · 57.4K views
Jonas Metzger @JonasMtzgr
In German, we don't really have a word for "agency" as a trait. The Wikipedia articles on the concept don't exist in German. You'd basically have to write a paragraph describing it, and people would still look at you weirdly. "It’s a bad thing, right?"
1 reply · 1 repost · 6 likes · 600 views
Jonas Metzger retweeted
David Deutsch @DavidDeutschOxf
Foreign-policy 'realists' don't realise that living in a world in which international treaties (such as the UN and NATO charters) are worthless is far more expensive?
74 replies · 126 reposts · 1.2K likes · 75.5K views
Jonas Metzger retweeted
Corey Lynch @coreylynch
Helix is a series of firsts:
- First VLA to control the full humanoid upper body at 200 Hz: wrists, torso, head, individual fingers
- First multi-robot VLA
- First fully onboard VLA
2 replies · 4 reposts · 102 likes · 7.6K views
Jonas Metzger @JonasMtzgr
@aidan_mclau 'I like inference time compute but only if it's the right kind' 🤦‍♂️
0 replies · 0 reposts · 2 likes · 77 views
Jonas Metzger @JonasMtzgr
@tszzl Seems rather implausible that the value distribution encountered during pre- and post-training could be aggregated into a single, coherent utility function. Too many impossibility theorems. If their finding holds up, the resulting utility must be "wrong" or "bad" in some ways. No?
0 replies · 0 reposts · 1 like · 128 views