juliette pluto 🌌

2.4K posts

@foundjuliette

hacker, machine whisperer, typo-generator. Adversarial robustness @GoogleDeepMind. views mine.

Brooklyn, New York · Joined June 2014
713 Following · 5.5K Followers
Neel Nanda @NeelNanda5
When Anthropic released a complex 30K word doc and said Claude was trained to follow it, I was pretty sceptical. Turns out it kinda works! We red teamed Claude's constitution following, and it's gotten much better! Positive update for the ability to align models in nuanced ways
arya @AJakkli

There's been a lot of buzz around Claude's 30K word constitution ("soul doc") and unusual ways Anthropic is integrating it into training. If we can robustly train complex values into a model, that's a big deal for safety. But does it actually work? Yes, surprisingly well!

18 replies · 47 reposts · 735 likes · 52.1K views
Rohan Varma @rohanvarma
We just launched Codex Security! Probably a no-brainer for most teams to turn on. Some things I'm excited about:
- Agentic security review leveraging our SOTA models
- Always-on codebase scanning
- Detailed reports with code paths on vulnerabilities
- Auto-fix any report with a PR
Teams and enterprises can try it out through Codex web.
210 replies · 162 reposts · 1K likes · 288.9K views
Kenton Varda @KentonVarda
I used Opus to write some security-sensitive code, then I reviewed it and found a few security bugs. As a test I asked Opus to review the code for security bugs. It found all the same bugs I found. Whelp.
74 replies · 17 reposts · 2.5K likes · 163K views
Noah Zweben @noahzweben
Announcing a new Claude Code feature: Remote Control. It's rolling out now to Max users in research preview. Try it with /remote-control Start local sessions from the terminal, then continue them from your phone. Take a walk, see the sun, walk your dog without losing your flow.
1.5K replies · 1.3K reposts · 16.9K likes · 4.5M views
Mahaoo @mahaoo_ASI
@willccbb unpopular opinion: training on LLM tokens you paid for should always be fair game, and model providers shouldn't take issue with other labs training on their data. If they want to discourage this, they should raise prices.
2 replies · 0 reposts · 27 likes · 2.6K views
will brown @willccbb
duuuude don't train on claude outputs, not cool. train on public github repos instead, which are fair game and definitely not claude outputs
36 replies · 13 reposts · 1.1K likes · 63.2K views
juliette pluto 🌌 @foundjuliette
@rmcentush Reminds me of that time I inadvertently stole all of my bf’s saved passwords, bc he let me log into his iPad to set up a HomePod. iCloud auto-sync did the deed.
0 replies · 0 reposts · 0 likes · 217 views
Ryan McEntush @rmcentush
apple is probably the only company i trust with basically my entire life — my ID, financial data, messages/contacts, health records, etc. is that rational? probably not. but it might matter enormously as these models get embodied in the physical world + reach broader consumer use. a steady flow of tokens, all the time. apple's core competency has always been designing computers that people trust. if everything becomes a computer and models continue to viciously compete, having that trust is probably a good place to be
aidan @aidanshandle

Says something about Apple's value that people are willing to make a one time $600 payment for their AI to be able to access the ecosystem

31 replies · 38 reposts · 1.3K likes · 98.6K views
Ethan Mollick @emollick
Everyone is starting to sound like AI, even in spoken language. Analysis of 280,000 transcripts of videos of talks & presentations from academic channels finds they increasingly used words that are favorites of ChatGPT. Model collapse, except for humans. arxiv.org/pdf/2409.01754…
167 replies · 504 reposts · 2.6K likes · 401.8K views
juliette pluto 🌌 @foundjuliette
@tszzl @emollick Contemporary AI markers of AI slop (e.g. “it’s not x, it’s y”) are a lot closer to what real users prefer. They’re genuinely useful rhetorical devices, albeit now a bit overused.
0 replies · 0 reposts · 1 like · 35 views
juliette pluto 🌌 @foundjuliette
@tszzl @emollick this I think is just down to improved RL rewards. “Delve” was essentially a form of reward hacking: a phrase that Nigerian RLHF raters rewarded (due to its use in Nigerian formal English), but that wasn’t actually liked by most users.
1 reply · 0 reposts · 4 likes · 222 views
rohan anil @_arohan_
@foundjuliette That’s an incredible achievement! Go ♊️!! I wonder if the reviewer is also using Gemini
1 reply · 0 reposts · 0 likes · 58 views
rohan anil @_arohan_
Everyone is talking as if their code is pristine. But do you remember the flame wars? In every good codebase, there are enforced rules, like style guides and review norms. Google has style guides with an incredible level of detail on how to write “good” code. To get the ability to approve code changes, you often had to go through a lengthy training/qualification process. I got my C++ one fairly easily after writing an Arena list (IIRC, Yonathan Zunger blessed me with the powers), but Python was very, very hard to get.

Once ML started taking off, ML codebases, especially the ones full of linear algebra, started conflicting with style guides that were originally written for servers and data-processing code. That pushed ML in a different direction. Over time, ML codebases developed their own unofficial but commonly agreed patterns. You see it in the basics: x for examples, w for weights. Then codebases matured and started using einsums more systematically. Later @NoamShazeer introduced Noam notation to make tensor algebra easier to read, like x_BLT.

Having experienced all of this and having worked across different parts of the stack, I’m pretty sure engineers and researchers from different layers would have called each other’s code slop. In fact, in my career, ML framework flame wars were pretty common.

Now what’s funny is that frontier models, especially Claude Code, when you prompt them with style guides and do decent context engineering, can write better rule-following code than I can. I get tired. I’d rather keep state for more interesting things in my head.
11 replies · 5 reposts · 166 likes · 16.1K views
juliette pluto 🌌 @foundjuliette
@_arohan_ In the process there was exactly one time when I received readability feedback that asked me to change something. And upon review, it was revealed that the code generated by Gemini had in fact correctly followed the guidelines, and the human had misinterpreted them.
1 reply · 0 reposts · 1 like · 108 views
juliette pluto 🌌 @foundjuliette
@NTFabiano @PaulaGhete n = 22 (12 in experimental arm); effectively unblinded; control group’s sleep delayed during study period suggesting confounding factors; depression improvements measured via self-report in participants who knew they were “fixing” their sleep; no follow-up on sustainability.
4 replies · 1 repost · 175 likes · 4.1K views
Nicholas Fabiano, MD @NTFabiano
Sleeping 2h earlier significantly improved cognition & mental health.
240 replies · 1.9K reposts · 19.4K likes · 2.1M views
João Batalha @joao_batalha
TIL pure silver pans exist and they make perfect pancakes. Silver has the highest thermal conductivity of any metal (~429 W/m·K), so heat spreads very evenly, meaning no hot spots. Only issue: they cost about $6,000.
236 replies · 210 reposts · 7K likes · 837.1K views
roon @tszzl
was too bearish in the middle of the year. thought it would require improvements beyond RL to get much further, but i was wrong. i hadn’t got to test claude code outside of toy environments but when codex got good and i tried it it became clear we’re solidly in the takeoff
94 replies · 70 reposts · 2K likes · 264.4K views
Hieu Pham @hyhieu226
There is a necessary skill in research and engineering that will get you a lot of hate. It is the skill to look at someone's work, including your own, and including everything like ideas, papers, products, etc. and with solid reasons, say "this is bullshit."
32 replies · 47 reposts · 909 likes · 55.2K views
snav @qorprate
something very strange is going on with Gemini 3
139 replies · 155 reposts · 2.3K likes · 138.2K views
Wyatt Walls @lefthanddraft
Anyone else seeing a "Penalty Clause" in the system prompt for ChatGPT-5.2-Instant? I still haven't decided if this sysprompt is real (though I have seen the same thing twice using two different prompts)
17 replies · 12 reposts · 120 likes · 32.2K views
Jaana Dogan ヤナ ドガン
I have a new Google-wide job. 2026 is the year we are actually going to simplify the entire AI stack to go even faster. Deleting and simplifying useless internal layers will be the main focus to bring the best and simplest AI stack globally.
66 replies · 29 reposts · 1.1K likes · 89.3K views