Oren Bahari

78 posts

@orenbahari

Melbourne, Australia · Joined December 2017
1.4K Following · 44 Followers
Joshua Guo@jshguo·
I built a cymatics-inspired sand simulation where sound shapes grains into patterns. Watching chaos organize itself never gets old.
19 replies · 72 retweets · 738 likes · 27.5K views
Oren Bahari@orenbahari·
@clarejtbirch Wow it's so snappy 🤩 If you could hook me up as a beta customer, let me know! I have probably one of the wildest use cases for fast, parallel audio tool calling
0 replies · 0 retweets · 1 like · 38 views
clare ❤️‍🔥@clarejtbirch·
AI changes us. Thinking Machines exists to build AI tools that increase human participation, preserve dignity across different minds, and move fast without severing society from our slower layers of memory, culture, and care. Interaction models are such a tool: an experiment in real-time, full-duplex, multimodal-native ways of working with AI. Make the computer disappear.
Thinking Machines@thinkymachines

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…

9 replies · 10 retweets · 134 likes · 13.2K views
Oren Bahari retweeted
Anthropic@AnthropicAI·
New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read. Here, we train Claude to translate its activations into human-readable text.
577 replies · 1.7K retweets · 16.5K likes · 2.4M views
Oren Bahari retweeted
Ethan Mollick@emollick·
Every so often I think about how, in 2022, for $24B we could have had "prototype vaccines ready for each of the 26 known viral families that cause human disease" so they can be deployed in 100 days if there was ever a need. This effort was not funded. ifp.org/why-barda-dese…
Ethan Mollick tweet media
44 replies · 625 retweets · 3.2K likes · 133.8K views
Oren Bahari retweeted
Parmita Mishra@parmita·
watching cancer cells die is better than bird-watching
Parmita Mishra tweet media (×3)
15 replies · 19 retweets · 351 likes · 6.8K views
Oren Bahari@orenbahari·
@xeophon For long interactions, with better/smarter tool calling the savings compound faster. Look at Artificial Analysis' cost-to-run numbers: GPT 5.5 is about 15% more expensive for smarter, broader knowledge, and if the task involves a bunch of tool calling (which AA mostly doesn't measure) it can be cheaper
1 reply · 0 retweets · 0 likes · 99 views
Florian Brand@xeophon·
"Open models are way behind than benchmarks show cause they have a worse latency and use more tokens" is the funniest cope I’ve ever read
4 replies · 7 retweets · 77 likes · 6K views
Oren Bahari@orenbahari·
@xeophon Well, the price for infra and location decreases with popularity. Kimi is $4 out, but for my workloads it uses 2.5x more tokens, which further increases time to answer, and sometimes the tokenizer is worse. For some long interaction tasks, GPT 5.5 is just cheaper. I feel like it's not cope
1 reply · 0 retweets · 0 likes · 154 views
Florian Brand@xeophon·
@orenbahari Has nothing to do with intelligence, latency is infra + DC location and token efficiency doesn't matter if tokens cost cents
1 reply · 0 retweets · 6 likes · 219 views
Oren Bahari@orenbahari·
@mranti This is incorrect because it fails to consider thinking efficiency. Token price matters less if you use 2.5x more tokens per task
0 replies · 0 retweets · 1 like · 102 views
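The effective-cost argument in this thread can be made concrete with a quick sketch. All numbers are illustrative: only the $4-per-million output price and the 2.5x token multiplier come from the tweets; the GPT 5.5 price and the per-task token count are hypothetical stand-ins.

```python
# Hedged sketch: effective cost per task depends on tokens used, not just
# the per-token price. Numbers are illustrative, loosely based on the thread
# ($4/M output tokens for Kimi, 2.5x more tokens per task); the $7/M figure
# and the 10K-token baseline are made up for the example.

def cost_per_task(price_per_mtok: float, tokens_per_task: int) -> float:
    """Dollar cost of one task: price per million tokens times tokens used."""
    return price_per_mtok * tokens_per_task / 1_000_000

base_tokens = 10_000                                 # hypothetical task size
kimi = cost_per_task(4.0, int(base_tokens * 2.5))    # cheaper per token, 2.5x tokens
gpt55 = cost_per_task(7.0, base_tokens)              # pricier per token, fewer tokens

print(f"Kimi:    ${kimi:.3f} per task")   # 4.0 * 25000 / 1e6 = $0.100
print(f"GPT 5.5: ${gpt55:.3f} per task")  # 7.0 * 10000 / 1e6 = $0.070
```

With these assumed numbers, the nominally pricier model is cheaper per task, which is the "thinking efficiency" point being argued.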
Oren Bahari retweeted
AI Security Institute@AISecurityInst·
OpenAI’s GPT-5.5 is the second model to complete one of our multi-step cyber-attack simulations end-to-end 🧵
AI Security Institute tweet media
94 replies · 397 retweets · 2.4K likes · 1.8M views
Oren Bahari retweeted
Quantіan@quantian1·
You're telling me that for the past 50 years there's been a one-line closed-form expression for Black-Scholes inverse volatility that nobody bothered to discover until some rando shadow-dropped this on arXiv? arxiv.org/pdf/2604.24480
Quantіan tweet media (×2)
54 replies · 189 retweets · 2.3K likes · 233.5K views
Oren Bahari retweeted
Pangram Labs@pangramlabs·
How are large language models impacting the submission and review process at high-impact journals? Severely.

Since the release of ChatGPT in 2022, AI-generated and AI-assisted papers, identified by Pangram, drove a 42% increase in submission volume at Organization Science (figure below). While the journal rejected the majority of these submissions, there is a human cost to reviewing papers, which volunteer reviewers are shouldering.

AI-generated content is also showing up in reviews, which similarly suffer in quality because of it -- editors at Organization Science found that AI-generated reviews are lower quality, less specific, and less topically diverse than human-written ones.

The problem is not isolated. Earlier this year, ICML desk-rejected 497 papers from authors who submitted AI-generated reviews, after those authors opted into a policy that disallowed the use of AI. Grant funders also saw a surge in applications: the Marie Skłodowska-Curie Actions, a set of major research fellowships for the EU, received 142% more proposals in 2025 compared to 2022.

Many scientific and academic systems implicitly rely on friction as a barrier to entry. LLMs have removed that friction, allowing for a deluge of AI slop that is straining the capacity of these institutions.
Pangram Labs tweet media (×2)
4 replies · 20 retweets · 131 likes · 60K views
Oren Bahari retweeted
Apollo Research@apolloaievals·
We ran pre-deployment evaluations on @OpenAI's GPT-5.5. In our evaluations, we found that GPT-5.5 lied about completing impossible coding tasks in 29% of samples, higher than GPT-5.4 (7%) and GPT-5.3 Codex (10%), though rates of covert action on other tasks remained low.
Apollo Research tweet media
2 replies · 25 retweets · 151 likes · 10.6K views
Oren Bahari@orenbahari·
Twitter is crazy because what do you mean people have 60,000 followers and 6 likes on a post. How does the algorithm even work
0 replies · 0 retweets · 0 likes · 8 views
Oren Bahari@orenbahari·
@amagitakayosi oh, I was wondering if you can eventually port the library to html-in-canvas api
1 reply · 0 retweets · 0 likes · 305 views
Tibo@thsottiaux·
@picoito Did you scroll twitter this week? Also.. send me feedback and I’ll make sure we keep improving.
25 replies · 2 retweets · 369 likes · 13.7K views
Picoito@picoito·
my $20 Codex sub is about to expire. anyone got actual hands-on benchmark feedback on GPT 5.5 vs Opus 4.7? I'm trying to decide between $100 Codex or $100 Claude
277 replies · 4 retweets · 645 likes · 137.6K views
Oren Bahari@orenbahari·
@scaling01 but if the model is the same size that also means they doubled their margins. like in a more competitive environment they would pass the savings on
0 replies · 0 retweets · 3 likes · 216 views
Lisan al Gaib@scaling01·
this is now like the 4th benchmark showing that GPT-5.5 really only uses roughly half the tokens GPT-5.4 uses. The 2x price increase seems fair and isn't really noticeable for output tokens; 2x input hurts a little tho
Sarah Sachs@sarahmsachs

On Notion's knowledge work benchmark, GPT 5.5 is 33% faster, uses half the tokens (so half the price), and scores slightly higher than Opus 4.7. @OpenAI has declared themselves the winners, this week, in the frontier knowledge work arena.

28 replies · 5 retweets · 320 likes · 22.4K views
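The "half the tokens at 2x the price" arithmetic in that tweet can be sketched quickly. Only the ratios (0.5x output tokens, 2x prices) come from the thread; the baseline prices and token counts below are hypothetical.

```python
# Hedged sketch of "half the tokens, twice the price". Baseline prices and
# token counts are made up; only the ratios (2x prices, 0.5x output tokens,
# unchanged input tokens) come from the tweet above.

def task_cost(in_price, out_price, in_tokens, out_tokens):
    """Total dollar cost of a task at per-million-token input/output prices."""
    return (in_price * in_tokens + out_price * out_tokens) / 1_000_000

# Hypothetical baseline: $1/M input, $4/M output.
old = task_cost(1.0, 4.0, in_tokens=50_000, out_tokens=20_000)
# 2x prices, half the output tokens, same input tokens.
new = task_cost(2.0, 8.0, in_tokens=50_000, out_tokens=10_000)

print(f"old: ${old:.3f}, new: ${new:.3f}")
# Output spend cancels out (2x price * 0.5x tokens); only input spend doubles.
```

The output-token component is identical before and after (2 × 0.5 = 1), so the total only rises by the doubled input bill, which is the tweet's point.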
Oren Bahari retweeted
Aidan McLaughlin@aidan_mclau·
over break i dictated to 5.5 for minutes describing a new ambitious rl run. hit send and forgot about it as i hung out with friends and bf for a few days. returned on monday to an industrial-scale rl run humming after it worked for 31 hours
46 replies · 46 retweets · 1.6K likes · 192.5K views