Gabriel Asher

270 posts

@GabrielAsher02

Senior ML Scientist @matterworks_bio. Alumnus of @dartmouth CS. Ex-researcher in CV and ML @DHMCandClinics

Joined October 2022
144 Following · 48 Followers
Gabriel Asher @GabrielAsher02
@AnthropicAI Please fix your API usage/pricing pages. Why is it so hard to have API costs sync to the platform page in real time??
Gabriel Asher retweeted
Andrej Karpathy @karpathy
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.

So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work.

It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and various cyber-related repercussions. TLDR the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and I think slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram's reels and *at the same time*, OpenAI's highest-tier and paid Codex model will go off for 1 hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems.

This part really works and has made dramatic strides because of 2 properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in b2b settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.
staysaasy @staysaasy

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

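Karpathy's point about verifiable rewards can be made concrete with a toy sketch. This is my own illustration, not any lab's actual training code: a hypothetical `verifiable_reward` helper executes a model-generated solution against unit tests and returns the kind of binary pass/fail signal that reinforcement learning can optimize directly, in contrast to writing quality, which has no such check.

```python
# Toy sketch (hypothetical, not any lab's pipeline) of a "verifiable reward":
# run a model-generated solution against unit tests, emit a binary reward.

def verifiable_reward(candidate_src: str, tests: list[tuple[tuple, object]],
                      func_name: str = "solve") -> float:
    """Return 1.0 if the candidate passes every test, else 0.0."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)      # load the generated code
        fn = namespace[func_name]
        for args, expected in tests:
            if fn(*args) != expected:       # any failed test -> zero reward
                return 0.0
        return 1.0
    except Exception:                       # crashes also earn zero reward
        return 0.0

# Two "generated" solutions and the verifier's test cases:
good = "def solve(a, b):\n    return a + b\n"
bad = "def solve(a, b):\n    return a - b\n"
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(verifiable_reward(good, tests))  # 1.0
print(verifiable_reward(bad, tests))   # 0.0
```

The reward is unambiguous and cheap to compute at scale, which is exactly the property that makes coding and math "peaky" domains for RL.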
Gabriel Asher @GabrielAsher02
anyone else notice that claude code seems to have gotten a lot worse in the last few days? It keeps getting stuck in loops...
Gabriel Asher retweeted
Sakana AI @SakanaAILabs
The AI Scientist: Towards Fully Automated AI Research, Now Published in Nature
Nature: nature.com/articles/s4158…
Blog: sakana.ai/ai-scientist-n…

When we first introduced The AI Scientist, we shared an ambitious vision of an agent powered by foundation models capable of executing the entire machine learning research lifecycle. From inventing ideas and writing code to executing experiments and drafting the manuscript, the system demonstrated that end-to-end automation of the scientific process is possible. Soon after, we shared a historic update: the improved AI Scientist-v2 produced the first fully AI-generated paper to pass a rigorous human peer-review process.

Today, we are happy to announce that "The AI Scientist: Towards Fully Automated AI Research," our paper describing all of this work, along with fresh new insights, has been published in @Nature! This Nature publication consolidates these milestones and details the underlying foundation model orchestration. It also introduces our Automated Reviewer, which matches human review judgments and actually exceeds standard inter-human agreement.

Crucially, by using this reviewer to grade papers generated by different foundation models, we discovered a clear scaling law of science. As the underlying foundation models improve, the quality of the generated scientific papers increases correspondingly. This implies that as compute costs decrease and model capabilities continue to exponentially increase, future versions of The AI Scientist will be substantially more capable.

Building upon our previous open-source releases (github.com/SakanaAI/AI-Sc…), this open-access Nature publication comprehensively details our system's architecture, outlines several new scaling results, and discusses the promise and challenges of AI-generated science.
This substantial milestone is the result of a close and fruitful collaboration between researchers at Sakana AI, the University of British Columbia (UBC) and the Vector Institute, and the University of Oxford. Congrats to the team! @_chris_lu_ @cong_ml @RobertTLange @_yutaroyamada @shengranhu @j_foerst @hardmaru @jeffclune
Gabriel Asher retweeted
Om Patel @om_patel5
stop spending money on Claude Code. Chipotle's support bot is free:
Gabriel Asher @GabrielAsher02
@TimothyKassis I've not noticed Anthropic being down; however, since the outage yesterday, speeds have been much slower than normal
Gabriel Asher @GabrielAsher02
The fun-upgrade in writing software vs a year ago is unreal
James Zou @james_y_zou
We created a new architecture to integrate multimodal sleep time-series data. CNNs learn local features, transformers aggregate information across time + channels, and leave-one-modality-out contrastive learning trains robust representations. This design generalizes across sites and diverse populations. 3/n
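The leave-one-modality-out contrastive idea in the tweet above can be sketched in a few lines. This is a toy construction of my own (random linear "encoders", numpy only), not the SleepFM code: each modality gets its own encoder, and an InfoNCE-style loss pulls the held-out modality's embedding toward the mean embedding of the remaining modalities for the same record, with other records in the batch serving as negatives.

```python
# Toy sketch of leave-one-modality-out contrastive training
# (my own illustration, not the SleepFM implementation).
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Stand-in encoder: a linear map + L2 normalization."""
    z = x @ W
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def leave_one_out_nce(Z, held_out, temp=0.1):
    """InfoNCE loss: held-out modality vs. mean of the others.

    Z: (num_modalities, batch, dim) array of normalized embeddings.
    """
    anchor = Z[held_out]                                # (batch, dim)
    rest = np.delete(Z, held_out, axis=0).mean(axis=0)  # (batch, dim)
    rest /= np.linalg.norm(rest, axis=-1, keepdims=True)
    logits = anchor @ rest.T / temp                     # (batch, batch)
    # positives sit on the diagonal: same record, complementary modalities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

batch, dim = 8, 16
# four modalities (e.g. brain, heart, muscle, breathing), random features
X = rng.normal(size=(4, batch, 32))
W = rng.normal(size=(32, dim))
Z = np.stack([encode(x, W) for x in X])
loss = sum(leave_one_out_nce(Z, m) for m in range(4)) / 4
print(float(loss))
```

Cycling the held-out modality forces each encoder to produce representations predictable from the others, which is what makes the learned embedding robust when a modality is missing or noisy.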
James Zou @james_y_zou
Today in @NatureMedicine we report that AI can predict 130 diseases from 1 night of sleep🛌 We trained a foundation model (#SleepFM) on 585K hours of sleep recordings from 65K people—brain, heart, muscle & breathing signals combined. AI learns the language of sleep🧵
Gabriel Asher retweeted
ARC Prize @arcprize
ARC Prize 2025 Paper Award Winners
1st: "Less is More: Recursive Reasoning with Tiny Networks" (TRM), A. Jolicoeur-Martineau, $50k
2nd: "Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI" (SOAR), J. Pourcel et al., $20k
3rd: "ARC-AGI Without Pretraining", I. Liao et al., $5k
Gabriel Asher @GabrielAsher02
@jerhadf Really good. It's been zero-shotting all of the tasks I've been giving it in Claude Code and is noticeably better than Sonnet 4.5. I still wish there were better notebook capabilities (especially getting it to stop writing so many print statements)
jeremy @jerhadf
what do people think about Opus 4.5 for coding so far? what are the behavioral problems or limitations you still want to see improved? we're hungry for feedback 🙏
Gabriel Asher @GabrielAsher02
@k_dense_ai @EdisonSci @kepler_ai_ I think the analysis capabilities are incredible already, but there is no closed loop back to in-vitro experimentation, which is half the game. Someone needs to make more accessible SaaS-like autonomous labs!
Gabriel Asher @GabrielAsher02
@jerhadf Late reply, but sometimes 4.1 opus will write 3/4 of my code, then stop due to usage limits! I just really wish I could get those last few lines before waiting for usage limits to reset.
jeremy @jerhadf
what're the most annoying or disruptive model behaviors you see when coding with claude models today? ie things you always have to work around in claude code, mistakes the models make often, etc. the more examples the better!
Gabriel Asher retweeted
AI at Meta @AIatMeta
Our vision is for AI that uses world models to adapt in new and dynamic environments and efficiently learn new skills. We’re sharing V-JEPA 2, a new world model with state-of-the-art performance in visual understanding and prediction. V-JEPA 2 is a 1.2 billion-parameter model, trained on video, that can enable zero-shot planning in robots—allowing them to plan and execute tasks in unfamiliar environments.

Learn more about V-JEPA 2 ➡️ ai.meta.com/blog/v-jepa-2-…

As we continue working toward our goal of achieving advanced machine intelligence (AMI), we’re also releasing three new benchmarks for evaluating how well existing models can reason about the physical world from video.

Learn more and download the new benchmarks ➡️ ai.meta.com/blog/v-jepa-2-…
Gabriel Asher @GabrielAsher02
It’s crazy this paper flew under the radar—it simultaneously roasts genAI/LLMs, showcases the promise of @ylecun’s JEPA ideas for latent-space SSL, and reads like a validation of Karl Friston’s free-energy/active inference framework! arxiv.org/abs/2502.11831
Gabriel Asher @GabrielAsher02
@C_Kavanagh Lex needs to stop doing political interviews. His AI/tech ones are much more interesting anyways
Chris Kavanagh @C_Kavanagh
Lex’s interview with Zelensky goes exactly as you would anticipate if you’re familiar with Lex. Here’s some ‘highlights’: 1. Lex praises Joe Rogan and his comedy club. 2. Lex praises Elon & his commitment to fight corruption. He also asks Zelensky what he admires most about Elon.
Gabriel Asher retweeted
dr. jack morris @jxmnop
still the most compelling Figure 1 i've ever seen - from "Visualizing the Loss Landscape of Neural Nets" (2017)
Gabriel Asher retweeted
Mimoun Cadosch @mpcadosch
There's been a lot of interest in our recent paper along with @KanarekLab at @harvardmed. Wanted to tell this audience more about it. Thread time 🧵