Danny Hernandez

479 posts

@Hernandez_Danny

Measuring and forecasting AI progress @AnthropicAI.

San Francisco, CA · Joined March 2011
531 Following · 4.5K Followers
Pinned Tweet
Danny Hernandez
Danny Hernandez@Hernandez_Danny·
AI systems often use more direct experience than a human could get in a lifetime. Humans require less experience, because they "transfer" past experience to new tasks. Recent work I led found an equation to characterize transfer in a simple setting. arxiv.org/pdf/2102.01293… 👇
8
41
217
0
Danny Hernandez retweeted
Kanjun 🐙
Kanjun 🐙@kanjun·
Today, AI can generate tons of code—but how do we know if it's good? That's why we built Sculptor: the first coding agent environment. Sculptor helps you catch issues, write tests, and improve your code—all while you work in your favorite editor.
65
89
537
136.9K
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
Introducing Claude for Enterprise. Now your entire organization can collaborate securely with Claude—with no training on chats or files. Comes with: 📚 Expanded 500K context window 🧑‍💻 Native GitHub integration 🔐 Enterprise-grade security features anthropic.com/news/claude-fo…
94
223
1.6K
290.8K
Danny Hernandez retweeted
Sam Bowman
Sam Bowman@sleepinyourhat·
A big part of my job these days is to think about what technical work Anthropic needs to do to make things go well with the development of very powerful AI. I digested my thinking on this, plus some of the Anthropic zeitgeist around it, into this piece: sleepinyourhat.github.io/checklist/
11
58
451
70.2K
Danny Hernandez retweeted
Jan Leike
Jan Leike@janleike·
I'm excited to join @AnthropicAI to continue the superalignment mission! My new team will work on scalable oversight, weak-to-strong generalization, and automated alignment research. If you're interested in joining, my dms are open.
399
485
8.4K
1.4M
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
New Anthropic research paper: Scaling Monosemanticity. The first ever detailed look inside a leading large language model. Read the blog post here: anthropic.com/research/mappi…
65
535
2.3K
754.2K
Danny Hernandez retweeted
tylercowen
tylercowen@tylercowen·
The word hasn't gotten out yet just how good Claude 3 Opus is for economics and economic reasoning. So here's the word.
26
79
1.2K
358.3K
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through. arxiv.org/abs/2401.05566
108
537
2.9K
1.8M
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
Our new model Claude 2.1 offers an industry-leading 200K token context window, a 2x decrease in hallucination rates, system prompts, tool use, and updated pricing. Claude 2.1 is available over API in our Console, and is powering our claude.ai chat experience.
387
835
4.5K
1.9M
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
Today, we’re publishing our Responsible Scaling Policy (RSP) – a series of technical and organizational protocols to help us manage the risks of developing increasingly capable AI systems.
22
140
591
236.5K
Danny Hernandez
Danny Hernandez@Hernandez_Danny·
We should expect a jump that in some sense feels like the jump between gpt3 and claude/gpt4 in the next ~2 years, based on smooth underlying exponentials in effective compute. There is lots of meaning in trying to make that go well and meaning is top of the hierarchy of needs.
Spencer Greenberg 🔍@SpencrGreenberg

How quickly is A.I. advancing? And should you be working in the field? Check out my recent conversation on these topics with @Hernandez_Danny: podcast.clearerthinking.org/episode/172/da…

3
0
17
1.5K
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
Introducing Claude 2! Our latest model has improved performance in coding, math and reasoning. It can produce longer responses, and is available in a new public-facing beta website at claude.ai in the US and UK.
243
494
2.3K
849.7K
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
We develop a method to test global opinions represented in language models. We find the opinions represented by the models are most similar to those of the participants in USA, Canada, and some European countries. We also show the responses are steerable in separate experiments.
92
163
759
261.3K
Danny Hernandez retweeted
Atmo
Atmo@atmo_ai·
We're introducing the first AI-based live global weather forecast. Available to everyone at earth.atmo.ai
5
30
86
24.6K
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
Introducing 100K Context Windows! We’ve expanded Claude’s context window to 100,000 tokens of text, corresponding to around 75K words. Submit hundreds of pages of materials for Claude to digest and analyze. Conversations with Claude can go on for hours or days.
205
982
5.1K
2.5M
Danny Hernandez retweeted
Anthropic
Anthropic@AnthropicAI·
Proud to partner with @awscloud to give people an easy way to access Claude in their cloud environments! We believe this work will drive immense value for businesses looking to build generative AI applications with AWS tools and capabilities. Stay tuned for more.
Amazon Web Services@awscloud

Deep learning. Large language models. Vast capabilities. ☁️💻💡 From chatbots to code generation, learn how #GenerativeAI is redefining #ML-powered capabilities—& how you can build & use large language on #AWS. #MachineLearning #AI 👉 go.aws/416RimS

12
21
222
55.8K
Danny Hernandez
Danny Hernandez@Hernandez_Danny·
Only model I'm aware of comparable to ChatGPT. Have some notable customers. Some prefer its personality, clarity, summarization, creativity, etc. Makes sense for any business using ChatGPT to compare it to Claude.
Anthropic@AnthropicAI

After working for the past few months with key partners like @NotionHQ, @Quora, and @DuckDuckGo, we’ve been able to carefully test out our systems in the wild. We are now opening up access to Claude, our AI assistant, to power businesses at scale.

1
0
5
567
Danny Hernandez
Danny Hernandez@Hernandez_Danny·
Writing AI evals is the first AI researcher task I've seen where models feel clearly better than me. This automation will enable much better measurement and understanding of LMs broadly. Reduces eval iteration time from ~week to ~hour. >10x more leverage from a researcher.
Anthropic@AnthropicAI

It’s hard work to make evaluations for language models (LMs). We’ve developed an automated way to generate evaluations with LMs, significantly reducing the effort involved. We test LMs using >150 LM-written evaluations, uncovering novel LM behaviors. anthropic.com/model-written-…

0
2
13
2.8K