Danny Hernandez

479 posts

@Hernandez_Danny

Measuring and forecasting AI progress @AnthropicAI.

San Francisco, CA · Joined March 2011

531 Following · 4.5K Followers

Pinned Tweet

Danny Hernandez@Hernandez_Danny·Feb 3

AI systems often use more direct experience than a human could get in a lifetime. Humans require less experience, because they "transfer" past experience to new tasks. Recent work I led found an equation to characterize transfer in a simple setting. arxiv.org/pdf/2102.01293… 👇

Danny Hernandez retweeted

Kanjun 🐙@kanjun·Apr 8

Today, AI can generate tons of code—but how do we know if it's good? That's why we built Sculptor: the first coding agent environment. Sculptor helps you catch issues, write tests, and improve your code—all while you work in your favorite editor.

Danny Hernandez retweeted

Dario Amodei@DarioAmodei·Oct 11

Machines of Loving Grace: my essay on how AI could transform the world for the better darioamodei.com/machines-of-lo…

Danny Hernandez retweeted

Anthropic@AnthropicAI·Sep 4

Introducing Claude for Enterprise. Now your entire organization can collaborate securely with Claude—with no training on chats or files. Comes with: 📚 Expanded 500K context window 🧑‍💻 Native GitHub integration 🔐 Enterprise-grade security features anthropic.com/news/claude-fo…

Danny Hernandez retweeted

Sam Bowman@sleepinyourhat·Sep 3

A big part of my job these days is to think about what technical work Anthropic needs to do to make things go well with the development of very powerful AI. I digested my thinking on this, plus some of the Anthropic zeitgeist around it, into this piece: sleepinyourhat.github.io/checklist/

Danny Hernandez retweeted

Jan Leike@janleike·May 28

I'm excited to join @AnthropicAI to continue the superalignment mission! My new team will work on scalable oversight, weak-to-strong generalization, and automated alignment research. If you're interested in joining, my dms are open.

Danny Hernandez retweeted

Anthropic@AnthropicAI·May 21

New Anthropic research paper: Scaling Monosemanticity. The first ever detailed look inside a leading large language model. Read the blog post here: anthropic.com/research/mappi…

Danny Hernandez retweeted

tylercowen@tylercowen·Mar 10

The word hasn't gotten out yet just how good Claude 3 Opus is for economics and economic reasoning. So here's the word.

Danny Hernandez@Hernandez_Danny·Mar 5

I find our latest model to have a lot more nuanced and careful thinking. Feels like a big step towards the thought partner I want.

Anthropic@AnthropicAI

Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.

Danny Hernandez retweeted

Anthropic@AnthropicAI·Jan 12

New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through. arxiv.org/abs/2401.05566

Danny Hernandez retweeted

Anthropic@AnthropicAI·Nov 21

Our new model Claude 2.1 offers an industry-leading 200K token context window, a 2x decrease in hallucination rates, system prompts, tool use, and updated pricing. Claude 2.1 is available over API in our Console, and is powering our claude.ai chat experience.

Danny Hernandez retweeted

Anthropic@AnthropicAI·Sep 19

Today, we’re publishing our Responsible Scaling Policy (RSP) – a series of technical and organizational protocols to help us manage the risks of developing increasingly capable AI systems.

Danny Hernandez@Hernandez_Danny·Aug 26

We should expect a jump that in some sense feels like the jump between gpt3 and claude/gpt4 in the next ~2 years, based on smooth underlying exponentials in effective compute. There is lots of meaning in trying to make that go well and meaning is top of the hierarchy of needs.

Spencer Greenberg 🔍@SpencrGreenberg

How quickly is A.I. advancing? And should you be working in the field? Check out my recent conversation on these topics with @Hernandez_Danny: podcast.clearerthinking.org/episode/172/da…

Danny Hernandez retweeted

Anthropic@AnthropicAI·Jul 11

Introducing Claude 2! Our latest model has improved performance in coding, math and reasoning. It can produce longer responses, and is available in a new public-facing beta website at claude.ai in the US and UK.

Danny Hernandez retweeted

Anthropic@AnthropicAI·Jun 29

We develop a method to test global opinions represented in language models. We find the opinions represented by the models are most similar to those of the participants in USA, Canada, and some European countries. We also show the responses are steerable in separate experiments.

Danny Hernandez retweeted

Atmo@atmo_ai·May 23

We're introducing the first AI-based live global weather forecast. Available to everyone at earth.atmo.ai

Danny Hernandez retweeted

Anthropic@AnthropicAI·May 11

Introducing 100K Context Windows! We’ve expanded Claude’s context window to 100,000 tokens of text, corresponding to around 75K words. Submit hundreds of pages of materials for Claude to digest and analyze. Conversations with Claude can go on for hours or days.

Danny Hernandez retweeted

Anthropic@AnthropicAI·Apr 13

Proud to partner with @awscloud to give people an easy way to access Claude in their cloud environments! We believe this work will drive immense value for businesses looking to build generative AI applications with AWS tools and capabilities. Stay tuned for more.

Amazon Web Services@awscloud

Deep learning. Large language models. Vast capabilities. ☁️💻💡 From chatbots to code generation, learn how #GenerativeAI is redefining #ML-powered capabilities—& how you can build & use large language models on #AWS. #MachineLearning #AI 👉 go.aws/416RimS

Danny Hernandez@Hernandez_Danny·Mar 30

General access to Claude via Slack.

Anthropic@AnthropicAI

Today we are releasing the new Claude App for @SlackHQ, in beta. Now every company in the world has the chance to have a “virtual teammate” who can help make work more fun and productive.

Danny Hernandez@Hernandez_Danny·Mar 14

Only model I'm aware of comparable to ChatGPT. Have some notable customers. Some prefer its personality, clarity, summarization, creativity, etc. Makes sense for any business using ChatGPT to compare it to Claude.

Anthropic@AnthropicAI

After working for the past few months with key partners like @NotionHQ, @Quora, and @DuckDuckGo, we’ve been able to carefully test out our systems in the wild. We are now opening up access to Claude, our AI assistant, to power businesses at scale.

Danny Hernandez@Hernandez_Danny·Dec 19

Writing AI evals is the first AI researcher task I've seen where models feel clearly better than me. This automation will enable much better measurement and understanding of LMs broadly. Reduces eval iteration time from ~week to ~hour. >10x more leverage from a researcher.

Anthropic@AnthropicAI

It’s hard work to make evaluations for language models (LMs). We’ve developed an automated way to generate evaluations with LMs, significantly reducing the effort involved. We test LMs using >150 LM-written evaluations, uncovering novel LM behaviors. anthropic.com/model-written-…
