Ai2

3.7K posts

@allen_ai

Breakthrough AI to solve the world's biggest problems. › Join us: https://t.co/MjUpZpKPXJ › Newsletter: https://t.co/k9gGznstwj

Seattle, WA · Joined September 2015
435 Following · 84.3K Followers
Pinned Tweet
Ai2 @allen_ai
Today we’re bringing new NSF OMAI compute online with NVIDIA Blackwell Ultra-powered systems, turning a $152M national investment from @NSF & @NVIDIA into a foundation for truly open AI research. 🧵
Ai2 retweeted
Sewon Min @sewon__min
Really amazing results analyzing what's creative/novel vs. what's copied from Internet data, enabled by the amazing @liujc1998's Infini-gram! infini-gram.io This is also enabled in @allen_ai's OlmoTrace allenai.org/blog/olmotrace where anyone can find matching n-grams between LLM-generated text and its training data.
Ai2 retweeted
David Albright @dalbright
Very cool to see these conversations happening! This is what openness enables. The "tool that allows you to trace those n-grams directly to their source," is infinigram, AKA OlmoTrace from @allen_ai, created by @liujc1998. x.com/alexolegimas/s…
Ai2 @allen_ai
We're releasing a dataset of 14K HuggingFace models, datasets, papers, & codebases linked by 51K evaluations, fine-tunings, & references, plus the ArtifactLinker code. We hope it helps others find SOTA eval results. 💻 Code: github.com/allenai/artifa… 📊 Data: huggingface.co/datasets/lwaek…
Ai2 @allen_ai
@huggingface Using ArtifactLinker, we found cases where a strong model had never been evaluated on a benchmark on which it would set, or nearly match, the SOTA. We also found that newer LLMs like Gemma often lose to older DeBERTa models on natural language inference tasks.
Ai2 @allen_ai
Most models are only evaluated on a fraction of the benchmarks out there. ArtifactLinker, our new system, predicts which models would set a new state of the art on benchmarks hosted on @HuggingFace, then runs the evaluation to verify. 🧵
Ai2 @allen_ai
We release open models so they're available to builders working on problems that matter to them. On Global Accessibility Awareness Day, PointCheck is a fitting example. Read more ↓ allenai.org/blog/global-ac…
Ai2 @allen_ai
PointCheck tabs through a page like a keyboard user and screenshots each step. Molmo locates the focused element; MolmoWeb and Olmo 3 handle the rest. Brendan chose open models for his experimental side project so teams can self-host—no files leave the environment.
Ai2 @allen_ai
Brendan Works is a product manager focused on paratransit services in Seattle. See how he built PointCheck, a website accessibility checker powered by our open Molmo, MolmoWeb, & Olmo 3 models. 👇
Ai2 @allen_ai
Available now in the same sizes as v1: Nano, Tiny, Base. Open weights, open training code. If you're running v1 and v1.1 works for your task, expect significant speedups during fine-tuning & inference. 🤗 Models: huggingface.co/collections/al… 🔗 Blog: allenai.org/blog/olmoearth…
Ai2 @allen_ai
A useful property for researchers: we held the pretraining dataset constant from v1. The differences cleanly isolate the methodological change, not the data or the architecture family.
Ai2 @allen_ai
Today we’re releasing OlmoEarth v1.1. It’s 3x cheaper to run than v1 while delivering the same state-of-the-art performance—and fully open. 🧵
Ai2 @allen_ai
@GoogleResearch @nvidia We’re releasing the first-phase AIMIP dataset + our analysis of it. We hope to continue AIMIP with future phases that expand its scope & scale. 📘 Learn more in our blog: allenai.org/blog/AIMIP 📊 Paper: arxiv.org/abs/2605.06944 🗂️ Dataset: github.com/ai2cm/AIMIP/tr…
Ai2 @allen_ai
@GoogleResearch @nvidia We also tested the models on harder scenarios, such as a rapidly warming ocean that was unfamiliar from training. In those tests, the models diverged much more, showing that generalization remains a major challenge.
Ai2 @allen_ai
Announcing the AI Model Intercomparison Project (AIMIP), a community effort to support open evaluation of AI climate models. 🌎 It brings together a shared benchmark experiment & dataset to make it easier to compare models side by side over multi-decade simulations. 🧵