nfiedel

313 posts

@nfiedel

Joined May 2008
213 Following · 308 Followers
nfiedel retweeted
Demis Hassabis @demishassabis
Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!
326
882
8K
989.3K
nfiedel retweeted
Armand Joulin @armandjoulin
Are small models still undertrained? We are releasing a 2B model that beats GPT-3.5. The crazy part is that it was distilled on only 2T tokens from a small model. Distillation is the future of LLMs with the growing availability of large and efficient open models!
10
39
365
62.6K
nfiedel retweeted
Jascha Sohl-Dickstein @jaschasd
This is an excellent paper that ties many threads together around scaling models and hyperparameters.
3
3
54
10.5K
nfiedel retweeted
Demis Hassabis @demishassabis
Gemma 2 is available to researchers & developers. At 27B it delivers best-in-class performance for its size and is competitive even with models over twice its size! Proud to continue our tradition of thoughtfully bringing cutting-edge research to the open models ecosystem.
Google DeepMind @GoogleDeepMind

We're excited to unveil Gemma 2. 🛠️ Available in both 9B and 27B parameters, it delivers the best performance for its size - unlocking more possibilities for developers to build and deploy with AI. → dpmd.ai/45Q6yba

37
87
651
213.1K
nfiedel retweeted
Clément Farabet @clmt
Gemma 2 is out! As with our first model, we're super focused on creating models at useful, practical sizes, so that they can be easily deployable... all the while being amazing in quality. We upgraded our 9B so that it's truly awesome and best in class across many benchmarks. And we're introducing a brand new 27B, also best at size, and actually stronger than some larger models. Both did real nice on LMSYS. The 27B Gemma 2 model is designed to run inference efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. And of course, this is our open weights model line... enjoy! ai.google.dev/gemma - try it in AI Studio blog.google/technology/dev… More in the tech report => storage.googleapis.com/deepmind-media…
10
77
320
113.7K
nfiedel retweeted
Google DeepMind @GoogleDeepMind
We’re introducing new additions to Gemma: our family of open models built with the same technology as Gemini. 🔘 PaliGemma: a powerful open vision-language model 🔘 Gemma 2: coming soon in various sizes, including 27 billion parameters → dpmd.ai/3QKEteK #GoogleIO
27
120
623
111.3K
nfiedel retweeted
Clément Farabet @clmt
Gemma is expanding.... we just announced CodeGemma, a version of Gemma tuned for code generation. And bonus... Gemma is now bumped to v1.1, addressing lots of feedback we got. Congrats Gemma team for one more amazing release! developers.googleblog.com/2024/04/gemma-…
13
49
292
85K
nfiedel @nfiedel
Building Gemma together with an exceptional team has been a delight, and now we're thrilled to share it with the world. A huge congrats to the entire team! Special thanks to Kathleen & Alek, @triswarkentin, @armandjoulin, @clmt – you are all amazing :)
Demis Hassabis @demishassabis

We have a long history of supporting responsible open source & science, which can drive rapid research progress, so we’re proud to release Gemma: a set of lightweight open models, best-in-class for their size, inspired by the same tech used for Gemini blog.google/technology/dev…

3
6
46
16.6K
nfiedel retweeted
Jeff Dean @JeffDean
Bard is now available in the US and UK, w/more countries to come. It’s great to see early @GoogleAI work reflected in it—advances in sequence learning, large neural nets, Transformers, responsible AI techniques, dialog systems & more. You can try it at bard.google.com
Sundar Pichai @sundarpichai

We're expanding access to Bard in US + UK with more countries ahead, it's an early experiment that lets you collaborate with generative AI. Hope Bard sparks more creativity and curiosity, and will get better with feedback. Sign up: bard.google.com blog.google/technology/ai/…

27
117
709
339.7K
nfiedel retweeted
Aleksandra Faust @AleksandraFaust
Common HTML understanding tasks can be done without custom NN architecture design and with orders of magnitude less data by fine-tuning LLMs. Bidirectional attention appears to be crucial, and context windows remain the bottleneck.
Ofir Nachum @ofirnachum

"Understanding HTML with Large Language Models" Our newest work shows that LLMs pretrained on standard text corpora transfer remarkably well to web-based tasks. We achieve a new SOTA on supervised MiniWoB: 50% better perf with 200x less data than prev best arxiv.org/abs/2210.03945

1
6
31
0
nfiedel retweeted
Jason Baldridge @jasonbaldridge
Exciting news: #Parti and #Imagen teamed up to create a hybrid system with Parti creating 256x256 images which then receive Imagen super resolution to produce 1024x1024 pixels! See the diagram below for how it works. See thread for more info and new images with this system!
7
104
551
0
nfiedel @nfiedel
Tool Augmented Language Models. abs: arxiv.org/abs/2205.12255 Smaller tool augmented models outperform larger non-augmented models in two domains (thus far), and on out-of-distribution examples. Great collaboration w/@AlwaysParisi and @YaoZhaoAI!
1
2
12
0
nfiedel retweeted
Adam Roberts @ada_rob
For a year, the T5 team has collab'd with FLAX and JAX to build a successor to our research library, using it to train models at many scales 📈 on TPU... ...and now you can too! T5X is still in rapid development, but you can use it or find inspiration at goo.gle/t5x!
4
70
281
0
nfiedel retweeted
Brian Lester @blester125
My first @GoogleAI residency project was accepted to @emnlpmeeting #EMNLP2021! Prompt Tuning can condition a frozen T5 XXL model to perform new tasks while only adding 0.003% more parameters and no performance loss. Camera Ready 📸: arxiv.org/abs/2104.08691 Quick Thread 🧵(1/7)
5
35
260
0
nfiedel retweeted
Jascha Sohl-Dickstein @jaschasd
CALL FOR TASKS CAPTURING LIMITATIONS OF LARGE LANGUAGE MODELS We are soliciting contributions of tasks to a *collaborative* benchmark designed to measure and extrapolate the capabilities and limitations of large language models. Submit tasks at github.com/google/BIG-Ben… #BIGbench
12
68
266
0