nfiedel

313 posts

@nfiedel

Joined May 2008

213 Following · 308 Followers

nfiedel retweeted

Demis Hassabis@demishassabis·2 Apr

Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!

nfiedel retweeted

Armand Joulin@armandjoulin·1 Aug

Are small models still undertrained? We are releasing a 2B model that beats GPT-3.5. The crazy part is that it was distilled on only 2T tokens from a small model. Distillation is the future of LLMs with the growing availability of large and efficient open models!

nfiedel retweeted

Jascha Sohl-Dickstein@jaschasd·16 Jul

This is an excellent paper that ties many threads together around scaling models and hyperparameters.

nfiedel retweeted

Demis Hassabis@demishassabis·27 Jun

Gemma 2 is available to researchers & developers. At 27B it delivers best-in-class performance for its size and is competitive even to models over twice its size! Proud to continue our tradition of thoughtfully bringing cutting-edge research to the open models ecosystem.

Google DeepMind@GoogleDeepMind

We're excited to unveil Gemma 2. 🛠️ Available in both 9B and 27B parameters, it delivers the best performance for its size - unlocking more possibilities for developers to build and deploy with AI. → dpmd.ai/45Q6yba

nfiedel retweeted

Clément Farabet@clmt·27 Jun

Gemma 2 is out! As with our first model, we're super focused on creating models at useful, practical sizes, so that they can be easily deployable... all the while being amazing in quality. We upgraded our 9B so that it's truly awesome and best in class across many benchmarks. And we're introducing a brand new 27B, also best at size, and actually stronger than some larger models. Both did real nice on LMSYS. The 27B Gemma 2 model is designed to run inference efficiently at full precision on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU. And of course, this is our open weights model line... enjoy! ai.google.dev/gemma - try it in AI Studio blog.google/technology/dev… More in the tech report => storage.googleapis.com/deepmind-media…

nfiedel retweeted

Google DeepMind@GoogleDeepMind·14 May

We’re introducing new additions to Gemma: our family of open models built with the same technology as Gemini. 🔘 PaliGemma: a powerful open vision-language model 🔘 Gemma 2: coming soon in various sizes, including 27 billion parameters → dpmd.ai/3QKEteK #GoogleIO

nfiedel retweeted

Clément Farabet@clmt·9 Apr

Gemma is expanding.... we just announced CodeGemma, a version of Gemma tuned for code generation. And bonus... Gemma is now bumped to v1.1, addressing lots of feedback we got. Congrats Gemma team for one more amazing release! developers.googleblog.com/2024/04/gemma-…

nfiedel@nfiedel·21 Feb

Building Gemma together with an exceptional team has been a delight, and now we're thrilled to share it with the world. A huge congrats to the entire team! Special thanks to Kathleen & Alek, @triswarkentin, @armandjoulin, @clmt – you are all amazing :)

Demis Hassabis@demishassabis

We have a long history of supporting responsible open source & science, which can drive rapid research progress, so we’re proud to release Gemma: a set of lightweight open models, best-in-class for their size, inspired by the same tech used for Gemini blog.google/technology/dev…

nfiedel retweeted

Jeff Dean@JeffDean·15 Aug

The PaLM language model paper is now officially published at JMLR. jmlr.org/papers/v24/22-…

nfiedel retweeted

Jeff Dean@JeffDean·21 Mar

Bard is now available in the US and UK, w/more countries to come. It’s great to see early @GoogleAI work reflected in it—advances in sequence learning, large neural nets, Transformers, responsible AI techniques, dialog systems & more. You can try it at bard.google.com

Sundar Pichai@sundarpichai

We're expanding access to Bard in US + UK with more countries ahead, it's an early experiment that lets you collaborate with generative AI. Hope Bard sparks more creativity and curiosity, and will get better with feedback. Sign up: bard.google.com blog.google/technology/ai/…

nfiedel retweeted

Aleksandra Faust@AleksandraFaust·11 Oct

Common HTML understanding tasks can be done without custom NN architecture design and with orders of magnitude less data by fine-tuning LLMs. Bidirectional attention appears to be crucial, and context windows remain the bottleneck.

Ofir Nachum@ofirnachum

"Understanding HTML with Large Language Models" Our newest work shows that LLMs pretrained on standard text corpora transfer remarkably well to web-based tasks. We achieve a new SOTA on supervised MiniWoB: 50% better perf with 200x less data than prev best arxiv.org/abs/2210.03945

nfiedel retweeted

Jason Baldridge@jasonbaldridge·23 Aug

Exciting news: #Parti and #Imagen teamed up to create a hybrid system with Parti creating 256x256 images which then receive Imagen super resolution to produce 1024x1024 pixels! See the diagram below for how it works. See thread for more info and new images with this system!

nfiedel retweeted

Zoubin Ghahramani@ZoubinGhahrama1·17 Aug

What happens when you combine the best of language models with robots that operate in the real world? Take a look at our new work from @GoogleAI and Everyday Robots!

Google AI@GoogleAI

Learn how we combined our latest language model, PaLM, with robot learning algorithms to create PaLM-SayCan, a robotics system that uses natural language to complete complex tasks in a real-world environment → goo.gle/3QRJhgl

nfiedel retweeted

Jascha Sohl-Dickstein@jaschasd·10 Jun

After 2 years of work by 442 contributors across 132 institutions, I am thrilled to announce that the github.com/google/BIG-ben… paper is now live: arxiv.org/abs/2206.04615. BIG-bench consists of 204 diverse tasks to measure and extrapolate the capabilities of large language models.

nfiedel@nfiedel·25 May

Tool Augmented Language Models. abs: arxiv.org/abs/2205.12255 Smaller tool augmented models outperform larger non-augmented models in two domains (thus far), and on out-of-distribution examples. Great collaboration w/@AlwaysParisi and @YaoZhaoAI!

nfiedel@nfiedel·4 Apr

Am so proud of the team’s exceptional research & engineering over the past year+! We are excited to share PaLM 🌴 with the world! The paper is at: goo.gle/palm-paper

Aakanksha Chowdhery@achowdhery

Really excited to present the first large-scale use of the Pathways system! Joint work with so many colleagues at Google! @sharan0909 @nfiedel @JeffDean @m_isard @ada_rob @bsaeta

nfiedel retweeted

Adam Roberts@ada_rob·3 Nov

For a year, the T5 team has collab'd with FLAX and JAX to build a successor to our research library, using it to train models at many scales 📈 on TPU... ...and now you can too! T5X is still in rapid development, but you can use it or find inspiration at goo.gle/t5x!

nfiedel retweeted

Brian Lester@blester125·3 Sep

My first @GoogleAI residency project was accepted to @emnlpmeeting #EMNLP2021! Prompt Tuning can condition a frozen T5 XXL model to perform new tasks while only adding 0.003% more parameters and no performance loss. Camera Ready 📸: arxiv.org/abs/2104.08691 Quick Thread 🧵(1/7)

nfiedel retweeted

Jascha Sohl-Dickstein@jaschasd·27 Jan

CALL FOR TASKS CAPTURING LIMITATIONS OF LARGE LANGUAGE MODELS We are soliciting contributions of tasks to a *collaborative* benchmark designed to measure and extrapolate the capabilities and limitations of large language models. Submit tasks at github.com/google/BIG-Ben… #BIGbench

nfiedel retweeted

Hanoi Hantrakul@yaboihanoi·8 May

🏠bored at home? take a saxy AI solo 🎷 #madewithmagenta #tonetransfer try it yourself g.co/magenta/ddsp-d…
