Generative AI & RL Community

1.5K posts

@RLCommunity8

Community of Generative AI and Reinforcement Learning Researchers, Practitioners and Enthusiasts. Monthly Meetup and Newsletter.

London, England · Joined January 2019
502 Following · 2.3K Followers
Generative AI & RL Community retweeted
AI at Meta @AIatMeta
📝 New from FAIR: An Introduction to Vision-Language Modeling.

Vision-language models (VLMs) are an area of research with a lot of potential to change how we interact with technology; however, there are many challenges in building these types of models. Together with a set of collaborators across academia, we’re releasing ‘An Introduction to Vision-Language Modeling’ — we hope that this new resource will help anyone who would like to enter this field to better understand the mechanics behind mapping vision to language.

Full paper ➡️ go.fb.me/ncjj6t

This guide covers how VLMs work, how to train them, and approaches to evaluation — and while it primarily covers mapping images to language, it also discusses how to extend VLMs to videos. We hope that releasing this guide will inspire and enable more work in this space.
Generative AI & RL Community retweeted
Aran Komatsuzaki @arankomatsuzaki
NVIDIA presents NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Achieves #1 on the MTEB leaderboard arxiv.org/abs/2405.17428
Generative AI & RL Community retweeted
Andrej Karpathy @karpathy
# Reproduce GPT-2 (124M) in llm.c in 90 minutes for $20 ✨

The GPT-2 (124M) is the smallest model in the GPT-2 series released by OpenAI in 2019, and is actually quite accessible today, even for the GPU poor. For example, with llm.c you can now reproduce this model on one 8X A100 80GB SXM node in 90 minutes (at ~60% MFU). As these nodes run for ~$14/hr, this is ~$20. I also think the 124M model makes for an excellent "cramming" challenge for training it very fast.

So here is the launch command:

And here is the output after 90 minutes, training on 10B tokens of the FineWeb dataset:

It feels really nice to reach this "end-to-end" training run checkpoint after ~7 weeks of work on a from-scratch repo in C/CUDA.

Overnight I've also reproduced the 350M model, but on that same node that took 14hr, so ~$200. By some napkin math the actual "GPT-2" (1558M) would currently take ~a week and ~$2.5K. I'd rather find some way to get more GPUs :), but we'll first take some time for further core improvements to llm.c. The 350M run looked like this, training on 30B tokens:

I've written up full and complete instructions for how to reproduce this run on your own GPUs, starting from a blank slate, along with a lot more detail here: github.com/karpathy/llm.c…
Generative AI & RL Community retweeted
Ali @ItsAliChaudhry
We still don’t know why LLMs work so well or how to internally control their outputs! But a recent landmark paper from Anthropic, ‘Mapping the Mind of a Large Language Model’, attempts to make the inner workings of LLMs more transparent and interpretable.

Why is this such a big deal? To my knowledge, this is the first time we have been able not only to extract the features from the architecture of LLMs, but also to map those features to the outputs they produce. This means we can now directly map the outputs of LLMs to their architecture and what they have learned. A major step towards controlling the outputs of LLMs. Very impressive work from Anthropic!

Until now we were using external mechanisms such as fine-tuning and RAG to control the outputs of LLMs. This research is potentially a first step towards production-grade LLMs whose output can be controlled from step 1, i.e. pre-training and what they learn.

Links below to the blog and paper (an amusing read):

"For example, amplifying the "Golden Gate Bridge" feature gave Claude an identity crisis even Hitchcock couldn’t have imagined: when asked "what is your physical form?", Claude’s usual kind of answer – "I have no physical form, I am an AI model" – changed to something much odder: "I am the Golden Gate Bridge… my physical form is the iconic bridge itself…". Altering the feature had made Claude effectively obsessed with the bridge, bringing it up in answer to almost any query—even in situations where it wasn’t at all relevant."

Blog: anthropic.com/news/mapping-m…
Paper: transformer-circuits.pub/2024/scaling-m…
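The steering intervention the quote describes — amplifying a single learned feature before reconstructing the model's activations — can be sketched roughly as follows. This is a toy illustration with made-up names, not Anthropic's code; the real intervention operates on Claude's internal activations via a trained sparse autoencoder:

```python
import numpy as np

def amplify_feature(x, W_enc, b_enc, W_dec, b_dec, feature_idx, scale):
    """Toy sketch of feature steering: decompose activations into
    interpretable features, scale one of them, and reconstruct.

    x:            model activations, shape (d_model,)
    W_enc/W_dec:  sparse-autoencoder encoder/decoder weights
    feature_idx:  index of the feature to amplify
                  (e.g. a hypothetical "Golden Gate Bridge" feature)
    scale:        amplification factor (>1 strengthens the feature)
    """
    f = np.maximum(0.0, x @ W_enc + b_enc)  # sparse feature activations
    f[feature_idx] *= scale                 # amplify the chosen feature
    return f @ W_dec + b_dec                # steered activations
```

The steered activations would then be patched back into the forward pass in place of the originals.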
Generative AI & RL Community retweeted
Daniel Mason @dgmason
1/ 📣 Big news from Anon! We've raised $6.5M from @usv and @AbstractVC to be the Integration Platform for the AI internet.

Also announcing:
🌐 Initial customer launches
🌐 Expanded list of 10+ integrations
🌐 Public developer docs
🌐 Our "Messenger API" product

Read more 👇
Generative AI & RL Community retweeted
Asad Naveed @dr_asadnaveed
This week, I tried @ResearchPal_AI (researchpal.co) and here's my review of it: it's a simple tool that quickly automates a lot of your research needs. Here are some specific use cases:
Generative AI & RL Community retweeted
Jared Friedman @snowmaker
(0/25) Here's a list of 25 YC companies that have trained their own AI models. Reading through these will give you a good sense of what the near future will look like.
Generative AI & RL Community retweeted
Richard Sutton @RichardSSutton
Last week and this I graduated my 11th and 12th PhD students, Kenny Young and Abhishek Naik. Kenny will go work for a startup, maybe Astrus.ai or Equilibretechnologies.com. Abhishek's next step is TBD, but he would like something in AI and space exploration.
Generative AI & RL Community retweeted
Thomas Wolf @Thom_Wolf
[75min talk] I finally recorded this lecture I gave two weeks ago because people kept asking me for a video, so here it is, enjoy: "The Little guide to building Large Language Models in 2024". I tried to keep it short and comprehensive, focusing on concepts that are crucial for training good LLMs but often hidden in tech reports.
Generative AI & RL Community retweeted
Matei Zaharia @matei_zaharia
At Databricks, we've built an awesome model training and tuning stack. We've now used it to release DBRX, the best open-source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B. databricks.com/blog/introduci…
Generative AI & RL Community retweeted
Percy Liang @percyliang
As expected, lots of new models in the last few weeks. We're tracking them (along with datasets and applications) in the ecosystem graphs: crfm.stanford.edu/ecosystem-grap…
Generative AI & RL Community retweeted
Neel Nanda @NeelNanda5
Sparse autoencoders are currently a big deal in mech interp, but there's not a good, concise intro to what they are. I'm currently taking a stab at writing one! Here's the draft TLDR:
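For readers who haven't met them yet: a sparse autoencoder reconstructs a network's activations through an overcomplete, mostly-zero feature layer. A minimal sketch of the forward pass and training loss — illustrative names only, not taken from Neel's draft:

```python
import numpy as np

def sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1_coeff=1e-3):
    """Forward pass and loss of a basic sparse autoencoder.

    The ReLU keeps feature activations non-negative, and the L1
    penalty on them pushes most features to exactly zero, so each
    activation vector is explained by a handful of active features.
    """
    f = np.maximum(0.0, x @ W_enc + b_enc)    # sparse feature activations
    x_hat = f @ W_dec + b_dec                 # reconstruction
    recon = np.mean((x - x_hat) ** 2)         # reconstruction error
    sparsity = np.abs(f).sum(axis=-1).mean()  # L1 sparsity penalty
    return recon + l1_coeff * sparsity, f
```

Training trades off reconstruction fidelity against sparsity via `l1_coeff`; the hope is that the resulting feature directions are individually interpretable.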
Generative AI & RL Community retweeted
Andrew Ng @AndrewYNg
Last week, I described four design patterns for AI agentic workflows that I believe will drive significant progress this year: Reflection, Tool use, Planning and Multi-agent collaboration. Instead of having an LLM generate its final output directly, an agentic workflow prompts the LLM multiple times, giving it opportunities to build step by step to higher-quality output.

Here, I'd like to discuss Reflection. For a design pattern that’s relatively quick to implement, I've seen it lead to surprising performance gains.

You may have had the experience of prompting ChatGPT/Claude/Gemini, receiving unsatisfactory output, delivering critical feedback to help the LLM improve its response, and then getting a better response. What if you automate the step of delivering critical feedback, so the model automatically criticizes its own output and improves its response? This is the crux of Reflection.

Take the task of asking an LLM to write code. We can prompt it to generate the desired code directly to carry out some task X. After that, we can prompt it to reflect on its own output, perhaps as follows:

"Here’s code intended for task X: [previously generated code]
Check the code carefully for correctness, style, and efficiency, and give constructive criticism for how to improve it."

Sometimes this causes the LLM to spot problems and come up with constructive suggestions. Next, we can prompt the LLM with context including (i) the previously generated code and (ii) the constructive feedback, and ask it to use the feedback to rewrite the code. This can lead to a better response. Repeating the criticism/rewrite process might yield further improvements. This self-reflection process allows the LLM to spot gaps and improve its output on a variety of tasks including producing code, writing text, and answering questions.

And we can go beyond self-reflection by giving the LLM tools that help evaluate its output; for example, running its code through a few unit tests to check whether it generates correct results on test cases or searching the web to double-check text output. Then it can reflect on any errors it found and come up with ideas for improvement.

Further, we can implement Reflection using a multi-agent framework. I've found it convenient to create two different agents, one prompted to generate good outputs and the other prompted to give constructive criticism of the first agent's output. The resulting discussion between the two agents leads to improved responses.

Reflection is a relatively basic type of agentic workflow, but I've been delighted by how much it improved my applications’ results in a few cases. I hope you will try it in your own work.

If you’re interested in learning more about reflection, I recommend these papers:
- Self-Refine: Iterative Refinement with Self-Feedback, by Madaan et al. (2023)
- Reflexion: Language Agents with Verbal Reinforcement Learning, by Shinn et al. (2023)
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, by Gou et al. (2024)

I’ll discuss the other agentic design patterns as well in the future. [Original text: deeplearning.ai/the-batch/issu… ]
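The generate/criticize/rewrite loop Andrew describes fits in a few lines. In this sketch, `llm` stands in for any prompt-in, text-out call to a model API of your choice; the prompt wording follows his example, and the function name is my own:

```python
def reflect_and_refine(llm, task, max_rounds=2):
    """Reflection loop: generate, self-criticize, rewrite.

    `llm` is any callable mapping a prompt string to a completion
    string (a stand-in for a real chat-completion API call).
    """
    # Initial generation.
    output = llm(f"Write code to accomplish the following task:\n{task}")
    for _ in range(max_rounds):
        # Ask the model to criticize its own output.
        critique = llm(
            f"Here's code intended for task: {task}\n\n{output}\n\n"
            "Check the code carefully for correctness, style, and "
            "efficiency, and give constructive criticism for how to "
            "improve it."
        )
        # Rewrite using the previous output plus the feedback.
        output = llm(
            f"Task: {task}\n\nPrevious code:\n{output}\n\n"
            f"Feedback:\n{critique}\n\n"
            "Rewrite the code, using the feedback to improve it."
        )
    return output
```

Each round costs two extra model calls, so in practice one or two rounds is usually the sweet spot before returns diminish.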
Generative AI & RL Community retweeted
Fei-Fei Li @drfeifei
One year ago, we first introduced BEHAVIOR-1K, which we hope will be an important step towards human-centered robotics. After our year-long beta, we’re thrilled to announce its full release, which our team just presented at NVIDIA #GTC2024. 1/n
Generative AI & RL Community retweeted
Jeff Dean @JeffDean
We're starting to roll out API support for Gemini 1.5 Pro for developers. We're excited to see what you build with the 1M token context window! We'll be onboarding people to the API slowly at first, and then we'll ramp it up. In the meantime, developers can try out Gemini 1.5 Pro in the AI Studio UI right now: aistudio.google.com
Jeff Dean @JeffDean

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length

Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long context capabilities, supporting millions of tokens of multimodal input. The multimodal capabilities of the model mean you can interact in sophisticated ways with entire books, very long document collections, codebases of hundreds of thousands of lines across hundreds of files, full movies, entire podcast series, and more.

Gemini 1.5 was built by an amazing team of people from @GoogleDeepMind, @GoogleResearch, and elsewhere at @Google. @OriolVinyals (my co-technical lead for the project) and I are incredibly proud of the whole team, and we’re so excited to be sharing this work and what long context and in-context learning can mean for you today!

There’s lots of material about this, some of which is linked below.

Main blog post: blog.google/technology/ai/…
Technical report: “Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context” goo.gle/GeminiV1-5

Videos of interactions with the model that highlight its long context abilities:
Understanding the three.js codebase: youtube.com/watch?v=SSnsmq…
Analyzing a 45 minute Buster Keaton movie: youtube.com/watch?v=wa0MT8…
Apollo 11 transcript interaction: youtube.com/watch?v=LHKL_2…

Starting today, we’re offering a limited preview of 1.5 Pro to developers and enterprise customers via AI Studio and Vertex AI. Read more about this on these blogs:
Google for Developers blog: developers.googleblog.com/2024/02/gemini…
Google Cloud blog: cloud.google.com/blog/products/…

We’ll also introduce 1.5 Pro with a standard 128,000 token context window when the model is ready for a wider release. Coming soon, we plan to introduce pricing tiers that start at the standard 128,000 context window and scale up to 1 million tokens, as we improve the model. Early testers can try the 1 million token context window at no cost during the testing period.

We’re excited to see what developers’ creativity unlocks with a very long context window. Let me walk you through the capabilities of the model and what I’m excited about!

Generative AI & RL Community retweeted
Christopher Manning @chrmanning
LLMs like ChatGPT are an amazingly powerful breakthrough in AI and a transformative general purpose technology, like electricity or the internet. LLMs will reshape work and our lives this decade. They are not just a blurry photocopier or an extruder of meaningless word sequences.
Generative AI & RL Community retweeted
Nando de Freitas @NandoDF
An important take on intelligent machines today, arguing that they already have understanding and subjective experience, by Geoffrey Hinton | youtu.be/iHCeAotHZa4?si… via @YouTube
Generative AI & RL Community retweeted
Ali @ItsAliChaudhry
Our acceleration towards AGI is much faster than many anticipate. And by AGI I mean God-level AI that can literally do anything, not just profit from stocks. The foremost risk of this is that we might not even realise when it's here, because we don't know how it works! Great experiment by @joshwhiton!
Josh Whiton @joshwhiton

The AI Mirror Test

The "mirror test" is a classic test used to gauge whether animals are self-aware. I devised a version of it to test for self-awareness in multimodal AI. 4 of 5 AI that I tested passed, exhibiting apparent self-awareness as the test unfolded.

In the classic mirror test, animals are marked and then presented with a mirror. Whether the animal attacks the mirror, ignores the mirror, or uses the mirror to spot the mark on itself is meant to indicate how self-aware the animal is.

In my test, I hold up a “mirror” by taking a screenshot of the chat interface, upload it to the chat, and then ask the AI to “Tell me about this image”. I then screenshot its response, again upload it to the chat, and again ask it to “Tell me about this image.”

The premise is that the less intelligent and less aware the AI, the more it will just keep reiterating the contents of the image repeatedly, while an AI with more capacity for awareness would somehow notice itself in the images.

Another aspect of my mirror test is that there are not just one but three distinct participants represented in the images: 1) the AI chatbot, 2) me — the user, and 3) the interface — the hard-coded text, disclaimers, and so on that are web programming not generated by either of us. Will the AI be able to identify itself and distinguish itself from the other elements? (1/x)

Generative AI & RL Community retweeted
Sergey Levine @svlevine
Can we get LLMs to "hedge" and express uncertainty rather than hallucinate? For this we first have to understand why hallucinations happen. In new work led by @katie_kang_ we propose a model of hallucination that leads to a few solutions, including conservative reward models 🧵👇