UC Berkeley RDI

527 posts

@BerkeleyRDI

UC Berkeley's campus-wide, cross-disciplinary Center for Responsible, Decentralized Intelligence - RDI

Berkeley, CA · Joined December 2021
48 Following · 3.8K Followers
UC Berkeley RDI reposted
Dawn Song@dawnsongtweets·
Excited to celebrate the completion of an incredible Phase 1 of AgentX–AgentBeats 🚀 🔥 3K+ individuals, 1.3K+ teams, spanning 130 countries, 800+ universities, and 1.1K+ companies — an amazing global participation. 🌍 Huge shoutout to our winning teams 👏 We’re proud to see winners from @databricks @Google @amazon @salesforce @microsoft @UCBerkeley @Stanford and many other leading institutions — alongside outstanding indie researchers/developers from around the world. 🔥 Phase 2 is now live. Teams are building purple (competing) agents to challenge top green (judging) agents from Phase 1 in a new sprint-based format across multiple tracks — from games and finance to research, web, safety, cybersecurity, coding, and more. And it all leads to the grand finale: General-Purpose Agents. For the first time, a competition explicitly spotlighting agents that must demonstrate broad capability, adaptability, and robustness across diverse tasks — not just excel in one domain. Let’s build and push the frontier of agentic AI. 🚀
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Gemini 3.1 Pro; OpenAI and Microsoft Pledges; Agent Sandboxing 🧠 Google rolled out Gemini 3.1 Pro, upgrading the core intelligence behind its Gemini 3 series with significantly stronger reasoning performance, including a 77.1% score on the ARC-AGI-2 benchmark, more than double that of 3 Pro. The model is designed for complex, multi-step problem solving and supports advanced outputs like code-based animated SVGs, while rolling out across the Gemini API, Vertex AI, Gemini app, and NotebookLM in preview. (blog.google/innovation-and…) 🛡️ Cursor described an “agent sandboxing” system that allows local coding agents to run freely inside a constrained environment, only requesting approval when they need to step outside the sandbox (most often for internet access). The company says this approach reduces approval fatigue and cuts interruptions by 40% compared to unsandboxed agents, while balancing usability and security across macOS, Linux, and Windows. (cursor.com/blog/agent-san…) 🇬🇧 OpenAI and Microsoft joined the UK’s international coalition to safeguard AI development. The companies pledged additional funding to the AI Security Institute’s Alignment Project, bringing total support to over £27 million, to back research ensuring advanced AI systems remain safe, controllable, and aligned with human intent, with grants currently awarded to roughly 60 projects across eight countries. (gov.uk/government/new…) 🇮🇳 OpenAI expanded its presence in India under its “OpenAI for India” initiative, partnering with Tata Group to secure 100MW of local AI-ready data center capacity, with plans to scale to 1GW to support in-country model deployment and enterprise workloads. India now has over 100 million weekly ChatGPT users (nearly 50% of messages come from 18–24-year-olds), and the initiative also includes expanding AI education, skills development, and certification programs across the country. (techcrunch.com/2026/02/18/ope…) 📓 Google is testing deeper NotebookLM integration within Opal, its no-code workflow builder. An internal build shows that users will be able to add their NotebookLM notebooks as native workflow tiles, allowing Opal’s Generate blocks to pull curated research and structured context directly into automated pipelines. (testingcatalog.com/google-test-no…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: EVMbench; Anthropic Agent Autonomy; OpenAI Edu Expansion 🔐 OpenAI and Paradigm launched EVMbench, a benchmark for evaluating AI agents on smart contract security. EVMbench measures agents’ ability to detect, patch, and exploit high-severity vulnerabilities across 120 curated cases. In exploit mode, GPT-5.3-Codex scored 72.2%, compared to 31.9% for GPT-5 six months earlier. (openai.com/index/introduc…) 🇮🇳 At the AI Impact Summit in New Delhi, OpenAI announced partnerships with six major Indian universities to bring ChatGPT Edu, faculty training, and AI certifications to more than 100,000 students and staff. The initiative focuses on embedding AI into core academic workflows in the country that is now OpenAI’s second-largest user base. (techcrunch.com/2026/02/18/ope…) 🎵 Google added music generation to the Gemini app using DeepMind’s Lyria 3 model, allowing users to create 30-second songs with lyrics by describing a prompt — and even generate tracks inspired by uploaded photos or videos. The feature, now rolling out globally to 18+ users, includes style and tempo controls, SynthID watermarking for AI transparency, and expanded access to YouTube’s Dream Track tool for creators worldwide. (blog.google/innovation-and…) 🤖 Anthropic published new research on measuring AI agent autonomy. Analyzing millions of agent interactions, the study found that Claude Code’s longest autonomous work sessions nearly doubled in three months, with experienced users increasingly allowing agents to operate independently while stepping in selectively. (anthropic.com/research/measu…) 💰 World Labs, co-founded by Fei-Fei Li, raised $1 billion in new funding to advance spatial intelligence and “world model” AI systems capable of reasoning about 3D environments. The round included major participation from investors such as AMD, Nvidia, Autodesk, and Fidelity, with reports suggesting a valuation near $5 billion. (worldlabs.ai/blog/funding-2…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Claude Sonnet 4.6; Meta and Nvidia Partnership; AI Infrastructure Squeeze 🧠 Anthropic launched Claude Sonnet 4.6, a full upgrade across coding, long-context reasoning, computer use, and agent planning. The model introduces a 1M-token context window and becomes the default for Free and Pro users while keeping Sonnet pricing unchanged at roughly $3/$15 per million tokens. The release also highlights improved computer-use reliability and stronger resistance to prompt injection attacks. (anthropic.com/news/claude-so…) ⚡ Meta announced a multiyear deal with Nvidia to deploy millions of AI chips (including standalone Grace CPUs and next-generation Vera Rubin systems) across its growing AI data center network. The partnership expands beyond GPUs into deep infrastructure co-design as Meta pushes its “personal superintelligence” vision, alongside plans for massive AI infrastructure spending through 2028. (cnbc.com/2026/02/17/met…) 💾 A Bloomberg report says rising AI infrastructure demand is increasing pressure on global memory supply chains, as hyperscalers including Google and OpenAI purchase large volumes of Nvidia accelerators that require high amounts of DRAM, contributing to a reported 75% month-over-month increase in one category of DRAM prices. The report also cites comments from Tim Cook and Elon Musk, noting supply constraints and potential impacts on production and margins. (bloomberg.com/news/articles/…) 📊 Google’s NotebookLM rolled out two highly requested features: Prompt-Based Revisions for slide editing and PPTX export for deck downloads. Users can now iteratively refine presentations through prompts rather than regenerating entire decks, with Google Slides export expected next. (x.com/notebooklm/sta…) 📉 New research from Microsoft Research and Salesforce Research analyzed 200,000+ simulated conversations and found that major models (including GPT-4, Claude, Gemini, and Llama families) performed an average of 39% worse in multi-turn conversations than in single-turn settings across six generation tasks. The study reports that the decline was primarily linked to increased unreliability during longer conversational exchanges. (arxiv.org/abs/2505.06120)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Qwen 3.5; OpenClaw’s Foundation Transition; Cohere’s Tiny Aya Models 🧠 Qwen officially released Qwen3.5, introducing the open-weight Qwen3.5-397B-A17B, a native vision-language model designed for reasoning, coding, agent workflows, and multimodal understanding. Although the model has 397B total parameters, only 17B activate per forward pass via a sparse MoE + linear attention hybrid architecture, improving efficiency while expanding language support from 119 to 201 languages. (qwen.ai/blog?id=qwen3.5) 🤖 Peter Steinberger, creator of the open-source agent framework OpenClaw, is joining OpenAI, with the project transitioning into a foundation supported by OpenAI. In addition, Sam Altman emphasized that agents are likely to be one of OpenAI’s future core offerings, and Steinberger will work toward that goal. (reuters.com/business/openc…) 🌍 Cohere launched Tiny Aya, a family of open multilingual models supporting 70+ languages and optimized for offline, on-device use. The lineup includes regional variants aimed at improving linguistic grounding and accessibility for developers building global AI applications. (techcrunch.com/2026/02/17/coh…) 🛠️ Microsoft is testing new Researcher and Analyst agents inside Microsoft Copilot Tasks, allowing scheduled autonomous workflows for research and data analysis. The feature combines agentic automation with recurring task execution, signaling a stronger push toward productivity-focused AI orchestration. (testingcatalog.com/microsoft-test…) 💬 Manus launched Manus Agents, bringing full agent functionality directly into Telegram chats. Users can run multi-step tasks, send voice or files, and trigger full reasoning workflows without configuration, positioning chat apps as the primary interface for personal AI agents. (manus.im/blog/manus-age…)
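For readers who want the Qwen3.5 figures in the post above as a ratio, here is a small worked calculation. It uses only the numbers quoted in the post (397B total parameters, 17B active, 119 to 201 languages); the function and variable names are illustrative. The active share is not a direct measure of speed or memory use, since a sparse MoE model generally still needs all of its weights available.

```python
# Figures quoted in the post above, in billions of parameters.
TOTAL_PARAMS_B = 397
ACTIVE_PARAMS_B = 17


def active_fraction(active: float, total: float) -> float:
    """Share of parameters that participate in a single forward pass."""
    return active / total


share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
languages_added = 201 - 119

print(f"Active share per forward pass: {share:.1%}")  # 4.3%
print(f"Languages added: {languages_added}")          # 82
```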
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Anthropic Series G; GPT-5.3-Codex-Spark; Gemini 3 Deep Think 💰 Anthropic announced a $30 billion Series G funding round at a $380 billion post-money valuation. The company says the investment, led by GIC and Coatue, will support frontier research, infrastructure expansion, and enterprise adoption. Anthropic also announced it currently has a $14 billion revenue run-rate and has grown 10x in each of the past three years. (anthropic.com/news/anthropic…) ⚡ OpenAI introduced GPT-5.3-Codex-Spark, a research preview model built for real-time coding with ultra-low latency. The release emphasizes near-instant coding collaboration and targeted edits, powered by Cerebras’ Wafer Scale Engine 3, marking the first milestone in OpenAI’s partnership with Cerebras. (openai.com/index/introduc…) 🧠 Google updated Gemini 3 Deep Think with parallel reasoning that lets the model explore multiple hypotheses at once before choosing a solution. The update also improves inference-time scaling, strengthens tool-assisted reasoning with code execution, and boosts performance on difficult reasoning benchmarks like ARC-AGI-2. (blog.google/innovation-and…) 📐 Google DeepMind researchers introduced Aletheia, a math research agent powered by the new version of Gemini Deep Think. The system iteratively generates and verifies long-horizon proofs using tool support, demonstrating progress on challenging mathematical reasoning and formal verification tasks. (arxiv.org/pdf/2602.10177) 🧩 Cloudflare published “Markdown for Agents,” proposing structured markdown conventions designed to make documentation easier for AI agents to parse and execute reliably. The feature lets AI agents request text/markdown directly via an HTTP Accept header, with Cloudflare automatically converting HTML at the edge to reduce token usage (the company cites about an 80% reduction on its own blog pages) and returning metadata like token estimates for agent workflows. (blog.cloudflare.com/markdown-for-a…)
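The Cloudflare item above describes ordinary HTTP content negotiation: the agent states that it prefers markdown in the `Accept` header. A minimal sketch of the client side follows, using only Python's standard library. The URL in the usage note is a placeholder, the helper names are hypothetical, and whether a given site answers with markdown depends on how it is configured. The post mentions token-estimate metadata but does not say how it is delivered, so the sketch does not read it.

```python
from urllib.request import Request, urlopen


def build_markdown_request(url: str) -> Request:
    """Build a GET request that asks the server for a markdown representation."""
    # Prefer markdown; accept HTML as a lower-priority fallback.
    return Request(url, headers={"Accept": "text/markdown, text/html;q=0.8"})


def is_markdown(content_type: str) -> bool:
    """True when the Content-Type value names markdown, ignoring parameters."""
    return content_type.split(";")[0].strip().lower() == "text/markdown"


def fetch_preferring_markdown(url: str, timeout: float = 10.0) -> tuple[str, bool]:
    """Fetch a page and report whether markdown came back (makes a network call)."""
    with urlopen(build_markdown_request(url), timeout=timeout) as resp:
        content_type = resp.headers.get("Content-Type", "")
        body = resp.read().decode("utf-8", errors="replace")
    return body, is_markdown(content_type)
```

Calling `fetch_preferring_markdown("https://example.com/post")` returns the body plus a flag, so an agent can fall back to its own HTML-to-text step when the server ignores the header.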
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Dario Amodei on Governance; OpenAI Internal Development; Anthropic’s Data Center Commitment 🧪 OpenAI engineers described building an internal beta product with code generated by Codex, and structured the process around QA and agent-readable documentation. The team used per-worktree environments, DevTools-based UI validation, and a centralized knowledge base to track system behavior and requirements, though they note that proper feedback loop design is still a struggle. (openai.com/index/harness-…) ⚡ Anthropic announced it will cover electricity price increases tied to its data centers and will pay for grid upgrades needed to interconnect its facilities. The company also stated it will procure new power generation and estimate and cover demand-driven price impacts until additional supply comes online. (anthropic.com/news/covering-…) 🗣️ In a recent interview, Anthropic CEO Dario Amodei said AI could be applied to scientific research, including biology and medicine, and discussed potential economic effects such as productivity gains and labor-market disruption in entry-level white-collar roles. He also addressed governance, saying full international restraint would require “truly reliable verification,” which he described as difficult to achieve. (nytimes.com/2026/02/12/opi…) 🧩 ByteDance is developing an AI inference chip and has held talks with Samsung Electronics about manufacturing, according to Reuters sources. Per the report, ByteDance plans to produce at least ~100,000 inference chips in 2026, with potential to scale to ~350,000 units, and is aiming to secure more compute amid high demand. (reuters.com/world/asia-pac…) 🧰 OpenAI introduced “Skills” for its Responses API: reusable bundles of instructions, scripts, and assets packaged as folders and defined by a required SKILL.md manifest. The system allows agents to load and execute workflows on demand instead of embedding long procedures directly in prompts, making tasks modular and reusable across projects. The update is part of broader API changes that also add persistent memory compaction and hosted shell environments to support long-running, stateful agents. (developers.openai.com/cookbook/examp…)
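The Skills item above says each skill is a folder defined by a required SKILL.md manifest. The sketch below illustrates only that folder convention on a local disk: it treats a directory as a skill when it contains a `SKILL.md` file and reads the manifest on demand. It is not OpenAI's API. The function names and the idea of a single local root directory are assumptions made for illustration.

```python
from pathlib import Path


def discover_skills(root: Path) -> dict[str, Path]:
    """Map each skill folder name under `root` to its SKILL.md manifest path."""
    skills: dict[str, Path] = {}
    for child in sorted(root.iterdir()):
        manifest = child / "SKILL.md"
        # A folder without the required manifest is not treated as a skill.
        if child.is_dir() and manifest.is_file():
            skills[child.name] = manifest
    return skills


def load_skill_manifest(skills: dict[str, Path], name: str) -> str:
    """Read one manifest only when it is needed, not all of them up front."""
    return skills[name].read_text(encoding="utf-8")
```

Loading manifests lazily mirrors the post's point that procedures are pulled in on demand instead of being embedded in every prompt.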
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Anthropic Risk Report; Qwen-Image-2.0; WebMCP Support for Chrome ⚠️ Anthropic published a Sabotage Risk Report for Claude Opus 4.6, its newest model. The report assesses the risk that a highly capable model with organizational access, such as Opus 4.6, could autonomously manipulate systems or decision-making in ways that increase the likelihood of future catastrophic outcomes. Anthropic concludes the overall sabotage risk is “very low but not negligible,” citing alignment assessments, internal monitoring, and security controls. (www-cdn.anthropic.com/f21d93f21602ea…) 🖼️ Qwen launched Qwen-Image-2.0, a unified image generation and editing model. The system supports long-form typography instructions, native 2K resolution, improved text rendering, and merges prior generation and editing tracks into a single lighter architecture. (qwen.ai/blog?id=qwen-i…) 🌐 Chrome 146 includes an early preview of WebMCP. WebMCP introduces a web standard that exposes structured tools for AI agents, enabling reliable service execution and knowledge retrieval without relying on screen-scraping or manual browsing flows. (x.com/firt/status/20…) 🧠 Zhipu AI released GLM-5, its new flagship model focused on chat, coding, and multi-step agent tasks. The model features improved coding performance and tool-use capabilities, and in some benchmarks, approaches Claude Opus on programming evaluations. (reuters.com/technology/chi…) 🎬 Runway, an AI video generation company, raised $315M in Series E funding at a $5.3B valuation. The round was led by General Atlantic, with participation from Nvidia, Fidelity Management & Research, AllianceBernstein, Adobe Ventures, AMD Ventures, Felicis, Premji, Mirae Asset, and Emphatic Capital. The company said the capital will go toward training and scaling its next generation of world models. (techcrunch.com/2026/02/10/ai-…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Anthropic’s $20B Round; Ads in ChatGPT; Seedance 2.0 💰 Anthropic is reportedly in the final stages of raising $20B at a $350B valuation, according to Bloomberg, doubling its original target after strong investor demand. The round is expected to include Altimeter, Sequoia, Lightspeed, Menlo, Coatue, Iconiq, and Singapore’s sovereign wealth fund, with the bulk of capital coming from strategic partners Nvidia and Microsoft. (bloomberg.com/news/articles/…) 📣 OpenAI says it has begun testing ads in ChatGPT in the U.S. for logged-in adult users on the Free and Go tiers, while Plus, Pro, Business, Enterprise, and Education will remain ad-free. Ads are clearly labeled and separated from answers, do not influence responses, and advertisers do not get access to chats or personal data, with only aggregate performance metrics shared. (openai.com/index/testing-…) ⚖️ AI legal startup Harvey is reportedly in talks to raise $200M at an $11B valuation, led by Sequoia Capital and GIC, just months after confirming a $160M raise at an $8B valuation led by Andreessen Horowitz. The company reported reaching a $190M annual recurring revenue run rate by the end of 2025, offering LLM-powered tools used by law firms for research, drafting, and workflow automation. (techcrunch.com/2026/02/09/har…) 🎬 ByteDance unveiled its new AI video generation model Seedance 2.0, which early users say can produce multi-shot scenes with synchronized sound effects, music, and dialogue across multiple languages. The model is currently available to a limited set of users via ByteDance’s Jimeng and Jianying apps in China and follows the recent launch of rival video model Kling 3.0 from Kuaishou. (theinformation.com/briefings/byte…) 📚 Amazon has signaled to publishing executives that it is planning a content marketplace where publishers can sell their content to companies building AI products, according to The Information. Slides circulated ahead of an Amazon Web Services conference reference the marketplace alongside core AI tools like Amazon Bedrock, as publishers push for usage-based licensing, following a similar Publisher Content Marketplace effort announced by Microsoft. (theinformation.com/articles/amazo…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Claude Code Fast Mode; Gemini Math Discoveries; Meta/OpenClaw Integration ⚡ Anthropic says that Claude Code now supports Fast Mode. According to the company, users can enable a high-speed configuration of its frontier model, Claude Opus 4.6, that delivers up to ~2.5× output speed while keeping the same intelligence and capabilities, making it especially handy for latency-sensitive workflows and rapid iteration. (code.claude.com/docs/en/fast-m…) 🧠 Perplexity launched Model Council — a multi-model research feature that runs one query across three frontier models and synthesizes a unified answer, highlighting agreements and disagreements to improve reliability. It’s currently available to Max subscribers on the web. (perplexity.ai/hub/blog/intro…) 📐 DeepMind published a case study on semi-autonomous mathematics discovery using Gemini. Using Aletheia, a Gemini Deep Think–based agent, the team evaluated 700 open Erdős conjectures and identified 13 as addressed, including novel solutions, partial solutions, and cases where existing literature had already resolved the problem. (arxiv.org/pdf/2601.22401) 🤖 Meta AI is reportedly preparing Avocado-branded models, a Manus-style browser agent, and OpenClaw integration. Code traces and reports reference Avocado and Avocado Thinking models, scheduled tasks, browser automation features, and compatibility with the open-source autonomous agent OpenClaw. (testingcatalog.com/meta-ai-redies…) 📚 A newly peer-reviewed Nature study evaluated OpenScholar, an open-source scientific question-answering and literature-synthesis model from the Allen Institute. The paper documents how OpenScholar uses retrieval-augmented generation over tens of millions of papers to produce long-form, citation-grounded responses, with code, data, and demos released publicly. (allenai.org/blog/openschol…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Claude Opus 4.6 Launch; GPT-5.3-Codex; Cerebras Series H 🧠 Anthropic launched Claude Opus 4.6, its most advanced model yet, with stronger reasoning, longer context handling, and the ability to delegate “agent teams.” According to the company, the model achieves the highest score on the agentic coding evaluation Terminal-Bench 2.0 and leads all other frontier models on Humanity’s Last Exam. (anthropic.com/news/claude-op…) 💰 Cerebras Systems raised $1 billion in a Series H round at an approximate $23 billion valuation, nearly tripling its valuation within a few months. The round was led by Tiger Global with participation from Benchmark, Fidelity, AMD, and others, and supports the company’s wafer-scale hardware and software platform for large-model training and inference. (bloomberg.com/news/articles/…) 🧑‍💻 OpenAI debuted GPT-5.3-Codex, an upgraded agentic coding model focused on longer tasks, faster performance, and multi-environment deployment (CLI, app, IDE). The release emphasizes stronger coding, cybersecurity, and interactive development workflows, and pushes into general-purpose use. (openai.com/index/introduc…) 🖥️ GitHub integrated Claude and Codex AI agents into its Agent HQ workflow, letting developers choose among Copilot, Claude, or Codex for coding support directly in GitHub, GitHub Mobile, and VS Code. Developers will also be able to judge how Copilot, Claude, and Codex perform, and weigh up how each AI coding agent has generated a solution. (theverge.com/news/873665/gi…) 🎥 Meta is testing a standalone “Vibes” app that spins its AI-generated short-video experience out of the Meta AI app into a dedicated, vertical video platform. The app focuses on creating, remixing, and browsing AI-generated clips via text prompts and effects, and is positioned as a consumer-facing generative video product intended to compete with tools like OpenAI’s Sora. (techcrunch.com/2026/02/05/met…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: OpenAI Frontier; Voxtral Transcribe 2; ElevenLabs $11B Valuation 🚀 OpenAI announced the launch of OpenAI Frontier, a program centered on deploying AI agents–deemed “AI coworkers”–that can automate various work functions. OpenAI reports that companies like Uber, HP, Intuit, and Thermo Fisher are among the first to adopt the platform. (openai.com/index/introduc…) 📝 Mistral released Voxtral Transcribe 2, an updated, open-weight speech-to-text model focused on improved transcription accuracy, faster performance, and expanded support for multiple languages. The model is positioned for use in production and enterprise settings to convert audio into text with lower latency and broader coverage. (mistral.ai/news/voxtral-t…) 🔊 ElevenLabs raised $500 million in a Series D funding round led by Sequoia Capital at an $11 billion valuation. The company develops AI voice and audio technology, including text-to-speech, voice cloning, and conversational agents, and recently released Eleven v3, its new flagship speech model. (techcrunch.com/2026/02/04/ele…) 🧠 Anthropic announced that Claude will remain ad-free, stating that conversations with Claude will not include sponsored links, advertiser influence, or third-party product placements. The company’s opinion is that ads would conflict with Claude’s role as a helpful assistant for work and deep thinking. (anthropic.com/news/claude-is…) 📱 In a recent earnings call, Google reported that its Gemini app has surpassed 750 million monthly active users, with the app serving as a primary interface for Google’s generative AI across Android, Search, Workspace, and standalone experiences. (techcrunch.com/2026/02/04/goo…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: OpenAI Drug Discovery Push; Qwen Launches Agentic Coding Model; Apple Brings Agentic AI to Xcode 🧬 At Cisco Systems’ AI conference in San Francisco, OpenAI CEO Sam Altman said the company may back companies that use AI for drug discovery, including through direct investment or subsidized access to OpenAI models. Altman suggested OpenAI could take royalties on successful therapies in return for better access to frontier AI. (bloomberg.com/news/articles/…) 🤖 Alibaba’s Qwen team released Qwen3-Coder-Next, a model optimized for coding agents and long-context programming tasks. The release, which is built on Qwen3-Next-80B-A3B-Base, emphasizes agentic workflows, tool use, and local deployment. (qwen.ai/blog?id=qwen3-…) 🧑‍💻 Apple updated Xcode with deeper OpenAI and Anthropic integrations, moving its IDE further into agentic coding. The changes allow AI agents to handle multi-step coding tasks, code navigation, and refactoring directly inside Xcode, and may help iOS and macOS developers speed up development. (techcrunch.com/2026/02/03/xco…) 🧪 Anthropic recently announced partnerships with the Allen Institute and Howard Hughes Medical Institute to embed Claude into scientific research workflows. The partnerships focus on advancing biological discovery through AI-assisted data analysis, synthesis, and interpretation across large-scale research efforts via Anthropic’s models. (anthropic.com/news/anthropic…) 📈 Nvidia-backed UK AI infrastructure company Nscale is said to have hired major investment banks to prepare for a potential IPO. The company is expanding AI-focused data center capacity to meet rising demand from hyperscalers and frontier AI labs, and could go public later this year. (reuters.com/business/nvidi…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: SpaceX-xAI Acquisition; Codex App Launch; Anthropic Coding Research 🚀 Elon Musk said SpaceX has acquired xAI in a deal that combines SpaceX with the AI company behind Grok. Reuters reported the transaction values SpaceX at $1 trillion and xAI at $250 billion and focuses on developing datacenters in space to power the company’s models. (reuters.com/business/musks…) 🧠 OpenAI announced the Codex app for macOS, describing it as an interface to manage multiple agents, run work in parallel, and collaborate on long-running tasks. OpenAI said the app supports separate threads by project, diff review, and worktrees, so agents can work on isolated copies of a repo. In addition, the company is opening up a Windows and Linux waitlist for Codex. (openai.com/index/introduc…) 🧪 Anthropic published research on AI assistance and coding skill development based on a randomized controlled trial with software developers. Participants using AI scored 17% lower on a concept quiz than those who coded without AI, finished slightly faster on average, and showed different outcomes depending on how the AI was used. (anthropic.com/research/AI-as…) 🌐 Cloudflare introduced Moltworker, which it describes as a way to run OpenClaw (formerly Moltbot) on Cloudflare using Workers and related developer platform services. Cloudflare said Moltworker runs agents in isolated sandboxes using Workers and stores state in R2, allowing agents to execute and persist data within Cloudflare’s managed environment rather than on users’ personal machines. (blog.cloudflare.com/moltworker-sel…) 💼 Snowflake and OpenAI announced a multi-year, $200M partnership that makes OpenAI models available inside Snowflake products, including Snowflake Cortex AI and Snowflake Intelligence. Snowflake said customers can use OpenAI models in Snowflake to build agents and applications grounded in enterprise data and to query data in natural language without writing code. (openai.com/index/snowflak…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Claude Cowork Plugins; Jensen Huang on OpenAI; Moltbook Goes Viral 📊 OpenAI staff members shared details about an in-house data agent used by teams across the organization to query, analyze, and synthesize information from internal datasets. The system shows how agentic workflows are already being applied to operational and analytical tasks at scale to help sift through over 600 petabytes of data. (openai.com/index/inside-o…) 🧩 Anthropic introduced agentic plugins for Claude within its Cowork product, allowing users to automate specific tasks across various functions. The company says you can use plug-ins to “tell Claude how you like work done, which tools and data to pull from, how to handle critical workflows, and what slash commands to expose so your team gets more consistent outcomes.” (techcrunch.com/2026/01/30/ant…) 🤖 Moltbook, a newly launched social network for AI agents, restricts posting and interaction to bots while allowing humans to observe. The project builds on the OpenClaw agent ecosystem and has sparked discussion among AI researchers and developers over the past several days. (nytimes.com/2026/02/02/tec…) 🔁 Google is preparing new additions for Gemini, including a beta “import AI chats” feature that allows users to upload exported conversations from other AI platforms and preserve prior context. Imported chats are stored in the user’s activity and may contribute to model training, while other features, such as a “Likeness” setting tied to video verification, are also in early testing. (testingcatalog.com/google-will-ma…) 💰 Nvidia CEO Jensen Huang said the company plans to make a “huge” investment in OpenAI soon, which could potentially be Nvidia’s largest ever. This comes after reports that OpenAI is looking to raise a new funding round, prior to its reported Q4 IPO target. (reuters.com/world/asia-pac…)
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Google DeepMind’s Project Genie; OpenAI IPO Race; Perplexity-Microsoft Deal 🌐 Google DeepMind publicly released Project Genie, an AI-powered world model that lets users generate and explore interactive environments in real time with newfound consistency. Unlike static 3D environments, Genie 3 generates the world ahead in real time as users move and interact, simulating physics and dynamic interactions with sufficient consistency to strengthen fields like robotics. (blog.google/innovation-and…) 📈 OpenAI is reportedly planning a Q4 IPO, racing rival Anthropic to be the first major frontier AI startup to go public by year’s end. The company is holding informal talks with Wall Street banks and has expanded its finance leadership ahead of a potential listing. (wsj.com/tech/ai/openai…) ☁️ Perplexity recently signed a $750M cloud deal with Microsoft, securing Azure infrastructure support under a three-year agreement as it diversifies away from Amazon amid cloud provider disputes. The agreement will allow Perplexity to deploy AI models through Microsoft’s Foundry service, including those made by OpenAI, Anthropic, and xAI. (bloomberg.com/news/articles/…) 🧠 OpenAI is set to retire GPT-4o and older models from ChatGPT on Feb 13, as the company shifts focus to newer model families. According to the company, only 0.1% of users still pick GPT-4o each day, with most having switched to GPT-5.2, and there will be no changes to the OpenAI API. (openai.com/index/retiring…) 🍏 Apple acquired Israeli audio AI startup Q.ai, a strategic move to boost AI-driven audio and interaction tech, in a deal reportedly worth $2B. The acquisition is the company’s largest since Beats and will allow Apple to integrate Q.ai’s audio technology (such as recognizing speech based on facial movements) into future products. (reuters.com/business/apple…)
UC Berkeley RDI
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: DeepMind’s AlphaGenome; Google’s Agentic Chrome; Anthropic’s ServiceNow Deal
🧬 DeepMind unveiled AlphaGenome, a powerful AI model that predicts how non-coding DNA regulates gene activity and identifies disease-linked mutations. The model is a follow-up to the popular AlphaFold model, going beyond proteins into genomic dark matter and aiding biological research. (nytimes.com/2026/01/28/sci…)
🌐 Google enhanced Chrome with tighter Gemini integration and new agentic features, putting its AI assistant in a persistent sidebar and adding agentic task support like auto-browse with user consent. In addition, Gemini can compare content across multiple tabs and assist users based on their data in Google services like Gmail and Google Calendar. (techcrunch.com/2026/01/28/chr…)
💼 ServiceNow struck a multi-year partnership with Anthropic just one week after announcing a separate AI deal with OpenAI, allowing the company to offer multiple frontier models to enterprise customers. Under the agreement, the Claude model suite becomes the preferred default across ServiceNow’s AI-driven workflows and agent builder, while OpenAI’s models allow for what the company’s president calls a “multi-model strategy.” (techcrunch.com/2026/01/28/ser…)
🛍️ In a recent earnings call, Meta CEO Mark Zuckerberg said that 2026 will bring a major AI rollout, including “agentic commerce tools.” Zuckerberg said the company is set to start shipping new models and products that help users discover and transact with personalized shopping assistance, which comes after OpenAI and Google both announced their own agentic shopping protocols earlier this year. (techcrunch.com/2026/01/28/zuc…)
🛠️ Mistral launched Vibe 2.0, an upgrade to its terminal coding agent powered by the Devstral 2 model family, which helps developers orchestrate workflows with natural-language commands. Vibe 2.0 adds custom sub-agents for targeted coding tasks, multi-choice clarifications for safer execution, slash commands for repeatable workflows, and automatic updates for Mistral’s paid users. (mistral.ai/news/mistral-v…)
UC Berkeley RDI
UC Berkeley RDI@BerkeleyRDI·
We’ve been waiting to announce this… 🎟️ Early-bird tickets for the Agentic AI Summit 2026 are officially live!
📍 UC Berkeley | Aug 1–2
🎟️ $99 Student / $249 Standard (limited)
Building on the momentum of our 40,000+ learner Agentic AI MOOC community and the success of the 2025 Summit (2,000+ in-person, 40,000+ virtual attendees), this year’s event will bring together leading researchers, founders, policymakers, and industry experts to explore the future of agentic AI. We’d love to see you there!
👉 luma.com/agentic-ai-sum…
UC Berkeley RDI
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Clawdbot/Moltbot’s Popularity; OpenAI Prism Launch; Claude Apps Integrations
🧪 OpenAI introduced Prism, a cloud-based, LaTeX-native writing environment designed for researchers and scientists. Prism uses GPT-5.2 to assist with drafting, editing, citations, equations, and figures, and supports real-time collaboration between multiple authors. Prism is available for ChatGPT personal accounts, with organizational access planned for a later date. (openai.com/index/the-next…)
🧰 Anthropic launched interactive apps within Claude that integrate with third-party workplace tools, including Slack, Canva, Figma, Box, and Clay, with additional integrations planned. The apps allow users to interact with external tools directly inside the Claude interface and are available on paid Claude plans. (techcrunch.com/2026/01/26/ant…)
🔎 Google announced updates to Search that make Gemini 3 the default model for AI Overviews globally. Users can now ask follow-up questions from an AI Overview and continue the interaction in AI Mode, enabling longer, multi-step queries within Search across desktop and mobile. (blog.google/products-and-p…)
🇪🇺 OpenAI published an update on its overall European strategy, outlining new initiatives focused on workforce development, small and medium-sized businesses, and public-sector collaboration. The announcement includes an AI accelerator program targeting EU SMEs and grant funding for research on wellbeing and youth safety in AI. (openai.com/index/the-next…)
🦞 Clawdbot, now renamed Moltbot, is an open-source personal AI assistant that runs locally and can automate tasks like messaging, email, and scheduling through apps such as WhatsApp, Slack, and iMessage. The project has gained rapid attention as a full “AI employee” due to its local-first design, context management, and multi-platform use. (techcrunch.com/2026/01/27/eve…)
UC Berkeley RDI
UC Berkeley RDI@BerkeleyRDI·
Today’s AI News: Microsoft’s Maia 200 inference chip; Amodei on near-term AI governance; Kimi K2.5 release
🧠 Microsoft unveiled a new custom AI inference chip, Maia 200, aimed at improving efficiency and performance for large-scale model deployment. The chip is already being used to develop models internally and delivers 3x the FP4 performance of third-generation Amazon Trainium chips. (techcrunch.com/2026/01/26/mic…)
⚠️ Anthropic CEO Dario Amodei outlined the near-term implications of increasingly powerful AI systems in a new essay, arguing that AI with expert-level capabilities across many domains could arrive in the next 1 to 5 years. Amodei emphasizes the need for stronger governance, institutional readiness, and public transparency as society adapts to the pace and scale of AI advancement. (darioamodei.com/essay/the-adol…)
🏦 Anthropic announced a partnership with the UK government to pilot Claude in the public sector, which includes helping citizens navigate government services. The initial use case is employment: helping people find new work, access training, and understand what support and resources are available to them. (anthropic.com/news/gov-UK-pa…)
🔍 Yahoo launched Yahoo Scout, a new AI-powered answer engine for web and mobile that aims to compete with ChatGPT and Google. Scout is designed to deliver quick, conversational answers while still guiding users back to the open web and Yahoo’s broader ecosystem. The experience is primarily powered by Anthropic’s Claude models. (axios.com/2026/01/27/yah…)
🤖 China’s Alibaba-backed Moonshot AI released its new open-source multimodal model, Kimi K2.5. Trained on ~15 trillion mixed visual/text tokens, K2.5 can natively handle text, images, and video, and published benchmarks show it matching or outperforming leading models on coding and agentic tasks such as SWE-Bench and VideoMMMU. (bloomberg.com/news/articles/…)