Arturoo

749 posts

@MrPlombaa

Joined August 2024
39 Following, 77 Followers
Arturoo retweeted
Zach @CryptoZachLA
Solid stream today. Great turnout, love, and support. Appreciate it. Next stream is next Tuesday 🫶
11
4
53
3.2K
ALX 🇺🇸 @alx
In retrospect, it was inevitable. “Let that sink in.” - @ElonMusk
4.6K
5.7K
57.5K
9.8M
ALX 🇺🇸 @alx
When you’re somewhere important but just posted a banger and want to see how it’s performing 🤣
158
414
6.2K
166.7K
Bryan Callen @bryancallen
Not sure what’s going on in NYC but it looks like hundreds of people are closing in on the fights wearing Dana White masks?
111
64
1.9K
1.3M
Elon Musk @elonmusk
Grok Voice is #1!
Artificial Analysis @ArtificialAnlys

Announcing agentic performance benchmarking for Speech to Speech models on Artificial Analysis. We use 𝜏-Voice to measure tool calling and customer interaction voice agent capabilities in realistic customer service scenarios.

Even the strongest Speech to Speech (S2S) models today resolve only about half of realistic customer service scenarios end-to-end - a meaningful gap relative to frontier text-based agents on the same tasks. Voice channels introduce significant complexity: challenging accents, background noise, and packet loss, all while requiring fast responses, consistency across long multi-turn conversations, and reliable tool use. Performance also varies considerably by audio condition: in clean audio some models perform notably better, but realistic conditions continue to pose a challenge. Conversation duration also varies meaningfully across models, with implications for both customer experience and operational cost.

About 𝜏-Voice: Our Agentic Performance benchmark is based on 𝜏-Voice (Ray, Dhandhania, Barres & Narasimhan, 2026), which extends 𝜏²-bench into the voice modality to evaluate S2S models on realistic customer service tasks. It measures multi-turn instruction following, support of a simulated customer through a complete interaction, and tool use against simulated customer service systems. The simulated user combines an LLM-driven decision model with realistic audio synthesis: diverse accents, background noise, and packet loss modelled on real network conditions. This complements our Big Bench Audio benchmark measuring intelligence and Conversational Dynamics (Full Duplex Bench subset) benchmark measuring conversational naturalness. Scores are the average of three independent pass@1 trials.

We evaluate under realistic audio conditions using the 𝜏²-bench base task split across three domains:
➤ Airline (50 scenarios): e.g., changing a flight, rebooking under policy constraints
➤ Retail (114 scenarios): e.g., disputing a charge, processing a return
➤ Telecom (114 scenarios): e.g., resolving a billing issue, troubleshooting a service problem

Task success is determined by deterministic checks against expected actions and final database state, consistent with the 𝜏²-bench evaluator.

Key results: xAI's Grok Voice Think Fast 1.0 is the clear leader at 52.1%, averaging 5.6 minutes per conversation, the second-longest overall. OpenAI's GPT-Realtime-2 (High) (39.8%, 3.0 min) and GPT-Realtime-1.5 (38.8%, 4.8 min) follow, with Gemini 3.1 Flash Live Preview - High close behind at 37.7% (3.8 min).

Speech to Speech is a fast evolving modality and we expect movement in rankings as we continue to add new models with these capabilities, and model robustness improves.

Congratulations @xAI @elonmusk! See below for further detail ⬇️

2.4K
5.6K
25.4K
8.4M
Joe Rogan Podcast News @joeroganhq
Elon Musk: "The reason we are seeing this extreme amount of hatred and violence is because we are actually succeeding in getting rid of corruption and waste. If we weren't succeeding in getting rid of corruption and waste they wouldn't care."
124
856
6.2K
63.7K
Arturoo retweeted
John @CryptoGodJohn
$MAGA reclaiming significant levels
Seeing huge wallets accumulate - some of the biggest names & voices in the world all starting to join the MAGA army
This is bringing back 2021 community coin vibes
Alien meta has just started
43
45
226
13.2K
Make Aliens Great Again @MAGA_Aliens
🚨 JUST IN: The Pentagon has reportedly released new Alien/UFO footage, this time captured over the Middle East. What are they hiding from us?
21
16
115
6.8K