Elon X Chat ✪
908 posts

@rizkisiddiq
🚀 SpaceX • CEO • CTO 🚘| Tesla • CEO and Product architect 🚄| Hyperloop • Founder 🧩| OpenAI • Co-founder 👇



𝕏 ✅ open-source algorithm. YouTube ❌ Facebook ❌ Instagram ❌ TikTok ❌ Reddit ❌ Threads ❌ Why do other social networks not make their algorithms open source?

This is how the algorithm can completely destroy your reach overnight. This is the last: Left: 3 months. Right: 2 weeks. A super consistent 85-95% drop on all metrics, all after a viral post went ballistic. I tried everything: cooling down, deleting low-quality posts, blocking bot accounts. I kept posting after the cool-down, but nothing really breaks through. Short hot takes 🛑 Long form with good signal 🛑 Viral potential post 🛑 Core audience value post 🛑 What bothers me here is that 48h after posting a mega viral post I get suppressed back to the Stone Age. This follows previous situations I’ve had with the Grok-powered algorithm, where it feels like tweepCred falls far below a certain level and you’re locked into a low-reach prison where every effort to break out makes it harder and harder to do so. I’m asking for transparency on what we can do as content creators when this happens. I don’t want to spam my way out of this. I’d like to know whether I did something wrong and how I can address it, and to take responsibility for the algorithmic suppression for whatever length it lasts. But this limbo is most likely going to make me leave the platform.


#2 across all new releases in Canada.

An early beta of Grok Build, an agentic CLI for coding, building apps, and automating workflows, is now available for SuperGrok Heavy subscribers. Through this early beta, we will improve the model and product based on your feedback. Try it at x.ai/cli




Starship’s twelfth flight test will debut the next generation Starship and Super Heavy vehicles, powered by the next evolution of the Raptor engine and launching from a newly designed pad at Starbase. The launch is targeted as early as Tuesday, May 19 → spacex.com/launches/stars…


Announcing agentic performance benchmarking for Speech to Speech models on Artificial Analysis. We use 𝜏-Voice to measure tool calling and customer interaction voice agent capabilities in realistic customer service scenarios.
Even the strongest Speech to Speech (S2S) models today resolve only about half of realistic customer service scenarios end-to-end - a meaningful gap relative to frontier text-based agents on the same tasks. Voice channels introduce significant complexity: challenging accents, background noise, and packet loss, all while requiring fast responses, consistency across long multi-turn conversations, and reliable tool use. Performance also varies considerably by audio condition: in clean audio some models perform notably better, but realistic conditions continue to pose a challenge. Conversation duration also varies meaningfully across models, with implications for both customer experience and operational cost.
About 𝜏-Voice: Our Agentic Performance benchmark is based on 𝜏-Voice (Ray, Dhandhania, Barres & Narasimhan, 2026), which extends 𝜏²-bench into the voice modality to evaluate S2S models on realistic customer service tasks. It measures multi-turn instruction following, support of a simulated customer through a complete interaction, and tool use against simulated customer service systems. The simulated user combines an LLM-driven decision model with realistic audio synthesis: diverse accents, background noise, and packet loss modelled on real network conditions. This complements our Big Bench Audio benchmark measuring intelligence and Conversational Dynamics (Full Duplex Bench subset) benchmark measuring conversational naturalness. Scores are the average of three independent pass@1 trials.
We evaluate under realistic audio conditions using the 𝜏²-bench base task split across three domains:
➤ Airline (50 scenarios): e.g., changing a flight, rebooking under policy constraints
➤ Retail (114 scenarios): e.g., disputing a charge, processing a return
➤ Telecom (114 scenarios): e.g., resolving a billing issue, troubleshooting a service problem
Task success is determined by deterministic checks against expected actions and final database state, consistent with the 𝜏²-bench evaluator.
Key results: xAI's Grok Voice Think Fast 1.0 is the clear leader at 52.1%, averaging 5.6 minutes per conversation, the second-longest overall. OpenAI's GPT-Realtime-2 (High) (39.8%, 3.0 min) and GPT-Realtime-1.5 (38.8%, 4.8 min) follow, with Gemini 3.1 Flash Live Preview - High close behind at 37.7% (3.8 min).
Speech to Speech is a fast-evolving modality and we expect movement in rankings as we continue to add new models with these capabilities, and model robustness improves. Congratulations @xAI @elonmusk! See below for further detail ⬇️
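For readers unfamiliar with the scoring described in the post above, here is a rough Python sketch of what "the average of three independent pass@1 trials" could look like. It is an illustration only, not the benchmark's actual code: the function names are hypothetical, the pass/fail outcomes are invented, and it assumes all scenarios are pooled into one list rather than averaged per domain, which the post does not specify.

```python
from statistics import mean

# Scenario counts per domain, as stated in the post.
SCENARIOS = {"airline": 50, "retail": 114, "telecom": 114}


def pass_at_1(outcomes):
    """Fraction of scenarios resolved when each is attempted once (hypothetical helper)."""
    return sum(outcomes) / len(outcomes)


def benchmark_score(trials):
    """Average pass@1 across independent trials, as a percentage (hypothetical helper)."""
    return 100 * mean(pass_at_1(trial) for trial in trials)


# Made-up outcomes for a toy run of 4 scenarios, repeated in 3 trials.
# True means the deterministic end-state check passed for that scenario.
toy_trials = [
    [True, False, True, False],   # pass@1 = 0.50
    [True, True, False, False],   # pass@1 = 0.50
    [True, False, False, False],  # pass@1 = 0.25
]

total_scenarios = sum(SCENARIOS.values())
score = benchmark_score(toy_trials)
print(f"scenarios per trial in the real benchmark: {total_scenarios}")
print(f"toy score: {score:.1f}%")
```

In the real benchmark each trial would cover all 278 scenarios rather than the four toy ones used here.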