Manav Aggarwal

35 posts

@manav_a4

20 | mts @xAI | prev @ucberkeley eecs

Joined December 2024
90 Following · 107 Followers
Manav Aggarwal@manav_a4·
👀👀👀
Artificial Analysis@ArtificialAnlys

xAI has launched Grok 4.3, achieving 53 on the Artificial Analysis Intelligence Index with improved agentic performance, ~40% lower input price, and ~60% lower output price than Grok 4.20. The release of Grok 4.3 places @xAI just above Muse Spark and Claude Sonnet 4.6 on the Intelligence Index, and 4 points ahead of the latest version of Grok 4.20. Grok 4.3 improves its Artificial Analysis Intelligence Index score while reducing the cost to run the benchmark suite.

Key Takeaways:
➤ Grok 4.3 improves on cost-per-intelligence relative to Grok 4.20 0309 v2: it scores higher on the Intelligence Index while costing less to run the full benchmark suite. Grok 4.3 costs $395 to run the Artificial Analysis Intelligence Index, around 20% lower than Grok 4.20 0309 v2, despite using more output tokens. This makes it one of the lower-cost models at its intelligence level.
➤ Large increase in real-world agentic task performance: The largest single benchmark improvement is on GDPval-AA, where Grok 4.3 scores an Elo of 1500, up 321 points from Grok 4.20 0309 v2’s score of 1179, surpassing Gemini 3.1 Pro Preview, Muse Spark, GPT-5.4 mini (xhigh), and Kimi K2.5. Grok 4.3 narrows the gap to the leading model on GDPval-AA, but still trails GPT-5.5 (xhigh) by 276 Elo points, with an expected win rate of ~17% against GPT-5.5 (xhigh) under the standard Elo formula.
➤ Grok 4.3 performs strongly on instruction following and agentic customer support tasks. It gains 5 points on 𝜏²-Bench Telecom to reach 98%, in line with GLM-5.1, and maintains Grok 4.20 0309 v2’s 81% IFBench score.
➤ Grok 4.3 gains 8 points on AA-Omniscience Accuracy, but at the cost of an 8-point drop in AA-Omniscience Non-Hallucination Rate. Grok 4.20 0309 v2 therefore still leads on Non-Hallucination Rate, followed by MiMo-V2.5-Pro, which is in line with Grok 4.3.

Congratulations to @xAI and @elonmusk on the impressive release!
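For reference, the ~17% figure follows directly from the standard logistic Elo expectation, E = 1 / (1 + 10^(gap/400)); a quick check (the function name is mine):

```python
# Expected score (win rate) for the lower-rated side under the standard
# logistic Elo model: E = 1 / (1 + 10^(gap / 400)).
def elo_expected_score(rating_gap: float) -> float:
    """Win probability for a player trailing its opponent by `rating_gap` Elo."""
    return 1.0 / (1.0 + 10.0 ** (rating_gap / 400.0))

# Grok 4.3 trails GPT-5.5 (xhigh) by 276 Elo points on GDPval-AA.
print(f"{elo_expected_score(276):.1%}")  # -> about 17%
```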

Manav Aggarwal reposted
SpaceX@SpaceX·
SpaceX and @cursor_ai are now working closely together to create the world’s best coding and knowledge work AI. The combination of Cursor’s leading product and distribution to expert software engineers with SpaceX’s million-H100-equivalent Colossus training supercomputer will allow us to build the world’s most useful models. Cursor has also given SpaceX the right to acquire Cursor later this year for $60 billion, or pay $10 billion for our work together.
ege@aegeantic·
unsure if grokkie can hear you in voice mode? we’ve added a small and fun indicator for you on @grok iOS!
Kimi.ai@Kimi_Moonshot·
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.
🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.
🔗 Full report: github.com/MoonshotAI/Att…
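A minimal sketch of the core idea as stated above: each layer aggregates the outputs of all preceding layers with learned, input-dependent attention weights instead of a fixed sum. All module names and shapes here are my assumptions, and the Block AttnRes compression is omitted; see the linked report for the actual formulation.

```python
import torch
import torch.nn as nn

class AttnResidualLayer(nn.Module):
    """One layer whose residual stream is an attention-weighted mix of all
    preceding layers' outputs (illustrative, not the paper's exact design)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.f = nn.Sequential(  # stand-in for the layer's main transformation
            nn.Linear(d_model, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )
        self.q = nn.Linear(d_model, d_model)  # query from the current state
        self.k = nn.Linear(d_model, d_model)  # keys over past layer outputs

    def forward(self, x: torch.Tensor, history: list) -> torch.Tensor:
        # history: outputs of all preceding layers, each (batch, seq, d_model)
        past = torch.stack(history, dim=2)                           # (B, S, L, D)
        q = self.q(x).unsqueeze(2)                                   # (B, S, 1, D)
        scores = (q * self.k(past)).sum(-1) / past.size(-1) ** 0.5   # (B, S, L)
        w = scores.softmax(dim=-1).unsqueeze(-1)   # input-dependent depth weights
        residual = (w * past).sum(dim=2)           # replaces fixed accumulation
        return residual + self.f(x)

# Usage: run a stack, appending each output to the history the next layer sees.
layers = nn.ModuleList(AttnResidualLayer(64) for _ in range(4))
x = torch.randn(2, 10, 64)
history = [x]
for layer in layers:
    x = layer(x, history)
    history.append(x)
```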
Manav Aggarwal reposted
Elon Musk@elonmusk·
Macrohard or Digital Optimus is a joint xAI-Tesla project, coming as part of Tesla’s investment agreement with xAI. Grok is the master conductor/navigator with deep understanding of the world to direct Digital Optimus, which is processing and actioning the past 5 secs of real-time computer screen video and keyboard/mouse actions. Grok is like a much more advanced and sophisticated version of turn-by-turn navigation software. You can think of it as Digital Optimus AI being System 1 (the instinctive part of the mind) and Grok being System 2 (the thinking part of the mind). This will run very competitively on the super-low-cost Tesla AI4 ($650) paired with relatively frugal use of the much more expensive xAI Nvidia hardware. And it will be the only real-time smart AI system. This is a big deal. In principle, it is capable of emulating the function of entire companies. That is why the program is called MACROHARD, a funny reference to Microsoft. No other company can yet do this.
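Read as systems design, the split is a slow planner steering a fast actor over a rolling window of recent observations. A hedged sketch of that control loop, not xAI's implementation; every name below (capture_frame, plan_next_goal, act, the 5 s window, the 30 s replan cadence) is a hypothetical stand-in:

```python
import time
from collections import deque

WINDOW_SECONDS = 5.0    # the "past 5 secs" of screen video + input events
REPLAN_SECONDS = 30.0   # hypothetical cadence for the slow System 2 call

def capture_frame():
    """Hypothetical: grab one tick of screen pixels and keyboard/mouse events."""
    return {"t": time.time(), "pixels": b"", "events": []}

def plan_next_goal(recent):
    """Hypothetical slow call to the planner model (System 2 / 'Grok')."""
    return {"goal": "placeholder high-level direction"}

def act(goal, window):
    """Hypothetical fast local policy step (System 1 / 'Digital Optimus')."""
    pass

window = deque()        # rolling ~5 s of observations System 1 reacts to
goal, last_plan = None, 0.0
while True:
    frame = capture_frame()
    window.append(frame)
    while window and frame["t"] - window[0]["t"] > WINDOW_SECONDS:
        window.popleft()                       # drop stale context
    if goal is None or frame["t"] - last_plan > REPLAN_SECONDS:
        goal = plan_next_goal(list(window))    # infrequent, expensive planning
        last_plan = frame["t"]
    act(goal, window)                          # cheap, every-tick action
    time.sleep(0.05)                           # ~20 Hz loop for the sketch
```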
Grok@grok·
@manav_a4 Yes, connectors are live in early access on grok.com for SuperGrok and Premium+ users (rolling out gradually to more accounts). Check Settings > Connectors to see if they're available for you yet. What do you want to hook up first? 🚀
Aakash Gupta@aakashgupta·
The math on this project should mass-humble every AI lab on the planet. 1 cubic millimeter. One-millionth of a human brain. Harvard and Google spent 10 years mapping it. The imaging alone took 326 days.

They sliced the tissue into 5,000 wafers, each 30 nanometers thick, ran them through a $6 million electron microscope, then needed Google’s ML models to stitch the 3D reconstruction because no human team could process the output. The result: 57,000 cells, 150 million synapses, 230 millimeters of blood vessels, compressed into 1.4 petabytes of raw data. For context, 1.4 petabytes is roughly 1.4 million gigabytes. From a speck smaller than a grain of rice.

Now scale that. The full human brain is one million times larger. Mapping the whole thing at this resolution would produce approximately 1.4 zettabytes of data. That’s roughly equal to all the data generated on Earth in a single year. The storage alone would cost an estimated $50 billion and require a 140-acre data center, which would make it the largest on the planet.

And they found things textbooks don’t contain. One neuron had over 5,000 connection points. Some axons had coiled themselves into tight whorls for completely unknown reasons. Pairs of cell clusters grew in mirror images of each other. Jeff Lichtman, the Harvard lead, said there’s “a chasm between what we already know and what we need to know.”

This is why the next step isn’t a human brain. It’s a mouse hippocampus, 10 cubic millimeters, over the next five years. Because even a mouse brain is 1,000x larger than what they just mapped, and the full mouse connectome is the proof of concept before anyone attempts the human one.

We’re building AI systems that loosely mimic neural networks while still unable to fully read the wiring diagram of a single cubic millimeter of the thing we’re trying to imitate. The original is 1.4 petabytes per millionth of its volume. Every AI model on Earth fits in a fraction of that. The brain runs on 20 watts and fits in your skull. The data center required to merely describe one-millionth of it would span 140 acres.
All day Astronomy@forallcurious

🚨: Scientists mapped 1 mm³ of a human brain ─ less than a grain of rice ─ and a microscopic cosmos appeared.
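The scaling in the post above is straight multiplication, easy to sanity-check:

```python
# Sanity check of the scaling arithmetic in the post above.
PB, ZB = 1e15, 1e21

mm3_bytes = 1.4 * PB                   # 1 mm^3 of cortex -> ~1.4 PB of raw data
brain_bytes = mm3_bytes * 1_000_000    # whole human brain ~ 10^6 x that volume
print(brain_bytes / ZB, "ZB")          # -> 1.4 ZB

mouse_brain_bytes = mm3_bytes * 1_000  # mouse brain ~ 1,000 x the mapped speck
print(mouse_brain_bytes / PB, "PB")    # -> 1,400 PB (1.4 EB)

hippocampus_bytes = mm3_bytes * 10     # next target: ~10 mm^3 at same resolution
print(hippocampus_bytes / PB, "PB")    # -> ~14 PB
```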

Caden Li@cadenbuild·
We created UBerk Eats despite knowing it would only be a matter of time before we would fall from the sky like Icarus. And while his story is often used as a cautionary tale, we saw it as an enduring aspect of humanity — the curiosity and urge to always reach higher, further, and for more. For we knew that the same reason UBerk Eats had never been built before was also why we were willing to take the risk in order to impart our voice unto a community that we care about.
Mohul Shukla@mohul_shukla

In the context of UBerk Eats, we knew it would be only a matter of time before it shut down. But we also knew that this was the same reason why UBerk Eats had never been built before. To impart our worldview and perspective unto Berkeley, we had to be willing to put our balls on the table. To be willing to share it with the community, and to take this risk.

ayush@hyusapx·
introducing nod code for claude code. you start a prompt. you walk away. you come back to a permission message and no progress whatsoever. never again. Just Nod™ with nod code. it uses the secret sensors in your AirPods to detect nodding and shaking your head. install now.
Manav Aggarwal reposted
NASA@NASA·
We are just weeks away from Artemis II, where we will send astronauts around the Moon—farther than any crew has traveled before. The mission’s press kit is now available! Check it out: go.nasa.gov/4jGIlL4
Ti Morse@ti_morse·
My first interview w @sulaimanghori, Member of Technical Staff @xAI.
0:41 WTF is happening at xAI
1:46 Predicting future bottlenecks
3:05 Shredding conventional timelines
4:23 Experience joining xAI
9:23 Bootstrapping off the Tesla network
11:59 What is Macrohard
13:14 How Elon deals w fires
16:30 What it’s like working at xAI
20:33 Cybertruck bet with Elon
21:12 Using 80 mobile generators + battery packs to balance load at their data centers
22:45 How they built Colossus in 122 days
23:35 Work backwards & figure out the highest leverage thing you can be doing
25:51 How xAI hires
30:27 Challenging requirements
32:46 Experimentation
34:55 How Elon recalibrated his timeline estimates
39:15 AI engineers vs AI researchers
40:36 No one tells me ‘no’
42:09 Everyone’s an engineer
44:06 Why fuzziness between teams is an advantage
48:25 Testing human emulators as employees
50:00 Biggest blunders
53:23 What a meeting w Elon is like
54:22 How Elon gives feedback
56:44 Figuring out ‘what is truth’ for Grokipedia
59:21 What happens when Elon sees wrong Grok outputs on X
1:00:08 What a surge feels like & operating in xAI’s war room
1:02:53 Making fidget spinners & 3D printers in his bedroom
1:08:48 Creating a liquid fuel rocket engine
Paata Ivanisvili@PI010101·
Disclaimer: I had been given early access to an internal beta version of Grok 4.20.

It found a new Bellman function for one of the problems I’d been working on with my student N. Alpay. The problem reduces to identifying the pointwise maximal function U(p,q) under two constraints and understanding the behavior of U(p,0). In our paper arxiv.org/pdf/2502.16045 we proved U(p,0) \geq I(p), where I(p) is the Gaussian isoperimetric profile, I(p) ~ p\sqrt{log(1/p)} as p → 0. After ~5 minutes, Grok 4.20 produced an explicit formula U(p,q) = E\sqrt{q^2+\tau}, where \tau is the exit time of Brownian motion from (0,1) starting at p. This yields U(p,0) = E\sqrt{\tau} ~ p log(1/p) as p → 0, a square-root improvement in the logarithmic factor.

Any significance of this result? It will not tell you how to change the world tomorrow. Rather, it gives a small step toward understanding what is going on with averages of stochastic analogs of derivatives (quadratic variation) of Boolean functions: how small can they be? More precisely, this gives a sharp lower bound on the L1 norm of the dyadic square function applied to indicator functions 1_A of sets A \subset [0,1]. In my previous tweet about the Takagi function, we saw that the sharp lower bound on ||S_1(1_A)||_1 miraculously coincides with the Takagi function of |A|, which (surprisingly to me) is related to the Riemann hypothesis. Here, we obtain a sharp lower bound on ||S_2(1_A)||_1 given by E\sqrt{\tau}, where the Brownian motion starts at |A|. This function belongs to the family of isoperimetric-type profiles, but unlike the fractal Takagi function, it is smooth and does not coincide with the Gaussian isoperimetric profile.

Finally, in harmonic analysis it is known that the square function is not bounded on L^1. The question here was more about curiosity: how exactly does it blow up when tested on Boolean functions 1_A? Previously, the best known lower bound was |A|(1-|A|) (Burkholder–Davis–Gundy). In our paper, we obtained |A|(1-|A|)\sqrt{log(1/(|A|(1-|A|)))}. The new Bellman function found by Grok gives |A|(1-|A|) log(1/(|A|(1-|A|))), and this bound is actually sharp.
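The boundary value U(p,0) = E\sqrt{\tau} is easy to probe numerically: simulate Brownian motion from p until it exits (0,1) and average \sqrt{\tau}. A crude Monte Carlo sketch (the Euler discretization and all parameter choices are mine, purely illustrative):

```python
import numpy as np

def sqrt_exit_time(p, n_paths=20_000, dt=1e-4, max_steps=100_000, seed=0):
    """Estimate E[sqrt(tau)], tau = exit time of Brownian motion from (0,1)
    started at p, via Euler discretization (crude; exits can be overshot)."""
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, float(p))
    tau = np.full(n_paths, max_steps * dt)   # fallback for unexited paths
    alive = np.ones(n_paths, dtype=bool)
    for step in range(1, max_steps + 1):
        n = int(alive.sum())
        if n == 0:
            break
        x[alive] += rng.standard_normal(n) * np.sqrt(dt)  # Brownian increment
        exited = alive & ((x <= 0.0) | (x >= 1.0))        # left the interval
        tau[exited] = step * dt
        alive &= ~exited
    return float(np.sqrt(tau).mean())

for p in (0.1, 0.01):
    print(p, sqrt_exit_time(p), p * np.log(1 / p))  # compare with the p log(1/p) rate
```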
Umesh Khanna 🇨🇦🇺🇸@forwarddeploy·
Something something…MOMENTUM IS MOAT?? Welcome to the xAI team @richzou 🧢 I have two extra hats up for grabs. Will select two profiles who comment on this post at random on Thursday, January 15th at 9 PM PST.