Jacob Loewenstein

1.5K posts


@spatialjlo

ai product partnerships @nvidia | prev head of partnerships @GroqInc | 🎤 sings macy gray at karaoke

New York, NY · Joined March 2015
461 Following · 1.7K Followers
Jacob Loewenstein@spatialjlo·
welcome back to the arena Meta! ⚔️
Artificial Analysis@ArtificialAnlys

Meta is back! Muse Spark scores 52 on the Artificial Analysis Intelligence Index, behind only Gemini 3.1 Pro, GPT-5.4, and Claude Opus 4.6. Muse Spark is Meta's first new release since Llama 4 in April 2025, and also its first release that is not open weights.

Muse Spark is a new model from @Meta evaluated on Artificial Analysis. We were given early access by Meta to independently benchmark the model. It is the first frontier-class model from Meta since Llama 4 Maverick was released in April 2025, and notably the first @AIatMeta model that is not being released as open weights. The release follows Meta's reorganization of its AI efforts under Meta Superintelligence Labs, and signals that Meta is re-entering the frontier race after roughly a year of relative quiet.

For context, Llama 4 Maverick and Scout scored 18 and 13 respectively on the Artificial Analysis Intelligence Index as non-reasoning models at the time of their release, while Muse Spark scores 52. Muse Spark essentially closes the gap to the frontier in a single release. The model is not open source and is not yet accessible via an API, but Meta has said it expects both to come soon. Meta is also integrating Muse Spark into its first-party products, including the Meta AI chat product, Facebook, Instagram, and Threads.

Key takeaways from our benchmarks:
➤ Muse Spark scores 52 on the Artificial Analysis Intelligence Index, placing it within the top 5 models we have benchmarked. It sits ahead of Claude Sonnet 4.6, GLM-5.1, MiniMax-M2.7, and Grok 4.20, and behind Gemini 3.1 Pro Preview, GPT-5.4, and Claude Opus 4.6.
➤ Muse Spark is notably token-efficient for its intelligence level. It used 58M output tokens to run the Intelligence Index, comparable to Gemini 3.1 Pro Preview (57M) and notably lower than Claude Opus 4.6 (Adaptive Reasoning, max effort, 157M), GPT-5.4 (xhigh, 120M), and GLM-5 (110M).
➤ Muse Spark is the second-most capable vision model we have benchmarked. It scores 80.5% on MMMU-Pro, behind only Gemini 3.1 Pro Preview (82.4%).
➤ Muse Spark performs strongly on reasoning and instruction-following evaluations. It scores 39.9% on HLE, trailing only Gemini 3.1 Pro Preview (44.7%) and GPT-5.4 (xhigh, 41.6%). The model also achieved the 5th-highest score on CritPT, an eval focused on difficult physics research questions, with 11% — substantially above Gemini 3 Flash (9%) and Claude 4.6 Sonnet (3%).
➤ Agentic performance does not stand out. On GDPval-AA, our evaluation focused on real-world work tasks, Muse Spark scores 1427, behind both Claude Sonnet 4.6 at 1648 and GPT-5.4 at 1676, but ahead of Gemini 3.1 Pro Preview at 1320. On TerminalBench Hard, Muse Spark trails Claude Sonnet 4.6, GPT-5.4, and Gemini 3.1 Pro. Muse Spark joins others in achieving a high τ²-Bench Telecom score of 92%.

Key model details:
➤ Modalities: multimodal, with text and vision input and text output
➤ License: proprietary; Meta's first frontier model not released as open weights
➤ Availability: no public API at the time of publishing. Meta expects to provide API access soon, and has started integrating the model into its first-party Meta AI offering and into Facebook, Instagram, and Threads.

Jacob Loewenstein@spatialjlo·
this is great. love @AIatMeta staying the course on OSS 💪
Ina Fried@inafried

New @axios Scoop: Meta will open source versions of new models set to be released soon - the first under @alexandr_wang. But open versions won’t be right at launch. Meta wants to remove some proprietary elements and address potential safety risks

Jacob Loewenstein@spatialjlo·
this is so dope
anirudh bv@anirudhbv_ce

I implemented @GoogleResearch's TurboQuant as a CUDA-native compression engine on Blackwell B200. 5x KV cache compression on Qwen 2.5-1.5B, near-lossless attention scores, generating live from compressed memory. 5 custom cuTile CUDA kernels featuring:
- fused attention (with QJL corrections)
- online softmax
- on-chip cache decompression
- pipelined TMA loads
Try it out: devtechjr.github.io/turboquant_cut…
s/o @blelbach and the cuTile team at @nvidia for lending me Blackwell GPU access :) cc @sundeep @GavinSherry
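One of the kernels listed above, online softmax, is the standard single-pass trick behind fused attention: instead of materializing all logits before normalizing, it streams through them while maintaining a running maximum and a rescaled normalizer. This is a minimal Python sketch of that idea only — not the author's cuTile CUDA kernel:

```python
import math

def online_softmax(xs):
    """Numerically stable softmax computed in one streaming pass.

    Maintains a running max `m` and running normalizer `d`; whenever a
    new max appears, the accumulated normalizer is rescaled by
    exp(old_max - new_max) so earlier terms stay consistent.
    """
    m = float("-inf")  # running maximum of inputs seen so far
    d = 0.0            # running sum of exp(x - m)
    for x in xs:
        m_new = max(m, x)
        # Rescale the old accumulator to the new max, then add this term.
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    # Final normalization pass using the settled max and normalizer.
    return [math.exp(x - m) / d for x in xs]
```

The streaming formulation matters on a GPU because it lets the softmax be fused with the attention score computation, avoiding a separate full pass over (and storage of) the logits.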

Gavin@GavinSherry·
The only way to decipher CAPTCHAs these days is with AI
kraken@kraken_9076·
Artemis 2 launch over Teams: "T minus 10...9..." "Mission control, you're muted ... Do you have the right input selected ... Still can't hear you"
sunny madra@sundeep·
Excited to be sharing insights with the @Stanford MS&E 435 class this semester
Apoorv Agrawal@apoorv03

Thrilled to be back on Stanford campus today kicking off MS&E 435: Economics of the AI Supercycle, a seminar unpacking the economics across every layer of the AI stack. Incredibly grateful to an amazing group of speakers dedicating their time for the community: @altcap (Altimeter) @alighodsi (Databricks) @rauchg (Vercel) @sundeep (Groq / NVIDIA) @ChaseLochmiller (Crusoe) @sk7037 (OpenAI) @tuhinone (Baseten) @ypatil125 (Applied Compute) Eric KA (Anthropic) 9 weeks. 9 speakers. 1 question: Where does value accrue in this new supercycle? Join us!

Jacob Loewenstein reposted
David J Phillips@davj·
"Make no mistakes DO NOT HALLUCINATE. YOU ARE AN EXPERT SOFTWARE ENGINEER"
Jacob Loewenstein reposted
TBPN@tbpn·
"Nvidia is positioned perfectly to thrive on the coding agent wave" and the explosion in inference demand, says @firstadopter. "I met with Ian Buck and dozens of engineers at Meta, Google, and Nvidia. All of them are seeing crazy inference demand and AI compute shortages." "People are building bots to pick up any kind of B200 GPUs they can find, but they're waiting weeks or months." "Jensen is very prescient. He probably saw this demand months away. He locked up all the supply agreements for memory, CoWoS, and connectors ahead of time to take advantage of the coding-assistant boom." "It's almost like a gold rush. You see OpenAI pivoting toward it, Anthropic is obviously thriving on it. Billions of ARR every few weeks." "Jensen acquired Groq's assets and people. And the combination of integrating Groq's technology with Vera Rubin lets Nvidia serve this tremendous wave of compute demand economically."
Jacob Loewenstein@spatialjlo·
in this episode kramer tries turning his apartment into a datacenter ⚡️
Ostris@ostrisai

I trained this @ltx_model LTX 2.3 LoRA of George Costanza at home on my 5090 in about a day with AI Toolkit. I generated this 30 second video with @ComfyUI on my 5090 in 6 minutes. Open source is, always has been, and always will be, the future of generative AI. (SOUND ON)

sunny madra@sundeep·
Coding 2026
Greg Burnham@GregHBurnham·
If there's ever a robotics bouldering demo, I'm going to make a joke about AI hitting a wall