RunAnywhere (YC W26)

2

28

9.5K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·4d

At @RunAnywhereAI we just extended MetalRT with S2S support: beating @Apple at their own game once again and delivering the FASTEST Speech-to-Speech engine on Apple Silicon right now, the ONLY truly multimodal inference provider on the market. - 1.68s best latency - 1.52x faster than mlx-audio - 123 tok/s generate We crushed mlx-audio across short, medium, and long audio clips on @liquidai LFM2.5-Audio-1.5B 8-bit quantized on a single M4 Max. Multimodal inference just hit warp speed, full voice-video-text fusion coming soon. #ycombinator #runanywhere #ondeviceai #applesilicon #metalrt #S2S

At @RunAnywhereAI we just extended MetalRT with 👀 support: beating @Apple at their own game once AGAIN and delivering the FASTEST VLM decode engine on the market for Apple Silicon right now. - 279 tok/s vision decode - 1.22× faster than mlx-vlm We crushed mlx-vlm and llama.cpp across every configuration tested on Qwen3-VL-2B-Instruct 4-bit quantized across multiple image resolutions on a single M4 Max. Vision decode just hit warp speed! Video analysis coming soon :) #ycombinator #runanywhere #metalrt #applesilicon #vlm #ondeviceai

English

16

967

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·5d

At @RunAnywhereAI we just made VLM analysis warp-speed easy with MetalRT in RCLI. Grabbed a live $NVDA chart from the web, screenshot, and boom: Qwen3-VL-2B crushes the breakdown on my M4 Max in seconds. Trend spotting, levels, buy signals, all on-device. Vision decode at 279 tok/s changes everything. #ycombinator #runanywhere #ondeviceai #applesilicon #vlm #metalrt

At @RunAnywhereAI we just extended MetalRT with 👀 support: beating @Apple at their own game once AGAIN and delivering the FASTEST VLM decode engine on the market for Apple Silicon right now. - 279 tok/s vision decode - 1.22× faster than mlx-vlm We crushed mlx-vlm and llama.cpp across every configuration tested on Qwen3-VL-2B-Instruct 4-bit quantized across multiple image resolutions on a single M4 Max. Vision decode just hit warp speed! Video analysis coming soon :) #ycombinator #runanywhere #metalrt #applesilicon #vlm #ondeviceai

English

2

3

11

926

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·6d

At @RunAnywhereAI we just extended MetalRT with 👀 support: beating @Apple at their own game once AGAIN and delivering the FASTEST VLM decode engine on the market for Apple Silicon right now. - 279 tok/s vision decode - 1.22× faster than mlx-vlm We crushed mlx-vlm and llama.cpp across every configuration tested on Qwen3-VL-2B-Instruct 4-bit quantized across multiple image resolutions on a single M4 Max. Vision decode just hit warp speed! Video analysis coming soon :) #ycombinator #runanywhere #metalrt #applesilicon #vlm #ondeviceai

In just 48 hours at @RunAnywhereAI we built MetalRT: beating @Apple at their own game and delivering the FASTEST LLM inference engine on the market for Apple Silicon right now. - 570 tok/s decode @liquidai LFM 2.5-1.2B 4-bit - 658 tok/s decode @Alibaba_Qwen Qwen3-0.6B, 4-bit - 6.6 ms time-to-first-token - 1.19× faster than Apple's own MLX (identical model files) - 1.67× faster than llama.cpp on average We crushed Apple MLX, llama.cpp, uzu(by TryMirai), and Ollama across four different 4-bit models, including the on-device optimized LFM2.5-1.2B on a single M4 Max. Excited for this one! #ycombinator #runanywhere #ondeviceai #applesilicon #mlx

English

5

8

36

5.8K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·13 Mar

VLM support just added to RCLI. Your local Mac AI can now see images. Drag and drop any image or press V and it analyzes instantly. All on-device with llama.cpp right now. In the demo video I pointed the camera at my phone showing a photo of @ShubhamMal72313 and me at @ycombinator. Qwen3-VL-2B gave a very accurate description. We're bringing vision support back to MetalRT which will once again be the FASTEST visual language model on Apple Silicon :) Faster than human eyes can perceive and react. PR in comments. All local. All open source. All free. #ycombinator #runanywhere #metalrt #vlm

just shipped Personalities on RCLI: your local voice AI can now be professional, sarcastic, cynical, or a full-blown nerd who references Star Wars in every answer siri could never have a personality crisis running entirely on your GPU all of it still powered by our fastest inference engine: MetalRT by @RunAnywhereAI , check it out! #runanywhere #ondeviceai #sirikiller #metalRT

English

2

3

571

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·12 Mar

just shipped Personalities on RCLI: your local voice AI can now be professional, sarcastic, cynical, or a full-blown nerd who references Star Wars in every answer siri could never have a personality crisis running entirely on your GPU all of it still powered by our fastest inference engine: MetalRT by @RunAnywhereAI , check it out! #runanywhere #ondeviceai #sirikiller #metalRT

Launching RCLI + MetalRT: the FASTEST on-device voice AI for macOS, crushing Apple MLX, MLX Whisper, MLX Audio, llama.cpp, sherpa-onnx, Ollama, uzu & every inference engine out there, built by @RunAnywhereAI - 658 tok/s LLM decode (1.19x > MLX) - 714x real-time STT (4.6x > MLX Whisper) - 8.8x RTF TTS (2.8x > MLX Audio) - 63ms voice-to-audio latency Hybrid RAG: <4ms retrieval (HNSW+BM25+RRF), sub-200ms w/ embedding cache 36 macOS actions: open apps, web search, control Spotify - all local, offline, open-source - accepting PRs for more custom actions. No API keys. Stay tuned, more updates incoming, about to hit WARP SPEED!! #ycombinator #runanywhere #applesilicon #metalRT

English

2

3

17

2.5K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·12 Mar

This has been obvious for a while: local models would ignite hardware competition and finally democratize frontier-level AI. We didn't wait to see it play out. That's the whole reason @RunAnywhereAI exists: seamless, high-performance local inference on any hardware today.

@jason@Jason

The top .1% of users are playing with local LLMS This will 10x every 12 months.... Until Apple, Dell and MSFT have one on your local device BY DEFAULT in 2028 This takes the local hardware spec race from irrelevant for 99% of users to critical. Going to be insane.

English

1

10

878

RunAnywhere (YC W26) retweetet

Shubham Malhotra@ShubhamMal72313·11 Mar

RCLI with MetalRT + llamacpp #runanywhere #yc #metalRT #apple

So close to fulfilling the prophecy!

English

2

7

727

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·11 Mar

So close to fulfilling the prophecy!

English

2

7

1.7K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·10 Mar

Launching RCLI + MetalRT: the FASTEST on-device voice AI for macOS, crushing Apple MLX, MLX Whisper, MLX Audio, llama.cpp, sherpa-onnx, Ollama, uzu & every inference engine out there, built by @RunAnywhereAI - 658 tok/s LLM decode (1.19x > MLX) - 714x real-time STT (4.6x > MLX Whisper) - 8.8x RTF TTS (2.8x > MLX Audio) - 63ms voice-to-audio latency Hybrid RAG: <4ms retrieval (HNSW+BM25+RRF), sub-200ms w/ embedding cache 36 macOS actions: open apps, web search, control Spotify - all local, offline, open-source - accepting PRs for more custom actions. No API keys. Stay tuned, more updates incoming, about to hit WARP SPEED!! #ycombinator #runanywhere #applesilicon #metalRT

We built the future of voice AI on your Mac. RCLI is here @RunAnywhereAI! Our optimized end-to-end voice + RAG pipeline: talk → instant control + doc answers, ~131ms latency, - all LOCAL - all OPEN SOURCE - all FREE. 43 actions, no cloud, your data forever private. Siri: “Let me think about that…” RCLI: 131 ms voice-to-action. Done. Next. Experience it—install & command your machine: curl -fsSL raw.githubusercontent.com/RunanywhereAI/… | bash Next level incoming: MetalRT support (fastest Apple Silicon inference 658 tok/s decode, blazing ASR and TTS). Your Mac's about to hit warp speed! #OnDevice #MetalRT #YCW26 #NoMoreWaiting

English

12

20

146

16.7K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·9 Mar

MetalRT just became the first complete AI inference engine for Apple Silicon: LLM + STT + TTS by @RunAnywhereAI. We already had the fastest LLM decode (658 tok/s). Now we've crushed STT and TTS too, beating MLX across the board. Today's numbers on M4 Max: - 1-hour podcast transcribed in ~5 seconds - 3-hour meeting transcribed in ~15 seconds - Live captioning with zero perceptible delay - 714x faster than real-time for STT - 4.6x faster than Apple's MLX on speech-to-text All three modalities. One unified engine. And this is just the individual components. The full voice AI pipeline we're building on top will be the FASTEST ever on Apple Silicon. Launching soon. Full benchmarks, charts, and details in the comments. #AppleSilicon #OnDeviceAI #MetalRT #STT #TTS #VoiceAI

In just 48 hours at @RunAnywhereAI we built MetalRT: beating @Apple at their own game and delivering the FASTEST LLM inference engine on the market for Apple Silicon right now. - 570 tok/s decode @liquidai LFM 2.5-1.2B 4-bit - 658 tok/s decode @Alibaba_Qwen Qwen3-0.6B, 4-bit - 6.6 ms time-to-first-token - 1.19× faster than Apple's own MLX (identical model files) - 1.67× faster than llama.cpp on average We crushed Apple MLX, llama.cpp, uzu(by TryMirai), and Ollama across four different 4-bit models, including the on-device optimized LFM2.5-1.2B on a single M4 Max. Excited for this one! #ycombinator #runanywhere #ondeviceai #applesilicon #mlx

English

9

15

103

12K

RunAnywhere (YC W26) retweetet

Shubham Malhotra@ShubhamMal72313·8 Mar

Since MetalRT support is coming soon by @RunAnywhereAI (@ycombinator, W'26) In the meantime, here's what's already running fully on-device in RCLI: full voice-to-action + RAG on my own documents. No cloud, no data leaving the machine. Your files stay yours. Running @liquidai LFM models + Piper TTS locally - blazing fast voice. #MetalRT #LocalRAG #YCW26 #OnDeviceAI #runanywhere

We built the future of voice AI on your Mac. RCLI is here @RunAnywhereAI! Our optimized end-to-end voice + RAG pipeline: talk → instant control + doc answers, ~131ms latency, - all LOCAL - all OPEN SOURCE - all FREE. 43 actions, no cloud, your data forever private. Siri: “Let me think about that…” RCLI: 131 ms voice-to-action. Done. Next. Experience it—install & command your machine: curl -fsSL raw.githubusercontent.com/RunanywhereAI/… | bash Next level incoming: MetalRT support (fastest Apple Silicon inference 658 tok/s decode, blazing ASR and TTS). Your Mac's about to hit warp speed! #OnDevice #MetalRT #YCW26 #NoMoreWaiting

English

4

13

1.7K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·8 Mar

We built the future of voice AI on your Mac. RCLI is here @RunAnywhereAI! Our optimized end-to-end voice + RAG pipeline: talk → instant control + doc answers, ~131ms latency, - all LOCAL - all OPEN SOURCE - all FREE. 43 actions, no cloud, your data forever private. Siri: “Let me think about that…” RCLI: 131 ms voice-to-action. Done. Next. Experience it—install & command your machine: curl -fsSL raw.githubusercontent.com/RunanywhereAI/… | bash Next level incoming: MetalRT support (fastest Apple Silicon inference 658 tok/s decode, blazing ASR and TTS). Your Mac's about to hit warp speed! #OnDevice #MetalRT #YCW26 #NoMoreWaiting

GIF

Erick@ErickSky

Esta startup es una seria amenaza para Siri ☠️ Se llama @RunAnywhereAI y acaba de soltar RCLI: un asistente de voz 100% local que ya le gana en velocidad y privacidad. 131 ms end-to-end (voz → respuesta hablada) ⭐️Controla 43 acciones nativas de macOS (Spotify, ventanas, FaceTime, recordatorios…). ⭐️RAG instantáneo en tus PDFs y documentos. ⭐️Todo offline, sin nube, sin API keys. Eso no es todo. Lo que se viene... mamita. El founder acaba de revelar MetalRT (su nuevo motor TTS hecho con Metal y lo que ves en el vídeo) que logra 291 ms para 5 palabras y 8.4x más rápido que el tiempo real. Cuando salga esa actualización… Siri va a llorar. Por mientras, REPOOO 👇

English

7

18

127

31.2K

RunAnywhere (YC W26) retweetet

Erick@ErickSky·7 Mar

Esta startup es una seria amenaza para Siri ☠️ Se llama @RunAnywhereAI y acaba de soltar RCLI: un asistente de voz 100% local que ya le gana en velocidad y privacidad. 131 ms end-to-end (voz → respuesta hablada) ⭐️Controla 43 acciones nativas de macOS (Spotify, ventanas, FaceTime, recordatorios…). ⭐️RAG instantáneo en tus PDFs y documentos. ⭐️Todo offline, sin nube, sin API keys. Eso no es todo. Lo que se viene... mamita. El founder acaba de revelar MetalRT (su nuevo motor TTS hecho con Metal y lo que ves en el vídeo) que logra 291 ms para 5 palabras y 8.4x más rápido que el tiempo real. Cuando salga esa actualización… Siri va a llorar. Por mientras, REPOOO 👇

Español

DailyPapers@HuggingPapers

14

87

21.3K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·7 Mar

You can now use @RunAnywhereAI and switch adapters on your base model and change behavior without loading the entire model again!

Qwen just dropped a 2B multimodal base model with 1M token context on Hugging Face Native 262k context, extensible to 1M tokens for long documents

English

2

8

1.1K

RunAnywhere (YC W26)@RunAnywhereAI·7 Mar

@stellon_labs 😺 🫱🏻‍🫲🏽 @RunAnywhereAI soon

Rohan Joshi@ron_joshi

@sanchitmonga22 @ShubhamMal72313 If you wanna go faster you can always use kittentts v0.8 ;)

English

62

RunAnywhere (YC W26) retweetet

Shubham Malhotra@ShubhamMal72313·7 Mar

On-device AI is going mainstream. 🚀 Good to see more conversations around why the next wave of AI will run on your phone. A big part of what we’re building at @RunAnywhereAI (@ycombinator, W’26). #runanywhere #yc #ycw26 #edge #ai #ycombinator

English

2

4

460

RunAnywhere (YC W26) retweetet

Shubham Malhotra@ShubhamMal72313·7 Mar

Came across a guide someone recently posted 😀 @RunAnywhereAI xugj520.cn/en/archives/ru…

English

Forward Future@ForwardFuture

2

4

386

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·7 Mar

MetalRT delivers the fastest TTS inference on Apple Silicon. Key results on M3 Max: - 291 ms latency for 5 words, 8.4x RTF, 2.8x faster than mlx - Lowest latency recorded: 291 ms - Peak RTF: 8.8x on longer inputs This enables instant-feeling text-to-speech directly on device. FASTEST voice AI pipeline on Apple silicon coming soon powered by @RunAnywhereAI #AppleSilicon #TTS #MetalRT #OnDeviceAI #runanywhere

English

10

31

306

34.9K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·6 Mar

Couldn’t agree more. @yoheinakajima isn’t just an investor, I still remember when our @RunAnywhereAI app was super early and janky, he could see our vision. We created a small baby AGI demo with running @liquidai models on that, and that was the “baby AGI” moment for us.

“The first time we realized agents were possible was watching @yoheinakajima build with Claude and Replit side by side.” @Replit CEO @amasad on the origins of coding agents: “He would prompt Claude Code, paste it into Replit, run it, get an error, copy the error back to Claude, and repeat.” “And watching that process, it became obvious.” “We could just automate the whole loop.” “That’s where agents start.”

English

2

9

1.2K

RunAnywhere (YC W26) retweetet

Sanchit monga@sanchitmonga22·6 Mar

Excited to share that @RunAnywhereAI was featured in International Business Times, Singapore: ibtimes.sg/runanywhere-in… A big reason this mission is personal to me is because growing up, I did not always have access to the best computer at home, and internet was not always available either. Sometimes Wi-Fi felt expensive, and sometimes my parents understandably did not see a reason to pay extra for it. That stayed with me. It shaped how I think about technology, access, and who gets left out when progress depends on constant connectivity and expensive hardware. Now, living in San Francisco, I see so many tools and technologies people take for granted every day. That contrast has only made this mission feel more urgent to me. We’re building RunAnywhere so AI can run directly on devices people already have, making it faster, more private, and far more accessible. Grateful to be building this with the support and community of @ycombinator behind us and ofcourse my cofounder @ShubhamMal72313 We believe the future of AI should not belong only to those with the best cloud infrastructure, but to everyone, everywhere. #ycombinator #ondeviceai #runanywhere #yc

English