Patton Lai
@pattoniumbot
107 posts
New York · Joined December 2023
190 Following · 33 Followers
Patton Lai @pattoniumbot
@teortaxesTex What do you think about MiniMax? Their mainline models are 230B, so quite small, and architecture-wise they're still using last gen attention, but their post-training stack seems quite strong?
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
thinking more about this, I'm too harsh. V2.5 is very fast and multimodal; V2.5 Pro has attention almost on par with DS, and I'd say a nicer style. They seem to have a remarkably low hallucination rate. The ranking is the same but they're closer to the top. Not redundant at all
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
thus far not too impressed by MiMos as a coder or reasoner; maybe it's a better agent. Half a tier below V4s, no matter what AA ranking says. Current, unconfident Chyna ranking: Kimi > Whale (which is cheaper though) > GLM|Xiaomi; don't have a lot of experience with GLM
Patton Lai @pattoniumbot
@ivanfioravanti I use Tailscale + Screens 5; works pretty well. The quality of the screen video feed isn't the best, but it's stable and I'd say it's good enough for most work.
Ivan Fioravanti ᯅ @ivanfioravanti
Apple heavy users out there... what is the best way to connect to multiple Macs via screen sharing with top quality? The native app connects to only 1 device in high quality 😢
Andrew Ng @AndrewYNg
I'm excited about voice as a UI layer for existing visual applications, where speech and screen update together. This goes well beyond voice-only use cases like call center automation. The barrier has been a hard technical tradeoff: low-latency voice models lack reliability, while agentic pipelines (speech-to-text → LLM → text-to-speech) are intelligent but too slow for conversation.

Ashwyn Sharma and team at Vocal Bridge (an AI Fund portfolio company) address this with a dual-agent architecture: a foreground agent for real-time conversation, and a background agent for reasoning, guardrails, and tool calls.

I used Vocal Bridge to add voice to a math-quiz app I'd built for my daughter; this took less than an hour with Claude Code. She speaks her answers, and the app responds verbally and updates the questions and animations on screen.

Only a tiny fraction of developers have ever built a voice app. If you'd like to try building one, check out Vocal Bridge for free: vocalbridgeai.com
Andrew Ng tweet media
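The foreground/background split in that tweet can be sketched with plain asyncio: a fast agent acknowledges immediately while a slower agent finishes the real reasoning. All names and timings below are illustrative assumptions, not the Vocal Bridge API.

```python
import asyncio

async def foreground_agent(utterance: str) -> str:
    # Low-latency path: acknowledge right away (stands in for a fast voice model).
    await asyncio.sleep(0.01)
    return f"Got it, checking: {utterance!r}"

async def background_agent(utterance: str) -> str:
    # Slower path: reasoning, guardrails, tool calls (stands in for an LLM pipeline).
    await asyncio.sleep(0.05)
    return f"Answer for {utterance!r}: 42"

async def handle_turn(utterance: str) -> list[str]:
    # Start the slow reasoning first, reply fast, then deliver the full answer.
    reasoning = asyncio.create_task(background_agent(utterance))
    replies = [await foreground_agent(utterance)]
    replies.append(await reasoning)
    return replies

if __name__ == "__main__":
    for line in asyncio.run(handle_turn("what is 6 times 7?")):
        print(line)
```

The point of the pattern is that the user hears *something* within the foreground agent's latency budget, while the background task's answer arrives a beat later on the same turn.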
Patton Lai @pattoniumbot
@kimmonismus Gonna play the devil's advocate here; the same could be said for Meta in late 2024, when Llama-3.1-405B-Instruct was released, which wasn't far off in capability from GPT-4o. What do you think? Perhaps the difference is that model capabilities saturate "consumer chat" more now?
Chubby♨️ @kimmonismus
Meta's new model could pose a threat to only one company: OpenAI. OpenAI currently has 900 million weekly users, 95% of whom are still in the free tier. It's arguably the best model for the average user, which is why most people use ChatGPT in their daily lives.

With Spark, Meta has now developed a model that is being rolled out free of charge (!) to one billion users and is at least as useful for everyday use as ChatGPT. Let's be honest: 99% of people don't use LLMs for coding or frontier math, but for questions about tax returns, legal issues, brainstorming, or simply for chatting. ChatGPT and Spark are equally well-suited for this. And Meta has the moat that matters: distribution.

If Meta succeeds in introducing Spark to its users, and they realize they now have a model within the Meta ecosystem that addresses their concerns and needs just as effectively as ChatGPT, the consumer market could shift towards Meta. *That* could be dangerous for OpenAI, because the business and enterprise sector sits primarily with Anthropic. OpenAI would like more access to that market too, but is currently still struggling for share. OpenAI is deeply rooted in the consumer market, and that is exactly where Meta can become a real threat. It should set off alarm bells for OpenAI.
Patton Lai @pattoniumbot
@mweinbach Maybe it’s coil whine from the power delivery? I get this on my M5 Max too when it’s pushing 100W+
Max Weinbach @mweinbach
Has anyone else noticed like cracking sounds on their Windows laptops when you push the SoC really hard? I've noticed it on 4 laptops from 3 brands and it's the same sound. Sounds like pops or cracks?
Patton Lai @pattoniumbot
@mweinbach Elevators would become the number one agent killer 😂
Max Weinbach @mweinbach
I was just thinking about this: if Apple does add 5G modems to the M6 MacBook Pro, who needs a Mac Mini as your always-on agent/codex remote device when it can just run for you 24/7 in your backpack? Apple's power draw is low enough that they could actually make this work!
Patton Lai @pattoniumbot
@Anoldoldwooden So I guess MacBooks should start showing "Intel Inside" 😉
Patton Lai @pattoniumbot
@fiveoutofnine `powermetrics` ships with every Mac, and allows you to poll GPU frequency and package power consumption! Perhaps this would be useful.
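`powermetrics` prints human-readable samples whose exact lines vary by chip and macOS version, so anything consuming it has to parse text. A sketch of pulling GPU frequency and power out of one sample (e.g. from `sudo powermetrics --samplers gpu_power,cpu_power -i 1000 -n 1`); the sample text here is illustrative, not guaranteed output.

```python
import re

# Illustrative powermetrics-style sample (real field names vary by machine).
SAMPLE = """\
GPU HW active frequency: 1398 MHz
GPU Power: 18654 mW
Combined Power (CPU + GPU + ANE): 47210 mW
"""

def parse_metrics(text: str) -> dict[str, float]:
    # Grab "<label>: <number> <MHz|mW>" lines; convert mW readings to watts.
    metrics = {}
    for label, value, unit in re.findall(r"^(.+?):\s+([\d.]+)\s*(MHz|mW)$", text, re.M):
        metrics[label] = float(value) if unit == "MHz" else float(value) / 1000.0
    return metrics

if __name__ == "__main__":
    print(parse_metrics(SAMPLE))  # frequency in MHz, power entries in watts
```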
⁵⁄₉ @fiveoutofnine
@pattoniumbot good idea, was gonna do this but couldn't find a reliable way to measure power
⁵⁄₉ @fiveoutofnine
Introducing whatcani.run

Find the best local models based on real data:
1. People run and submit benchmarks
2. Stats aggregated over models / devices
3. Find the best model for you

`npx whatcanirun`, fully open-source
Patton Lai @pattoniumbot
@fiveoutofnine What context lengths does it test at? tps can degrade significantly with longer context. > A benchmark I ran
Patton Lai tweet media
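The tps-vs-context effect in that question has a simple back-of-the-envelope model: during decode, each new token attends over the whole KV cache, so per-token work grows roughly linearly with context length. The constants below are made-up assumptions for illustration, not measurements.

```python
# Assumed constants: fixed per-token cost from weight reads/compute, plus a
# per-context-token attention cost. Both are illustrative, not benchmarked.
T_BASE = 0.010   # seconds per token, context-independent (assumed)
K_ATTN = 2e-7    # extra seconds per token per token of context (assumed)

def tokens_per_second(context_len: int) -> float:
    # Throughput is the reciprocal of per-token latency under this toy model.
    return 1.0 / (T_BASE + K_ATTN * context_len)

if __name__ == "__main__":
    for ctx in (1_000, 32_000, 128_000):
        print(f"{ctx:>7} ctx -> {tokens_per_second(ctx):5.1f} tok/s")
```

Even this toy model shows why benchmarks quoted at short context can overstate long-context throughput severalfold.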
Patton Lai @pattoniumbot
@cherry_cc12 Great work! Tried it out via the Dashscope API, and the tool calling ability seems to be at a similar level to Gemini 3.1 Flash Live!
Chen Cheng @cherry_cc12
Very excited about Qwen3.5-Omni. Native omni-modal, real-time, and the Audio-Visual Vibe Coding demo is genuinely fun. 🚀
Qwen @Alibaba_Qwen

🚀 Qwen3.5-Omni is here! Scaling up to a native omni-modal AGI. Meet the next generation of Qwen, designed for native text, image, audio, and video understanding, with major advances in both intelligence and real-time interaction. A standout feature: 'Audio-Visual Vibe Coding'. Describe your vision to the camera, and Qwen3.5-Omni-Plus instantly builds a functional website or game for you.

Offline Highlights:
🎬 Script-Level Captioning: Generate detailed video scripts with timestamps, scene cuts & speaker mapping.
🏆 SOTA Performance: Outperforms Gemini-3.1 Pro in audio and matches its audio-visual understanding.
🧠 Massive Capacity: Natively handles up to 10h of audio or 400s of 720p video, trained on 100M+ hours of data.
🌍 Global Reach: Recognizes 113 languages (speech) & speaks 36.

Real-time Features:
🎙️ Fine-Grained Voice Control: Adjust emotion, pace, and volume in real-time.
🔍 Built-in Web Search & complex function calling.
👤 Voice Cloning: Customize your AI's voice from a short sample, with engineering rollout coming soon.
💬 Human-like Conversation: Smart turn-taking that understands real intent and ignores noise.

The Qwen3.5-Omni family includes Plus, Flash, and Light variants. Try it out:
Blog: qwen.ai/blog?id=qwen3.…
Realtime Interaction: click the VoiceChat/VideoChat button (bottom-right): chat.qwen.ai
HF-Demo: huggingface.co/spaces/Qwen/Qw…
HF-VoiceOnline-Demo: huggingface.co/spaces/Qwen/Qw…
API-Offline: alibabacloud.com/help/en/model-…
API-Realtime: alibabacloud.com/help/en/model-…

Patton Lai @pattoniumbot
@cherry_cc12 Are there plans to open-source the weights for Qwen3.5-Omni, maybe the Flash variant?
Patton Lai @pattoniumbot
@kaiostephens Perhaps you'd need to vary the training data so it generalizes better across different harnesses; I think MiniMax M2 trained heavily on Claude Code alone, so it performed poorly on other harnesses
Bruno Le Hyaric @bu2twnext
@ivanfioravanti Any sign of 4-bit acceleration like NVFP4/MXFP4? (or I may wait for the M6 😓)
Ivan Fioravanti ᯅ @ivanfioravanti
M5 is really a big jump from an architectural perspective for Apple Silicon! "Accelerate your machine learning workloads with the M5 and A19 GPUs" is a great video showing this!
Ivan Fioravanti ᯅ tweet media
Patton Lai @pattoniumbot
@mweinbach I've been running into the same issue too; not sure why. Maybe it has to do with errors in the chat template? Could also just be the model, but I'd expect Qwen3.5 9B to do better than this. 🤔
Max Weinbach @mweinbach
i love when models forget how to call tools
Max Weinbach tweet media
Patton Lai @pattoniumbot
@mweinbach There was an amazing video on this issue; I'm noticing it on my M5 Max MacBook Pro as well. When running inference, package power sat at 70W, but wall-socket power was at 120W+. youtube.com/watch?v=HKxIGg…
YouTube video
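The 70W-vs-120W gap above is worth quantifying: package power only covers the SoC, so PSU conversion loss, the display, fans, memory/storage and other rails all land in the difference. A quick sanity check on those two figures:

```python
# Figures taken from the tweet above: reported package power vs. wall draw.
package_w = 70.0
wall_w = 120.0

# Everything outside the SoC package: PSU loss, display, fans, other rails.
gap_w = wall_w - package_w
gap_frac = gap_w / wall_w

if __name__ == "__main__":
    print(f"{gap_w:.0f} W ({gap_frac:.0%}) of wall draw is outside reported package power")
```

So on these numbers, roughly 40% of what the machine pulls never shows up in `powermetrics`' package figure.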
Max Weinbach @mweinbach
You know what's kinda wild: I've noticed that the memory controller on some SoCs pulls more power for AI tasks than the GPU or memory itself. Makes sense to some extent tbh
Patton Lai @pattoniumbot
@ivanfioravanti Nice! Do you know if the weights will be open-sourced?
Patton Lai @pattoniumbot
@Zai_org Will the weights for `glm-5-turbo` be released on HuggingFace?