Jordan

550 posts

@JordanDevAi

AI and Tech Focused. I build apps, solutions, and share my thoughts here.

South Florida · Joined April 2025
101 Following · 64 Followers
Pinned Tweet
Jordan
Jordan@JordanDevAi·
AI note-taking app: I install and use AI locally on your devices for offline use. Mindsort.app is the first of many of my AI initiatives!
English
0
0
4
298
Jordan
Jordan@JordanDevAi·
@HowToAI_ Game development is only going to get so much more wild with the tech as time goes on. I'm excited to see the future of new indie games
English
0
0
12
3.4K
How To AI
How To AI@HowToAI_·
Microsoft has released a 4B parameter model that turns any image into a 3D asset in 3 seconds. It uses a new geometry format called O-Voxel that converts to a textured mesh in under 100ms on CUDA. Outputs GLB files with full PBR textures, ready for Blender, Unity, and Unreal. 100% Open Source.
English
49
390
3.6K
253.6K
Enrico Magni
Enrico Magni@EnricoMagni1·
@chesnyfcb How do the local models perform compared to Claude, though?
English
2
0
1
1.7K
Chesny
Chesny@chesnyfcb·
A guy was paying $200 a month for Claude Max. He burned through his subscription in 3 hours of work. He bought a base Mac Mini for $599. He installed 5 local models on it. One command. One flag. His office neighbors thought he was mining cryptocurrency. He simply taught the machine to classify messages, compress context, and keep the system alive while he slept. At 4 a.m., Claude hit its request limit. The local model took over. In the morning, he read the logs: everything had worked. He didn't even have to wake up. A team doing the same thing means 3 engineers and $15,000 a month in API costs. He paid $599 once. 35 billion parameters in 16 GB of memory. Everyone said it was impossible. One flag in one command proved them all wrong. And people like him... there are only a handful so far.
Chesny@chesnyfcb

x.com/i/article/2051…

Spanish
83
256
2.8K
1.9M
Jordan
Jordan@JordanDevAi·
Drinking hydrogen-rich water every morning. Anything to help clean up the oxidative stress on the brain.
Jordan tweet media
English
0
0
2
12
Jordan
Jordan@JordanDevAi·
@RealProductGirl Welcome back. Here's to a productive and successful week 💪
English
0
0
0
4
Samantha Simonhoff
Samantha Simonhoff@RealProductGirl·
And we back, fam ✨ New week, fresh energy, clean slate. Whatever you’re building, show up for it today. Have a beautiful start to your week 🤍
Samantha Simonhoff tweet media
English
30
1
62
1.7K
Jordan
Jordan@JordanDevAi·
@RealProductGirl Bummer! Plumbing issues suck! Hopefully it all gets resolved quickly!
English
0
0
0
7
haareblond 🇪🇺
haareblond 🇪🇺@HaareBlond·
@JordanDevAi @stevibe @TeksEdge The pic is real, but it's 21 requests batched together in vLLM. A single stream of Gemma 4 31B Dense on an RTX 5090 is roughly 50–90 tok/s, not 500+. And a 3090 on dense 30B is ~35 tok/s, not 300. The claim confuses total batched throughput with per-user speed
English
1
0
1
76
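The distinction drawn in the reply above (total batched throughput versus per-user speed) can be checked with simple arithmetic. This is a minimal sketch, assuming the batch's aggregate throughput is shared evenly across requests, which a real scheduler will only approximate. The function name is hypothetical; the 500 tok/s and 21-request figures come from the thread.

```python
def per_stream_tok_s(aggregate_tok_s: float, concurrent_requests: int) -> float:
    """Approximate per-request speed when a batch shares aggregate throughput evenly."""
    if concurrent_requests <= 0:
        raise ValueError("concurrent_requests must be positive")
    return aggregate_tok_s / concurrent_requests


# Figures from the thread: 500+ tok/s reported, 21 requests batched in vLLM.
print(round(per_stream_tok_s(500, 21), 1))  # 23.8
```

Under that even-split assumption, each of the 21 streams sees roughly 24 tok/s, so the headline number describes the whole batch rather than what a single user experiences.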
stevibe
stevibe@stevibe·
Qwen3.6 27B landed yesterday, so I ran it on 4 setups side-by-side to see how they stack up:
🔴 RTX 4090: 45.59 tok/s, TTFT 525ms
🟢 RTX 5090: 51.83 tok/s, TTFT 752ms
⚫️ M2 Ultra: 22.30 tok/s, TTFT 216ms
🟣 DGX Spark: 11.08 tok/s, TTFT 319ms
This is a standard test: no tuning, just the out-of-the-box experience. For the NVIDIA cards I used llama.cpp with Unsloth's UD-Q4_K_XL quant. For the M2 Ultra I used MLX with Unsloth's UD-MLX-4bit quant, since MLX is the native path on Apple Silicon. Please consider this as the baseline, you can definitely squeeze more out of every one of these with fine-tuned settings.
English
81
73
886
102.4K
Liquid AI
Liquid AI@liquidai·
We’re entering a multi-year partnership with @MercedesBenz to scale embedded, on-device intelligence for their third- and fourth-generation MBUX. Our goal: to make the driver/vehicle relationship even more natural and effortless. Read more about our partnership: liquid.ai/press/liquid-a…
Liquid AI tweet media
English
18
47
225
42.1K
Jordan
Jordan@JordanDevAi·
@stevibe @TeksEdge I have no problem getting 500-700 tok/s on dense 30B models on my 5090 and 300 tok/s on my 3090. It's not so much the quant as it is compiling llama.cpp for your specific hardware. This pic is Gemma 4 running their dense 31B at 500+ tok/s.
Jordan tweet media
English
2
0
0
182
stevibe
stevibe@stevibe·
@TeksEdge Not sure, I'm going to follow some guides and verify them.
English
3
0
13
3.6K
Jordan
Jordan@JordanDevAi·
@googlegemma And by both, I mean dual instances of Gemma4
English
0
0
2
248
Jordan
Jordan@JordanDevAi·
@googlegemma I've been running both 24/7 for the last 6 days:
Jordan tweet media
English
1
0
31
5.1K
Google Gemma
Google Gemma@googlegemma·
What does it take to run 3, 5, or even 10 concurrent instances of Gemma 4 locally? We've open-sourced a demo letting you run multiple models side-by-side on your hardware. Gemma 4 26B A4B easily runs 10+ concurrent requests on a MacBook Pro M4 Max at 18 tokens/sec per request.
English
99
428
5.1K
911.1K
Samantha Simonhoff
Samantha Simonhoff@RealProductGirl·
Does anyone remember their last night before a move? Was sleep non-existent? I’m getting the feeling I won’t get any 🥺
Samantha Simonhoff tweet media
English
19
0
55
947
Jordan
Jordan@JordanDevAi·
@BuescherScott Yo. That's wild. I recently formed a real estate tech company with a partner who owns a realty company in South Florida. We're publicly launching soon. I'll follow you back and check out your project.
English
1
0
1
25
Samantha Simonhoff
Samantha Simonhoff@RealProductGirl·
Moving sucks... 1 out of 5 stars... would not recommend. Hope everyone has an awesome start to their Monday!
English
31
2
89
1.6K
Jordan
Jordan@JordanDevAi·
@RealProductGirl Like when I saw your post at 3am eastern time and I responded the other day lol
English
1
0
2
18
Samantha Simonhoff
Samantha Simonhoff@RealProductGirl·
If I build it, they will come. So I keep grinding for every builder who's up late shipping, debugging, and refusing to quit. Your work matters. Keep Building. 🔨 I have you.
English
18
6
56
755
Jordan
Jordan@JordanDevAi·
@petergyang If you're going for system RAM, get as much as you can afford, then use MoE models and split the layers between CPU and GPU. That way you can run 80B+ MoE models.
English
1
0
0
227
Peter Yang
Peter Yang@petergyang·
What is the sweet spot in open source model size? Are 35B models enough for local agentic workflows? Trying to decide how much RAM I need in a new computer.
Qwen@Alibaba_Qwen

⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀
A sparse MoE model, 35B total params, 3B active. Apache 2.0 license.
🔥 Agentic coding on par with models 10x its active size
📷 Strong multimodal perception and reasoning ability
🧠 Multimodal thinking + non-thinking modes
Efficient. Powerful. Versatile. Try it now 👇
Blog: qwen.ai/blog?id=qwen3.…
Qwen Studio: chat.qwen.ai
HuggingFace: huggingface.co/Qwen/Qwen3.6-3…
ModelScope: modelscope.cn/models/Qwen/Qw…
API ('Qwen3.6-Flash' on Model Studio): Coming soon~ Stay tuned

English
78
3
85
34.6K
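For the RAM-sizing question in this thread, a rough weights-only estimate is parameter count times bits per weight, divided by 8. This is a minimal sketch, not a sizing tool: the function name is hypothetical, it ignores KV cache, activations, and runtime overhead, and real quantization formats often use somewhat more than their nominal bits per weight.

```python
def weight_memory_gb(total_params_billions: float, bits_per_weight: float) -> float:
    """Weights-only footprint in GB (10^9 bytes); excludes KV cache and runtime overhead."""
    bytes_total = total_params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# A 35B-total-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"35B at {bits}-bit: {weight_memory_gb(35, bits):.1f} GB")
```

Note that for a sparse MoE model such as the 35B-total, 3B-active one announced above, all 35B parameters still have to be stored; the active count mainly affects per-token compute, not the weight footprint.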