Jun Yamog

507 posts

Jun Yamog

@jkyamog

Katılım Mayıs 2012

189 Takip Edilen66 Takipçiler

Jun Yamog@jkyamog·3d

@Yuchenj_UW Skill.md issue

English

Yuchen Jin@Yuchenj_UW·6d

It's weird that the US still doesn’t have a truly competitive open-source model lab. It’s clearly not a money problem. Several neolabs have raised billions. It’s not a compute problem. US labs have easier access to B200s/B300s than Chinese labs. So what is the issue?

English

211

792

133.2K

Jun Yamog@jkyamog·3d

Anyone here still running an old V100 (Volta)? If so, could you test my patch? If not, you can still enjoy the comments of me being out of my depth and using Codex to get V100 working for Qwen 3.5 397B and GLM 4.5 🤣 github.com/ikawrakow/ik_l…

English

Jun Yamog@jkyamog·5d

@davideciffa @MKay88905412917 @csujun Oh I didn’t realize this. I will try tomorrow on my 5090. I also saw a bug I am tracking, under lots of interactions with Hermes and maybe close to 65k filled context sometimes I still get empty reply issue. So my bug fix was incomplete.

English

mrciffa@davideciffa·5d

@MKay88905412917 @csujun @jkyamog it already works with bigger quants, we optimize for rtx 3090 so we had to focus on q4, but if you change model and try on a bigger card like 5090 it should work without problems

English

102

mrciffa@davideciffa·6d

Big day for Lucebox! Codex, Hermes and OpenClaw now run locally on our speculative inference engine with Qwen3.6-27B. Full OpenAI tool-call compatibility. Thanks @csujun and @jkyamog for the great contribution. 🏎️

GIF

English

100

9.6K

Jun Yamog@jkyamog·5d

@LenProkopets @davideciffa @csujun Currently getting Qwen 3.5 397B running. Please test my patch #issuecomment-4417797159" target="_blank" rel="nofollow noopener">github.com/ikawrakow/ik_l…

English

LenP@LenProkopets·5d

@jkyamog @davideciffa @csujun Cool. I am looking forward to whatever you can do on v100s! I have 3 of them and want to use them to their full potential.

English

Jun Yamog@jkyamog·6d

@Raitziger @davideciffa @csujun I can try tomorrow but you might be able to get running in docker. I use docker in Linux via LXC. See my config here github.com/Luce-Org/luceb…

English

Riku Pasonen🌞@Raitziger·6d

@davideciffa @csujun @jkyamog Luce is great. I just had to go back to windows because I am ssd poor on gaming pc. Had a stab trying to build luce for windows but run into some ggml linking issue. Do you know anyone running Lucebox on winfow other than via wsl?

English

425

Jun Yamog@jkyamog·9 Kas

@kawaiiconNZ thanks for the great con... its been a while since I attended. I have given Zante a link to my photo album, maybe some of the photos/vidoes might be useful.

Jun Yamog@jkyamog

Someone @office asked me "have your learned anything yet?" Me: "No I don't come to @kiwicon to learn, I come for the lights, fire and sheep"

English

Jun Yamog@jkyamog·9 Kas

@pccourt I am a bit slow to reply and to drive... we finally got there about 3 years ago.

English

Philip Court@pccourt·3 Oca

Cape Reinga on the first day of 2015. Driving the length of NZ, EV power, no fossil fuel! #LeadingTheCharge

English

Jun Yamog@jkyamog·17 Nis

Which version is newer?

English

Jun Yamog@jkyamog·5 Nis

@larsmoravy Please try to fix the auto wipers. If not please offer a gradient control from low to max instead of the clunky I, II, III, IIII

English

Lars@larsmoravy·5 Nis

Let's make Teslas better... what do you all want for 2026?

English

13.1K

607

13.9K

3.9M

Jun Yamog@jkyamog·10 Mar

@gnukeith My last non-MacBook Pro laptop is a ThinkPad T520—still getting work done next to an M4 Max. Upgrading it soon with 16GB RAM and three internal SSDs.

English

Keith@gnukeith·9 Mar

The laptop market is terrible, the only decent laptops come from Apple.

English

3.2K

4.3K

131.7K

13.7M

Jun Yamog@jkyamog·5 Mar

@alexocheema @exolabs Thanks for explaining the memory refresh rate, I wasn’t aware of this. Also good to understand that architectures like MoE helps it. I guess the era of big VRAM is upon us. It kinda feels 128gb is like just a start.

English

540

Alex Cheema@alexocheema·5 Mar

Apple's timing could not be better with this. The M3 Ultra 512GB Mac Studio fits perfectly with massive sparse MoEs like DeepSeek V3/R1. 2 M3 Ultra 512GB Mac Studios with @exolabs is all you need to run the full, unquantized DeepSeek R1 at home. The first requirement for running these massive AI models is that they need to fit into GPU memory (in Apple's case, unified memory). Here's a quick comparison of how much that costs for different options (note: DIGITS is left out here since details are still unconfirmed): NVIDIA H100: 80GB @ 3TB/s, $25,000, $312.50 per GB AMD MI300X: 192GB @ 5.3TB/s, $20,000, $104.17 per GB Apple M2 Ultra: 192GB @ 800GB/s, $5,000, $26.04 per GB Apple M3 Ultra: 512GB @ 800GB/s, $9,500, $18.55 per GB That's a 28% reduction in $ per GB from the M2 Ultra - pretty good. The concerning thing here is the memory refresh rate. This is the ratio of memory bandwidth to memory of the device. It tells you how many times per second you could cycle through the entire memory on the device. This is the dominating factor for the performance of single request (batch_size=1) inference. For a dense model that saturates all of the memory of the machine, the maximum theoretical token rate is bound by this number. Comparison of memory refresh rate: NVIDIA H100 (80GB): 37.5/s AMD MI300X (192GB): 27.6/s Apple M2 Ultra (192GB): 4.16/s (9x less than H100) Apple M3 Ultra (512GB): 1.56/s (24x less than H100) Apple is trading off more memory for less memory refresh frequency, now 24x less than a H100. Another way to look at this is to analyze how much it costs per unit of memory bandwidth. Comparison of cost per GB/s of memory bandwidth (cheaper is better): NVIDIA H100 (80GB): $8.33 per GB/s AMD MI300X (192GB): $3.77 per GB/s Apple M2 Ultra (192GB): $6.25 per GB/s Apple M3 Ultra (512GB): $11.875 per GB/s There are two ways Apple wins with this approach. Both are hierarchical model structures that exploit the sparsity of model parameter activation: MoE and Modular Routing. MoE adds multiple experts to each layer and picks the top-k of N experts in each layer, so only k/N experts are active per layer. The more sparse the activation (smaller the ratio k/N) the better for Apple. DeepSeek R1 ratio is small: 8/256 = 1/32. Model developers could likely push this to be even smaller, potentially we might see a future where k/N is something like 8/1024 = 1/128 (<1% activated parameters). Modular Routing includes methods like DiPaCo and dynamic ensembles where a gating function activates multiple independent models and aggregates the results into one single result. For this, multiple models need to be in memory but only a few are active at any given time. Both MoE and Modular Routing require a lot of memory but not much memory bandwidth because only a small % of total parameters are active at any given time, which is the only data that actually needs to move around in memory. Funny story... 2 weeks ago I had a call with one of Apple's biggest competitors. They asked if I had a suggestion for a piece of AI hardware they could build. I told them, go build a 512GB memory Mac Studio-like box for AI. Congrats Apple for doing this. Something I thought would still take you a few years to do you did today. I'm impressed. Looking forward, there will likely be an M4 Ultra Mac Studio next year which should address my main concern since these Ultra chips use Apple UltraFusion to fuse Max dies. The M4 Max had a 36.5% increase in memory bandwidth compared to the M3 Max, so we should see something similar (or possibly more depending on the configuration) in the M4 Ultra.

English

115

306

2.8K

546.3K

Jun Yamog@jkyamog·23 Şub

@alexocheema @WSJ @naval Congrats been running exo on m4 max and m1 pro. Your mission to have AI distrubuted is important, good on you doing it. I hope though in the future distributed fine tuning/training will be possible w/o doing pytorch gymnastics.

English

Alex Cheema@alexocheema·21 Şub

8 months ago, exo was a hackathon project. Today it's front page of The Wall Street Journal @WSJ. We're a real company now (I guess..?), we raised some money from a few investors like @naval, hit #1 trending on github, published at ICML, shipped an enterprise product, and we're hiring. Our mission at @exolabs is simple. We don't want AI to be controlled by a few companies. We're making it more distributed.

English

701

56.1K

Jun Yamog@jkyamog·13 Kas

@KiwiAly @MattKingNorth Ok thanks for the info. From what we understand we have to show physical damage proof for ACC claim. Southern cross said they have multiple cases like her, where only symptoms occurs but no physical evidence. Southern cross has helped us with surgery, tests, etc.

English

Aly Cook@KiwiAly·12 Kas

@jkyamog @MattKingNorth make sure you ask ACC for a review .. this then goes to the ICRA independent authority .

English

Matt King Northland@MattKingNorth·4 Kas

NZ ACC HAS NOW PAID OUT $11,429,594 FOR COVID VACCINE INJURY.... New OIA Gov: 035284 Compare this with the less than $150,000 paid out for ALL VACCINE INJURY (excluding Covid) each year, in 2018 and 2019. Pfizer has complete INDEMNITY They told us it was safe and effective.

English

120

405

67.9K

Jun Yamog@jkyamog·12 Kas

@KiwiAly @MattKingNorth My wife is not counted in those numbers. ACC rejected it despite Southern Cross helping our case. ACC only accepts claims with physical injury. MRI can’t see any physical issues despite all the symptoms. 3 years on she is getting other symptoms.

English

Aly Cook@KiwiAly·4 Kas

This was my OIA .. What is interesting is I have another medsafe OIA H2023033888 which shows the updated figure is now at 20,559 SERIOUS LIFE THREATENING INJURIES OTHER THAN DEATH . Yet only 1600+ claims through ACC have been approved for that total $11,429,594. Their figures do not stack up! (Only 1600 people of 20559 serious live threatening injury victims have made an ACC claims or have had their ED treatment go via ACC ? That doesnt seem right and its not ..

English

129

2.1K

Jun Yamog@jkyamog·12 Kas

@JoeTurco @elonmusk That already works long time ago. Have used it for a while already tesla.com/en_nz/support/…

English

Joe Turco@JoeTurco·12 Kas

@elonmusk Solar and Wind are "cute" but cannot replace stead state power generation. If you could remotely tell Tesla Model Y to charge during times of day with excessive local amounts of solar or wind, then I might change my mind.

English

201

Elon Musk@elonmusk·12 Kas

Solar power will be the vast majority of power generation in the future

ALEX@ajtourville

NEWS: Rooftop solar delivers milestone of 80.5% share of electricity generation in Western Australia 🇦🇺 Yesterday at 1:30 PM, distributed solar PV accounted for 2.12 gigawatts of output, with natural gas and coal both reduced to shares of 8.6% and 8.3% respectively.

English

20.4K

14.9K

145.8K

52M

Jun Yamog@jkyamog·9 Kas

@alexocheema @exolabs Wow great interesting to see if you can benchmark 1 M4 Max 128gb vs 2 M4 pro 64gb. That way we get 128gb and 40gpu, hoe much would the thunderbolt 5 overhead would be.

English

156

Alex Cheema@alexocheema·9 Kas

M4 Mac Mini AI Cluster Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run LLMs distributed across 4 M4 Pro Mac Minis. The cluster is small (iPhone for reference). It’s running Nemotron 70B at 8 tok/sec and scales to Llama 405B (benchmarks soon).

English

587

2.5K

25.2K

3.5M

Jun Yamog@jkyamog·16 Eki

@JREOfficial Forgot the exact values but they are close the recommended values. Lora scale .8 steps 38, guidance slightly higher. Btw that photo is my wife’s real photo. This is the generated photo.

English

John E@JREOfficial·16 Eki

@jkyamog how / what are the inputs you are setting? The lora scale, inference steps, guidance scale, prompt strength. Your images look great

English

Jun Yamog@jkyamog·11 Eki

@levelsio I didn’t think about until now. Tried runway a few months, was ok but changed the image source too much. Hailuo last week was much better, video kept my wife’s look. We where playing her as 50ft woman 😅

English

164

@levelsio@levelsio·11 Eki

There's so many video models now and the top are all Chinese models, not Western: - Kling AI - Hailuo - Qingying / Zhipu AI Luma and Runway are way behind the Chinese video AI models IMHO Interesting to see Chinese miss the boat on LLMs (ChatGPT) and AI imaging (SD and Flux) but now catch back up and leading in AI video!

Fekri@fekdaoui

@levelsio + AI providers like runwayml are slowly starting to roll out their APIs turbo acceleration incoming

English

901

268K

Jun Yamog@jkyamog·1 Eki

@levelsio I tried this a few months ago after I trained flux with wife’s images. I then tagged 2 person: me and my wife. But then all the images it generated looked like our son and daughter 😅 My wife said “you can now confirm you are the dad!” Idea: AI paternal test!

English

@levelsio@levelsio·1 Eki

Who can help me think of some genius pipeline solution to get group photos working on Photo AI combining multiple trained people? Right now combining 2 LoRas of people will result in a merger of 2 people

English

177

93.1K

Keşfet

@Yuchenj_UW @davideciffa @MKay88905412917 @csujun @LenProkopets @Raitziger @kawaiiconNZ @pccourt