Jun Yamog @jkyamog
507 posts
Joined May 2012
189 Following · 66 Followers
Yuchen Jin @Yuchenj_UW
It's weird that the US still doesn’t have a truly competitive open-source model lab. It’s clearly not a money problem. Several neolabs have raised billions. It’s not a compute problem. US labs have easier access to B200s/B300s than Chinese labs. So what is the issue?
Jun Yamog @jkyamog
Anyone here still running an old V100 (Volta)? If so, could you test my patch? If not, you can still enjoy the comments of me being out of my depth and using Codex to get the V100 working for Qwen 3.5 397B and GLM 4.5 🤣 github.com/ikawrakow/ik_l…
Jun Yamog @jkyamog
@davideciffa @MKay88905412917 @csujun Oh, I didn't realize this. I will try tomorrow on my 5090. I also saw a bug I am tracking: with lots of interactions with Hermes, and the context filled to maybe close to 65k, I sometimes still get the empty-reply issue. So my bug fix was incomplete.
mrciffa @davideciffa
@MKay88905412917 @csujun @jkyamog It already works with bigger quants. We optimize for the RTX 3090, so we had to focus on Q4, but if you change the model and try a bigger card like the 5090, it should work without problems.
mrciffa @davideciffa
Big day for Lucebox! Codex, Hermes and OpenClaw now run locally on our speculative inference engine with Qwen3.6-27B. Full OpenAI tool-call compatibility. Thanks @csujun and @jkyamog for the great contribution. 🏎️
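(A minimal sketch of what "full OpenAI tool-call compatibility" lets you do: point the stock OpenAI client at a local server. The base URL, port, and model id below are placeholders, not details from the announcement.)

```python
# Hypothetical usage sketch: a local OpenAI-compatible server accepting
# standard tool-call requests. Endpoint URL and model id are assumed
# placeholders, not taken from the Lucebox announcement.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="qwen3.6-27b",  # placeholder model id
    messages=[{"role": "user", "content": "What's the weather in Auckland?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
# An OpenAI-compatible server returns tool calls in the standard field:
print(resp.choices[0].message.tool_calls)
```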
LenP @LenProkopets
@jkyamog @davideciffa @csujun Cool. I am looking forward to whatever you can do on V100s! I have 3 of them and want to use them to their full potential.
Riku Pasonen🌞 @Raitziger
@davideciffa @csujun @jkyamog Luce is great. I just had to go back to Windows because I am SSD-poor on my gaming PC. Had a stab at building Luce for Windows but ran into some ggml linking issue. Do you know anyone running Lucebox on Windows other than via WSL?
Jun Yamog @jkyamog
@kawaiiconNZ thanks for the great con... it's been a while since I attended. I have given Zante a link to my photo album; maybe some of the photos/videos might be useful.
Jun Yamog @jkyamog

Someone @office asked me "have you learned anything yet?" Me: "No, I don't come to @kiwicon to learn, I come for the lights, fire and sheep"

Jun Yamog @jkyamog
@pccourt I am a bit slow to reply and to drive... we finally got there about 3 years ago.
Philip Court @pccourt
Cape Reinga on the first day of 2015. Driving the length of NZ, EV power, no fossil fuel! #LeadingTheCharge
Jun Yamog @jkyamog
Which version is newer?
Jun Yamog @jkyamog
@larsmoravy Please try to fix the auto wipers. If not, please offer a gradient control from low to max instead of the clunky I, II, III, IIII
Lars @larsmoravy
Let's make Teslas better... what do you all want for 2026?
Jun Yamog @jkyamog
@gnukeith My last non-MacBook Pro laptop is a ThinkPad T520, still getting work done next to an M4 Max. Upgrading it soon with 16GB RAM and three internal SSDs.
Keith @gnukeith
The laptop market is terrible, the only decent laptops come from Apple.
Jun Yamog @jkyamog
@alexocheema @exolabs Thanks for explaining the memory refresh rate, I wasn't aware of this. Also good to understand that architectures like MoE help with it. I guess the era of big VRAM is upon us; it kinda feels like 128GB is just a start.
Alex Cheema @alexocheema
Apple's timing could not be better with this. The M3 Ultra 512GB Mac Studio fits perfectly with massive sparse MoEs like DeepSeek V3/R1. 2 M3 Ultra 512GB Mac Studios with @exolabs is all you need to run the full, unquantized DeepSeek R1 at home.

The first requirement for running these massive AI models is that they need to fit into GPU memory (in Apple's case, unified memory). Here's a quick comparison of how much that costs for different options (note: DIGITS is left out here since details are still unconfirmed):
- NVIDIA H100: 80GB @ 3TB/s, $25,000, $312.50 per GB
- AMD MI300X: 192GB @ 5.3TB/s, $20,000, $104.17 per GB
- Apple M2 Ultra: 192GB @ 800GB/s, $5,000, $26.04 per GB
- Apple M3 Ultra: 512GB @ 800GB/s, $9,500, $18.55 per GB

That's a 28% reduction in $ per GB from the M2 Ultra - pretty good.

The concerning thing here is the memory refresh rate. This is the ratio of memory bandwidth to memory of the device. It tells you how many times per second you could cycle through the entire memory on the device. This is the dominating factor for the performance of single-request (batch_size=1) inference. For a dense model that saturates all of the memory of the machine, the maximum theoretical token rate is bound by this number.

Comparison of memory refresh rate:
- NVIDIA H100 (80GB): 37.5/s
- AMD MI300X (192GB): 27.6/s
- Apple M2 Ultra (192GB): 4.16/s (9x less than H100)
- Apple M3 Ultra (512GB): 1.56/s (24x less than H100)

Apple is trading off more memory for less memory refresh frequency, now 24x less than an H100.

Another way to look at this is to analyze how much it costs per unit of memory bandwidth. Comparison of cost per GB/s of memory bandwidth (cheaper is better):
- NVIDIA H100 (80GB): $8.33 per GB/s
- AMD MI300X (192GB): $3.77 per GB/s
- Apple M2 Ultra (192GB): $6.25 per GB/s
- Apple M3 Ultra (512GB): $11.875 per GB/s

There are two ways Apple wins with this approach. Both are hierarchical model structures that exploit the sparsity of model parameter activation: MoE and Modular Routing.

MoE adds multiple experts to each layer and picks the top-k of N experts in each layer, so only k/N experts are active per layer. The more sparse the activation (the smaller the ratio k/N), the better for Apple. DeepSeek R1's ratio is small: 8/256 = 1/32. Model developers could likely push this to be even smaller; potentially we might see a future where k/N is something like 8/1024 = 1/128 (<1% activated parameters).

Modular Routing includes methods like DiPaCo and dynamic ensembles, where a gating function activates multiple independent models and aggregates the results into one single result. For this, multiple models need to be in memory but only a few are active at any given time.

Both MoE and Modular Routing require a lot of memory but not much memory bandwidth, because only a small % of total parameters are active at any given time, and that is the only data that actually needs to move around in memory.

Funny story... 2 weeks ago I had a call with one of Apple's biggest competitors. They asked if I had a suggestion for a piece of AI hardware they could build. I told them: go build a 512GB memory Mac Studio-like box for AI. Congrats Apple for doing this. Something I thought would still take you a few years to do, you did today. I'm impressed.

Looking forward, there will likely be an M4 Ultra Mac Studio next year which should address my main concern, since these Ultra chips use Apple UltraFusion to fuse Max dies. The M4 Max had a 36.5% increase in memory bandwidth compared to the M3 Max, so we should see something similar (or possibly more depending on the configuration) in the M4 Ultra.
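(A minimal sketch of the arithmetic in the thread above, using exactly the device specs quoted in the tweet; prices and bandwidths are as quoted, not independently verified.)

```python
# Back-of-envelope check of the numbers above: memory refresh rate
# (bandwidth / capacity) and cost per GB / per GB/s for each device.
# Specs are the ones quoted in the tweet, not independently verified.

devices = {
    # name: (memory_gb, bandwidth_gb_per_s, price_usd)
    "NVIDIA H100":    (80,  3000, 25_000),
    "AMD MI300X":     (192, 5300, 20_000),
    "Apple M2 Ultra": (192, 800,  5_000),
    "Apple M3 Ultra": (512, 800,  9_500),
}

for name, (mem, bw, price) in devices.items():
    refresh = bw / mem      # full passes over memory per second
    per_gb = price / mem    # $ per GB of memory
    per_gbps = price / bw   # $ per GB/s of bandwidth
    print(f"{name:15s} refresh={refresh:5.2f}/s  ${per_gb:7.2f}/GB  ${per_gbps:6.2f}/(GB/s)")

# MoE sparsity: with top-k of N experts, only ~k/N of the expert params
# move per token, so a bandwidth-bound machine's effective token rate
# improves by roughly N/k (DeepSeek R1: 8 of 256 => 1/32 active).
k, n = 8, 256
print(f"active expert fraction: {k}/{n} = {k/n:.4f}")
```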
Jun Yamog @jkyamog
@alexocheema @WSJ @naval Congrats! Been running exo on an M4 Max and M1 Pro. Your mission to make AI distributed is important, good on you for doing it. I hope, though, that in the future distributed fine-tuning/training will be possible w/o doing PyTorch gymnastics.
Alex Cheema @alexocheema
8 months ago, exo was a hackathon project. Today it's on the front page of The Wall Street Journal @WSJ. We're a real company now (I guess..?), we raised some money from a few investors like @naval, hit #1 trending on GitHub, published at ICML, shipped an enterprise product, and we're hiring. Our mission at @exolabs is simple. We don't want AI to be controlled by a few companies. We're making it more distributed.
Jun Yamog @jkyamog
@KiwiAly @MattKingNorth Ok, thanks for the info. From what we understand, we have to show proof of physical damage for an ACC claim. Southern Cross said they have multiple cases like hers, where only symptoms occur but there is no physical evidence. Southern Cross has helped us with surgery, tests, etc.
Aly Cook @KiwiAly
@jkyamog @MattKingNorth Make sure you ask ACC for a review... this then goes to ICRA, the independent authority.
Matt King Northland @MattKingNorth
NZ ACC HAS NOW PAID OUT $11,429,594 FOR COVID VACCINE INJURY... New OIA Gov: 035284. Compare this with the less than $150,000 paid out for ALL VACCINE INJURY (excluding Covid) each year, in 2018 and 2019. Pfizer has complete INDEMNITY. They told us it was safe and effective.
Jun Yamog @jkyamog
@KiwiAly @MattKingNorth My wife is not counted in those numbers. ACC rejected it despite Southern Cross helping our case. ACC only accepts claims with physical injury, and the MRI can't see any physical issues despite all the symptoms. 3 years on, she is getting other symptoms.
Aly Cook @KiwiAly
This was my OIA... What is interesting is I have another Medsafe OIA, H2023033888, which shows the updated figure is now at 20,559 SERIOUS LIFE-THREATENING INJURIES OTHER THAN DEATH. Yet only 1,600+ claims through ACC have been approved, for that total of $11,429,594. Their figures do not stack up! (Only 1,600 people out of 20,559 serious life-threatening injury victims have made an ACC claim or have had their ED treatment go via ACC? That doesn't seem right, and it's not.)
Joe Turco @JoeTurco
@elonmusk Solar and wind are "cute" but cannot replace steady-state power generation. If you could remotely tell a Tesla Model Y to charge during times of day with excessive local amounts of solar or wind, then I might change my mind.
Jun Yamog @jkyamog
@alexocheema @exolabs Wow, great. Interesting to see if you can benchmark 1 M4 Max 128GB vs 2 M4 Pros 64GB. That way we get 128GB and 40 GPU cores; how much would the Thunderbolt 5 overhead be?
Alex Cheema @alexocheema
M4 Mac Mini AI Cluster Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run LLMs distributed across 4 M4 Pro Mac Minis. The cluster is small (iPhone for reference). It’s running Nemotron 70B at 8 tok/sec and scales to Llama 405B (benchmarks soon).
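(A hedged back-of-envelope for why ~8 tok/s is plausible here. The quantization level and M4 Pro memory bandwidth below are my assumptions, not figures from the tweet.)

```python
# Back-of-envelope for the reported ~8 tok/s. Every figure here is an
# assumption (quantization level, M4 Pro memory bandwidth), not a
# number from the tweet.

params = 70e9         # Nemotron 70B parameters
bits_per_param = 4    # assumed ~4-bit quantization
bw = 273e9            # assumed M4 Pro unified memory bandwidth, bytes/s

model_bytes = params * bits_per_param / 8   # ~35 GB of weights

# With pipeline parallelism the 4 shards are read one after another per
# token, so the effective bandwidth is one machine's; Thunderbolt only
# carries small activation tensors between stages and is ignored here.
tok_per_s = bw / model_bytes
print(f"~{tok_per_s:.1f} tok/s bandwidth bound")  # ≈ 7.8, near the reported 8
```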
Jun Yamog @jkyamog
@JREOfficial Forgot the exact values but they are close to the recommended values. LoRA scale 0.8, steps 38, guidance slightly higher. Btw that photo is my wife's real photo. This is the generated photo.
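(A hypothetical sketch of those settings against a Replicate-style flux LoRA endpoint. The model slug and exact parameter names are assumptions modeled on common hosted flux fine-tunes; only the 0.8 and 38 values come from the tweet.)

```python
# Hypothetical call to a hosted flux LoRA; the model slug and parameter
# names are assumed. Only lora_scale=0.8 and steps=38 are from the tweet.
import replicate

output = replicate.run(
    "your-username/your-flux-lora",  # placeholder: a personal flux fine-tune
    input={
        "prompt": "portrait photo of TOK person",  # TOK = assumed trigger token
        "lora_scale": 0.8,           # "LoRA scale 0.8"
        "num_inference_steps": 38,   # "steps 38"
        "guidance_scale": 4.0,       # "guidance slightly higher" (value assumed)
    },
)
print(output)
```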
John E @JREOfficial
@jkyamog How / what are the inputs you are setting? The LoRA scale, inference steps, guidance scale, prompt strength. Your images look great.
Jun Yamog @jkyamog
@levelsio I didn't think about it until now. Tried Runway a few months ago; it was ok but changed the image source too much. Hailuo last week was much better, the video kept my wife's look. We were playing her as a 50ft woman 😅
@levelsio @levelsio
There are so many video models now and the top ones are all Chinese models, not Western:
- Kling AI
- Hailuo
- Qingying / Zhipu AI
Luma and Runway are way behind the Chinese video AI models IMHO. Interesting to see China miss the boat on LLMs (ChatGPT) and AI imaging (SD and Flux) but now catch back up and lead in AI video!
Fekri @fekdaoui

@levelsio + AI providers like runwayml are slowly starting to roll out their APIs, turbo acceleration incoming

Jun Yamog @jkyamog
@levelsio I tried this a few months ago after I trained Flux with my wife's images. I then tagged 2 people: me and my wife. But then all the images it generated looked like our son and daughter 😅 My wife said "you can now confirm you are the dad!" Idea: an AI paternity test!
@levelsio @levelsio
Who can help me think of some genius pipeline solution to get group photos working on Photo AI, combining multiple trained people? Right now combining 2 LoRAs of people will result in a merger of the 2 people.