damageboy

15.7K posts

@damageboy

Making the world a better place through page tables Works at #5700FF Pronouns: Has/Been

Antarctica · Joined May 2008

964 Following · 2.2K Followers

Pinned Tweet

damageboy@damageboy·30 Jan

My vectorized sorting extravaganza is out and about. A LOT of work went into this (referring to the javascript in the posts :) Read, retweet, send scathing reviews, open issues, heat your house with it: This goes to Eleven! Pt. 1-3, code, nuget out: bits.houmus.org/2020-01-28/thi…


damageboy@damageboy·12h

@PnL63962200 @DanielBachmat @mathandcobb For two years I've been trying to get the physical design people to explain why they won't let me draw a wire in a processor at a 45° angle, and you want this? Pizdetz


PnL@PnL63962200·17h

@DanielBachmat @mathandcobb Does this have practical implications for anything? Say, faster processors? 😂


Daniel Bachmat@DanielBachmat·18h

Many in the replies asked for a picture of what ChatGPT's solution to the problem looks like. Unfortunately it isn't practical to draw the full solution (too many points and lines), but here you can see a partial drawing that resembles the solution (credit: @mathandcobb). It came out very artistic. A reminder of what you're looking at: this is a set of points in yellow where a great many of the internal distances between points (represented by blue lines) are identical in length. The AI found a way to produce more blue lines with a given number of points than the best method known before (which was to arrange the points in rows and columns). A mathematical aside for the adventurous on how to arrive at this graph: the basic idea is to think of the plane as the complex plane. The yellow points are the projection onto this plane of the lattice of the ring of integers of some number field (an algebraic extension of the field of rational numbers). A wise choice of the field and its algebraic properties gives a solution to the problem.

Daniel Bachmat@DanielBachmat

After the teaser, I dove into the details and it really is a wonderful story. In this thread I'll explain the extraordinary achievement ChatGPT reached and why it is far more impressive than anything AI has done in mathematics to date. 🧵
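The counting idea in Daniel Bachmat's explanation above (points in the plane, lines joining pairs that share one distance) can be sketched in a few lines of Python. This is only the rows-and-columns baseline on a tiny grid, not the number-field construction; the function names `grid_points` and `most_repeated_distance` are made up for illustration, and squared integer distances are used to avoid floating-point comparisons.

```python
from collections import Counter
from itertools import combinations


def grid_points(n):
    """Return the n*n integer lattice points (x, y) with 0 <= x, y < n."""
    return [(x, y) for x in range(n) for y in range(n)]


def most_repeated_distance(points):
    """Return (squared_distance, pair_count) for the distance shared by the most pairs.

    Illustrative helper (not from the thread): every unordered pair of points is
    bucketed by its squared Euclidean distance, and the fullest bucket is returned.
    """
    counts = Counter(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in combinations(points, 2)
    )
    return counts.most_common(1)[0]


if __name__ == "__main__":
    squared, pairs = most_repeated_distance(grid_points(3))
    print(f"squared distance {squared} is shared by {pairs} pairs")
```

On a 3×3 grid the most repeated distance is the side length 1, shared by 12 of the 36 pairs; the result described in the thread is about beating what grid-style arrangements achieve as the number of points grows.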


damageboy@damageboy·17h

@realmemes6 Also, what is limiting bw right now is the pins, less so the HBM itself. A true 3D packaged dram on top of compute, if someone knows how to solve the cooling challenge has another easy 8x in the current HBM tech. It's all pin limited where we stand.


siliconmemes@realmemes6·18h

@damageboy Led astray by AI, I should have just left HBM out of this, my main point was laser links between small satellites in space replacing datacenters on Earth. But even then they can be used for inference and not training.


siliconmemes@realmemes6·1d

The future with space datacenters and LPDDR replacing HBM is very fun, but don't forget the physics. The speed of light in a vacuum is 33% to 50% faster than copper, and not faster at all in fiber optic cable. Every foot (30cm) at speed of light in a vacuum adds 1ns of latency.
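The one-nanosecond-per-foot figure is easy to check. A minimal sketch, assuming a hypothetical helper `propagation_delay_ns` and a `velocity_factor` parameter (the fraction of c at which a signal travels in a given medium; no specific copper or fiber value is taken from the thread):

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in a vacuum, exact by SI definition
FOOT_M = 0.3048                 # one international foot in metres


def propagation_delay_ns(distance_m, velocity_factor=1.0):
    """One-way propagation delay in nanoseconds over distance_m metres.

    velocity_factor is an illustrative parameter: 1.0 means a vacuum, and
    smaller values model media in which the signal travels slower than c.
    """
    if not 0.0 < velocity_factor <= 1.0:
        raise ValueError("velocity_factor must be in (0, 1]")
    return distance_m / (C_VACUUM_M_PER_S * velocity_factor) * 1e9


if __name__ == "__main__":
    print(f"{propagation_delay_ns(FOOT_M):.3f} ns per foot in a vacuum")
```

One foot in a vacuum comes out to about 1.017 ns, which matches the rule of thumb in the tweet; a round trip doubles it.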


damageboy@damageboy·17h

@realmemes6 Yeah, regardless, I don't understand the hate with HBM, 3D packaging, in general is a very "easy" win. Eventually cooling of such contraptions will be solved. Will love to read a well argued counterpoint, let's see what @insane_analyst writes


damageboy@damageboy·22h

@realmemes6 HBM access latency is much closer to high 80s, low 90s, I don't know what subset of the access path you had to go for to get a 50ns number. And this is without significant reordering at the controller which usually adds another 50ish cycles for a reorder queue of 96


siliconmemes@realmemes6·1d

For context HBM4 access latency is less than 50ns, so if memory stays on the board the round trip latency added by distance is a few nanoseconds but wouldn't be a deal killer by itself.


damageboy@damageboy·1d

@NBhgdrh The debate is against Sony, mainly if you appreciate music.


דג הכסף@NBhgdrh·1d

Easy. Obvious. Only Apple. I don't understand the debate at all.

Gersh@Gersh_Brosh

I bought AirPods after 5 years with Bose. The only thought in my head: how did I not switch sooner 🤦🏻‍♂️ Seriously, there's no comparison at all.


damageboy@damageboy·1d

@talitshe It's really not bad. I have a clean project from May of this year on GPT 5.5 and everything looks great, assuming you invest 5 minutes in spelling out project practices, as in any C++ project.


טלית שכולה@talitshe·1d

For now, CPP provides "job security" for programmers: code produced by LLMs is... bad. Those who can, cope (or, if the employer allows it, avoid AI). Those who can't, retire. Those who are smart switch to a worthier language. (Yes, there is a real problem of legacy code that can't simply be replaced...)

Haider.@haider1

Creator of C++, Bjarne Stroustrup: AI-generated code isn't ready — it generates more bugs, more bloat, more security holes, and is nearly impossible to validate "senior developers are already retiring rather than deal with it" The problem is that even a small prompt change can shift the entire codebase in unpredictable ways


damageboy@damageboy·1d

@MikaZomer @MeravYuval @amsterdamski2 4.06 in October 2023 is less than a decade ago and above 4


Mika Zomer@MikaZomer·1d

@damageboy @MeravYuval @amsterdamski2 ?


Shaul Amsterdamski 🤞@amsterdamski2·2d

In my beloved neighborhood, the for-sale signs on houses have multiplied. Really, lots of them. I wonder where all these people went, but that's for another tweet. One of them is for a unique house from the neighborhood's earliest days that I wish I could live in. A few months ago it was offered for sale at 12 million shekels. Small change. Now the realtor has dropped it to 9.9 million. We'll keep waiting :)


damageboy@damageboy·1d

@insane_analyst How soon can that realistically overtake hbm in price? And in pJ/bit?


Irrational Analysis@insane_analyst·2d

Or just use LPDDR attached via optics. 😉

Jukan@jukan05

Breaking the "Memory Wall": Optical Interconnects Emerge in GPU–HBM Packaging As a solution to the "memory wall," one of the chronic challenges in AI semiconductors, the memory and packaging industries at home and abroad are weighing an approach that decouples the GPU and high-bandwidth memory (HBM) and packages them separately. The core idea is to move the HBM—until now mounted right next to the GPU—a certain distance away, and bridge the gap with light (optics), allowing several times more HBM to be installed than is possible today. On the 22nd, a researcher at a major domestic memory maker said, "We're currently struggling to expand HBM bandwidth and capacity, so we're discussing with customers a plan to overcome the GPU's shoreline limit through optical interconnects and mount more HBM." Shoreline refers to the length of the chip's perimeter. In today's AI computing environment, the key factor dragging down compute efficiency is the data transfer speed of memory chips. While GPU performance has grown by leaps and bounds with each generation, the speed at which memory stores and supplies data has failed to keep pace—creating a structural performance barrier, the memory wall. The arrival of HBM, with its wide data pathways, put out the immediate fire, but critics continue to point out that bandwidth and transfer speeds still fall short of handling the explosive growth in AI compute. Until now, the industry has focused on stacking HBM ever higher to increase memory capacity and bandwidth within a confined footprint. But as stack counts climbed past 12 and 16 layers toward 20 and beyond, process difficulty rose exponentially. The technology hit physical limits, including the growing difficulty of meeting fixed height specifications. Vertical stacking has reached an inflection point—so much so that the JEDEC standards body has relaxed its HBM height specifications. 
The bigger problem is that if stack counts can't be raised, the alternative is to add more HBM horizontally around the GPU—but that, too, is impossible. In the current 2.5D packaging structure, the GPU and HBM are mounted tightly together on a single substrate. Within this structure, the number of HBM units that can be placed is strictly limited by the finite length of the GPU chip's perimeter—its shoreline. Even when more HBM is desired, there is physically no room to place it, leaving the industry in a structural deadlock. The alternative now emerging across the semiconductor industry is to separate the GPU and HBM and package them independently. It overturns the conventional chip-design principle that components must sit close together to minimize data transfer time. Instead of keeping the two chips adjacent, the approach spaces them apart and links them with overwhelmingly fast optical signals to overcome the added physical distance. Placing the HBM slightly away from the GPU within the board frees the design from the GPU's shoreline constraint. With the spatial limitation gone, far more HBM can be spread out laterally and packed into the board—several times more than today—without having to push stack heights to extremes. This means the total memory capacity and data bandwidth of the AI accelerator system would expand dramatically, on a scale incomparable to current systems.
"Discussing Placing HBM Beneath the GPU"… Form Factor Could Change
The industry is now producing a range of architectural design proposals over where exactly to place the HBM within the GPU board. The same memory researcher said, "Options under discussion range from broadly utilizing the space immediately around the GPU to isolating the HBM beneath the GPU board." He added, "In the latter case—isolating it beneath the GPU board—the motherboard would have to be extended lengthwise, so we're discussing even an overall form-factor change with the GPU maker."
Specifically, the HBM might surround the GPU from several centimeters away, or a separate HBM zone might be created in the center of the board. "We're keeping every possibility open as we discuss the optimal layout," he said. "Nothing has been confirmed as an official roadmap yet, but as part of preliminary research toward next-generation AI accelerators, we're in talks with our partners." The outsourced semiconductor assembly and test (OSAT) industry is also watching this trend closely. An executive at a global OSAT firm said, "Optical interconnects are a clear trajectory. The only question is timing," predicting that "rack-to-rack and server-to-server links will go optical first, and then chip-to-chip connections within the board will follow." He added, "The larger units will be connected by light first, but optical research is moving so fast that it may not be that far off." Technically, the optical-interconnect technology linking GPU and HBM shares the same underlying principle as the technology connecting server to server inside a data center. The difference is the high technical barrier of shrinking optical-conversion technology—once used for communication between large pieces of equipment—down to the microscopic scale of a single board and chipset. An executive at a domestic developer of co-packaged optics (CPO) components explained, "As HBM stack heights approach their limit, the industry is discussing spreading the memory out laterally to maximize how much can physically be mounted." He added, "The principle is the same as conventional data-center optical interconnects, but HBM optical links that have to operate within a confined board space require optical components to be miniaturized to far smaller sizes and far higher integration density—so the technical difficulty is greater."


damageboy@damageboy·1d

@MikaZomer @MeravYuval @amsterdamski2


Mika Zomer@MikaZomer·1d

@MeravYuval @amsterdamski2 The dollar hasn't been at 4 for over a decade.


damageboy@damageboy·1d

@halvarflake @danluu The flat fee is obviously based on OPM to try and get a flywheel / heroin addiction going. Seems to have worked...


Halvar Flake@halvarflake·1d

@danluu I think the "losing money on inference" isn't on the API side, but on the "flat fee side". There's an enormous gap on API pricing and flat-fee pricing, and I think margins are slim-positive on API and negative on flat fee.


Dan Luu@danluu·1d

Why are so many people so sure that the big AI providers are losing money on inference? It reminds me of the comments about how Uber can never make money. Their unit economics were fine and they were only losing money because they chose to do so on customer acquisition.


damageboy@damageboy·1d

@halvarflake @danluu I think Anthropic projected about 40% gross profit margin in 2025 from “selling AI to businesses and application developers,” down from an earlier ~50% estimate, because inference costs were reportedly 23% higher than expected... I assume this is API ONLY theinformation.com/articles/anthr…


damageboy retweeted

y@yasha1971·2d

x.com/i/article/2057…


damageboy retweeted

Mark Gadala-Maria@markgadala·2d

People are using AI to remix movies with their favorite actors. This is Commando but instead of Arnold Schwarzenegger it's Danny DeVito. Credit: aisnypz


damageboy@damageboy·2d

After some side by side time, I think I'm dumping Claude and embracing gpt 5.5. Whatever autistic powder they sprinkled on those gb200's, I want more of it.


damageboy@damageboy·2d

@SlexAxton @jtaby Thanks, thought there was more! Amazing!


Alex Sexton@SlexAxton·2d

@damageboy @jtaby It’s just requesting a patch file directly from GitHub for the demo, everything else is client side. diffs.com/docs#codeview


Alex Sexton@SlexAxton·3d

i try to only work on software where i can answer “yes” to the question “will it sandstorm?”

Pierre@pierrecomputer

diffshub[dot]com Take any public diff from GitHub and virtualize it nearly instantly, no matter how large, with DiffsHub. Built to show off our brand new CodeView component. To try it out, replace `github` with `diffshub` in your address bar.


damageboy@damageboy·2d

@halvarflake Last time I did it with "babysitter" plugin github.com/a5c-ai/babysit…


Halvar Flake@halvarflake·2d

Ok folks, how do you get an agent to run against a problem for a few days? Mine always find an excuse to stop.


damageboy@damageboy·2d

@SlexAxton @jtaby Is there a guide on setting this up for private repos, or just point an LLM and pray?


Alex Sexton@SlexAxton·3d

Ah yea, I guess what I mean is that the thing we're showing off here is our open source components. This isn't really a product we expect people to use directly (the comments aren't stateful, etc). The hope is that all of the actual players in this space use trees and diffs from us to render so that everyone can be fast in the tools they know and love. (Diffs is already in a ton of the coding apps, codex/conductor/opencode/etc) I think more folks will adopt the pattern! It's certainly possible with the primitives today!


damageboy@damageboy·2d

@halvarflake It's very spiky, in my experience. And the global PR campaign isn't too dissimilar from the one behind playing the lotto, where you see a winner every week but not the million losers... Much less extreme, but same experience in principle


Halvar Flake@halvarflake·2d

The cognitive disconnect between what AI can do in blog posts, and what it fails to do in reality, is wide for me. It can do marvels, but it can also mess up so much easy stuff. Is the capability getting even spikier?


damageboy@damageboy·2d

@mitsuhiko @lucasmeijer Per Google SDK norms, you mean.


Armin Ronacher ⇌@mitsuhiko·2d

@lucasmeijer Because the official tooling is horseshit


Lucas Meijer@lucasmeijer·2d

Kind of surprised protobuf is not used much outside of google-sphere. I love how it lets/makes you turn everything into a data problem instead of api problem.


damageboy@damageboy·2d

@CiItay @Eli_B_Cook I don't even wake up before the morning's first attempt by a delivery courier to run me over


Itay Ci@CiItay·2d

@Eli_B_Cook It's not the same experience if nobody is parking on your sidewalk next to the crosswalk


Eli Cook@Eli_B_Cook·2d

You can buy an entire hill in Tuscany for the price of a 3-room apartment in Givatayim

Tim@TimurNegru

Someone is selling an entire hill in Tuscany. 45 hectares, for €1.3M ($1.5M). That's 111 acres of southern Tuscan countryside with a 500m² stone farmhouse on top, 9 bedrooms, 7 bathrooms, a pool, and an outdoor wood-burning oven. The estate sits at 400m altitude on a privately owned hill near Saturnia, 8 km from the famous thermal baths and 50 km from the Tyrrhenian Sea. The farmhouse was built in the early 1800s by the Piccolomini Counts as the steward's residence for what was once a much larger estate. The current 45 hectares break down as 37 ha of woodland, 7 ha of arable land, and 1 ha of olive grove with 50 trees. It borders a nature park and the Albenga River. What makes the price interesting is the land. 45 hectares fully consolidated and bordering a protected park is rare at this level in Tuscany. Most farmhouses in this price range come with 1-3 hectares of land. Here you're buying the hill itself. The trade-off is access. You're 160 km from Rome airport and 200 km from Pisa, so this isn't a fly-in-for-the-weekend kind of place. Then again, if you're buying a hill in Tuscany, being hard to reach is probably the point. How much would something like this cost where you live?

