Bob Komin

359 posts


@BobKomin

Bay Area / Bend, OR / Cabo · Joined January 2009
1.1K Following · 448 Followers
Bob Komin retweeted
Deirdre Bosa (@dee_bosa)
"One of the biggest misconceptions" Cerebras CFO @BobKomin pushes back on the small-models narrative. "We serve all models, and there is no limit to the size of the models that we can serve. Today, we're serving trillion parameter models. We're serving trillion parameter models that are internal for OpenAI today. We are currently running OpenAI 5.4 and 5.5 with them."
6
13
123
83K
Bob Komin (@BobKomin)
Thank you for having me on the program on IPO day to talk about it and @cerebras.
Deirdre Bosa (@dee_bosa)

Cerebras CFO @bobkomin on concentration risk: "How do you build up a big company? Start with one large customer, add another, add another, and we're on that trajectory right now." "Now we have an over $20 billion customer. Of course, they're going to be a high percentage of our revenue. But the second step out was to get a foundation model lab using our hardware, and then the third step is our first hyperscaler customer with AWS, and we haven't even gotten started with them."

1
0
4
378
Bob Komin retweeted
Deirdre Bosa (@dee_bosa)
Cerebras CFO @bobkomin on concentration risk: "How do you build up a big company? Start with one large customer, add another, add another, and we're on that trajectory right now." "Now we have an over $20 billion customer. Of course, they're going to be a high percentage of our revenue. But the second step out was to get a foundation model lab using our hardware, and then the third step is our first hyperscaler customer with AWS, and we haven't even gotten started with them."
3
6
48
38.4K
Bob Komin retweeted
Andrew Feldman (@andrewdfeldman)
What is semiconductor yield? How does it work? Why did it define the semiconductor industry for 70 years? How did this problem get solved? And how does this impact developers?

What Is Semiconductor Yield?

When you manufacture chips, not every one comes out working. Some have defects. “Yield” is the percentage of chips from a manufacturing run that actually work. If you make 100 chips and 90 work, your yield is 90%.

How Does Yield Work?

Chips are made from silicon wafers - thin, circular discs about 12 inches in diameter. In a perfect world, every square millimeter of a wafer would be flawless. But that never happens. Every wafer has tiny random defects scattered across it. Chips are cut from these wafers. And any chip that lands on a defect is thrown away.

The process of chip manufacturing looks a lot like your mother making cookies. Imagine your mom rolled out a circle of cookie dough 12 inches in diameter. Then when she wasn't looking, your brother threw a handful of peanut M&Ms into the air and they landed at random on the dough. Those M&Ms are flaws. Nobody can eat a cookie with a peanut M&M in it. So she has to throw away every cookie that has one.

Now she gets out a small cookie cutter and stamps out cookies. Because the cookie cutter is small, the probability of hitting an M&M is low. And when a cookie does have one, there isn't much good dough surrounding it. Not much good dough is thrown away. The result: a lot of good cookies. They are small but there are a lot of them.

On the other hand, if she uses a big cookie cutter, the probability of hitting an M&M is much larger. And when she throws that cookie away, she throws away a lot of good dough with it. The result: only a few cookies. They are big, but the 12 inch diameter circle of dough yielded only a few.

This is exactly how chip manufacturing works.
The cookie dough is a silicon wafer.
The cookies are chips.
Peanut M&Ms are flaws (because they are gross).

Bigger chips hit more flaws. More good silicon gets thrown away. Smaller chips, like smaller cookies, are less likely to hit flaws. And when they do, less silicon is discarded. This is why big chips are disproportionately more expensive. This is also why people assumed that because there was no way to make a wafer without flaws, there was no way to make a chip the size of a wafer.

Why Did This Define The Industry For 70 Years?

In an ideal world, you'd build really big chips for many data center applications. Data moves incredibly fast on-chip. So if you keep the data and compute on-chip, your work takes less time, and uses less power. In AI, that manifests as super fast inference. But the moment data has to leave one chip and travel to another - through cables, switches, connectors, circuit boards - it slows down and uses more power. Lots of off-chip communication slows work, and, in AI, produces slow inference.

Though everyone agreed they were faster, nobody could yield big chips. So the industry settled on a workaround: don't build one big chip. Build thousands of small ones and wire them together. Most AI data centers are built this way today. Thousands of little GPUs connected by cables, switches, and networks. It works. But you pay a price. Every connection adds latency. Every cable adds overhead. Every hop between chips slows things down. For 70 years, everyone accepted this as the only way.

How Did Cerebras Solve the Yield Problem?

In 2019, we solved the yield problem at @cerebras and brought the first wafer sized processor, wafer scale processor, to market. How did we do that? The answer came from studying a different kind of chip entirely. Memory.

Memory is built with a different process. Memory chips are made up of millions of identical tiles, with redundant tiles woven throughout. In a memory chip, if a tile has a flaw in it, the chip doesn't get thrown away. The bad tile is shut down and one of the redundant ones is called into action. Memory chips weren't designed to avoid flaws, but rather to withstand them. They use redundancy to withstand flaws. And their yield is extraordinary.

Our founders realized that if we could develop a compute architecture that looked like memory, that was built of hundreds of thousands of identical tiles, we too could use redundancy to withstand flaws. We could fail in place, and route around the failed tile, just as they do in memory (and interestingly as they do in data centers where they fail in place, route around, and keep going). This would enable us to yield a wafer scale processor. And today we are happy to compare our yields to GPUs, that are 1/58th our size.

How Does This Impact Developers?

The impact is simple and easy to see. Cerebras wafer scale processors are up to 15 times faster than @nvidia GPUs. And when your AI is fast, people use it more often, stay longer, and use it to solve more interesting problems.
5
27
171
14.4K
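The cookie-cutter argument in the thread above can be put into numbers with a rough sketch. It assumes the textbook Poisson yield model (defects land independently at a fixed average density), which the thread does not name, and every figure in it (defect density, die areas, tile count, spare count) is a made-up illustration rather than anything Cerebras or a foundry has published. The function names `poisson_yield` and `redundant_yield` are hypothetical.

```python
import math


def poisson_yield(defect_density, area):
    """Chance that a die of `area` mm^2 contains zero defects.

    Assumes defects are scattered independently with an average of
    `defect_density` defects per mm^2 (Poisson model, an assumption).
    """
    return math.exp(-defect_density * area)


def redundant_yield(defect_density, tile_area, total_tiles, spare_tiles):
    """Chance that at most `spare_tiles` of `total_tiles` tiles are defective.

    Models "fail in place and route around": the part still works as long
    as the number of bad tiles does not exceed the number of spares.
    """
    # Probability that a single tile is hit by at least one defect.
    p_bad = -math.expm1(-defect_density * tile_area)
    if p_bad <= 0.0:
        return 1.0
    if p_bad >= 1.0:
        return 1.0 if spare_tiles >= total_tiles else 0.0
    log_p, log_q = math.log(p_bad), math.log1p(-p_bad)
    total = 0.0
    # Binomial CDF, summed in log space so large tile counts do not overflow.
    for k in range(min(spare_tiles, total_tiles) + 1):
        log_choose = (math.lgamma(total_tiles + 1) - math.lgamma(k + 1)
                      - math.lgamma(total_tiles - k + 1))
        total += math.exp(log_choose + k * log_p + (total_tiles - k) * log_q)
    return min(total, 1.0)


# Illustrative assumptions only, not real process data.
DEFECT_DENSITY = 0.001   # defects per mm^2
SMALL_DIE = 100.0        # mm^2, the "small cookie cutter"
LARGE_DIE = 800.0        # mm^2, the "big cookie cutter"
TILE_AREA = 0.05         # mm^2 per identical tile
TOTAL_TILES = 900_000    # tiles on the wafer-sized part, spares included
SPARE_TILES = 100        # bad tiles the part can tolerate

small = poisson_yield(DEFECT_DENSITY, SMALL_DIE)
large = poisson_yield(DEFECT_DENSITY, LARGE_DIE)
monolithic = poisson_yield(DEFECT_DENSITY, TILE_AREA * TOTAL_TILES)
tiled = redundant_yield(DEFECT_DENSITY, TILE_AREA, TOTAL_TILES, SPARE_TILES)

print(f"small die yield:            {small:.3f}")
print(f"large die yield:            {large:.3f}")
print(f"wafer-sized, no redundancy: {monolithic:.3e}")
print(f"wafer-sized, with spares:   {tiled:.3f}")
```

Under these assumed numbers the pattern matches the thread: yield falls as die area grows, a wafer-sized part with no tolerance for defects essentially never works, and the same silicon split into identical tiles with a small pool of spares yields almost every time.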
Bob Komin retweeted
Amazon Web Services (@awscloud)
We're teaming up with @cerebras to build the fastest possible inference. Coming soon to Amazon Bedrock, we’re delivering inference performance an order of magnitude faster than what’s available today by connecting AWS Trainium3 for compute-intensive prefill with Cerebras CS-3 to power decode. Learn more about the partnership. go.aws/3Pzcota
17
49
341
204.2K
Bob Komin (@BobKomin)
Thank God @Starlink is finally an alternative to monopoly cable & internet provider @Xfinity after 18 years! The cancellation process alone is reason enough to quit, on top of the poor service quality, and it literally needs to be outlawed. If I can sign up and upgrade in a click, I should be able to cancel the same way. Instead you have to schedule a callback and waste 12 minutes in a horrible process while a poor rep has to input info the whole time, ask a bunch of useless questions, and get yelled at. Truly the worst customer and employee experience ever.
0
0
1
103
Bob Komin retweeted
Cerebras (@cerebras)
[Image-only post]
8
7
101
6K
Bob Komin retweeted
Cerebras (@cerebras)
If you think Cerebras is just about speed, you do not understand Cerebras. Just as mass can be converted to energy, speed can be converted to intelligence. It's the natural consequence of test-time compute scaling.
40
43
825
105.8K
Bob Komin retweeted
Cerebras (@cerebras)
🔧 Just deployed additional speed optimizations for @cognition SWE-1.5. The fastest measured request was an eye-watering 1,881 tokens/s, per the Grafana dashboard. 🤯🤯
15
20
380
48.5K
Bob Komin retweeted
Cerebras (@cerebras)
Cerebras beats Nvidia H100 but can it beat Blackwell? Blackwell inference endpoints are finally out and it’s fast. It runs GPT-OSS-120B at ~700 tokens/s, leapfrogging H100 and Groq. Cerebras clocked in at 3,000 TPS - still #1. Looking forward to Rubin!
34
27
384
33.9K
Bob Komin retweeted
Cerebras (@cerebras)
Cerebras is proud to announce our record $1.1B Series G at an $8.1B valuation. As builders of the world’s fastest AI infrastructure, we’re scaling to meet explosive demand.

Funding will accelerate:
🚀 Breakthrough AI processor design, packaging, systems, and supercomputers
🚀 Expansion of U.S. manufacturing capacity and global data center footprint
🚀 Growth of Cerebras Inference and Training to meet explosive demand

The round was led by Fidelity Management & Research Company and Atreides Management, with participation from Tiger Global, Valor Equity Partners, 1789 Capital, Altimeter, Alpha Wave, and Benchmark.

👊 Let's go! cerebras.ai/press-release/…
26
38
483
30.8K