Raquaza

17 posts

@avilash1396

Don't just move things around...

Joined August 2013
650 Following 32 Followers
Raquaza@avilash1396·
@i2cjak Thanks! I am not familiar with the power supply side of analog so this was a good read!
i2cjak@i2cjak·
This is fed into an op-amp that constantly adjusts its output voltage to control the NMOS transistor (previously a 2k resistor). The op-amp uses feedback from the transistor's source, compares to the 3.3V on the positive terminal, and adjusts! The loop has been closed!
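A rough behavioral sketch of that closed loop, not a circuit-accurate model. It assumes the pass transistor can be treated as a variable conductance and the op-amp as a slow integrator that nudges that conductance until the output matches the 3.3V reference. The function name `regulate`, the `gain` value, and the load resistances are all illustrative assumptions, not values from the thread.

```python
def regulate(v_in, v_ref, r_load, gain=5e-4, steps=2000):
    """Settle a toy feedback loop and return the final output voltage.

    Assumptions (hypothetical, for illustration only):
    - the pass transistor is a variable conductance g_pass in series with the load
    - the op-amp is an integrator: it raises g_pass while the output is below
      v_ref and lowers it while the output is above v_ref
    - gain is tuned for loads of roughly 100 ohm to 1 kohm
    """
    g_load = 1.0 / r_load
    g_pass = 0.0  # transistor starts fully off
    v_out = 0.0
    for _ in range(steps):
        # Series divider formed by the pass element and the load.
        v_out = v_in * g_pass / (g_pass + g_load)
        # Feedback: compare the output to the reference and adjust the pass element.
        error = v_ref - v_out
        g_pass = max(0.0, g_pass + gain * error)
    return v_out


if __name__ == "__main__":
    for r_load in (100.0, 330.0, 1000.0):
        print(f"load {r_load:6.0f} ohm -> {regulate(5.0, 3.3, r_load):.4f} V")
```

In this toy model the output settles near 3.3V for each of the loads tried, because the loop keeps changing the pass element until the error goes away. A fixed resistor in the same position cannot do that.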
i2cjak@i2cjak·
"haha stupid electrical engineers with their LDOs and bucks. Look how easy it is for me to get 3.3V out of 5V!" - ignorant sw"e"

Linear regulators are, honestly, a bit like a resistor divider. However, they have to have some sort of feedback. Look what happens if we DON'T >>>
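One thing that goes wrong without feedback is that the output moves with the load. A minimal sketch of the arithmetic follows. The resistor values (1.7k on top, 3.3k on the bottom) and the 330 ohm load are assumptions chosen to give round numbers; they are not taken from the attached image.

```python
def divider_output(v_in, r_top, r_bottom, r_load=None):
    """Output voltage of a resistor divider.

    If r_load is given, it sits in parallel with the bottom resistor,
    which lowers the effective bottom resistance and drags the output down.
    """
    if r_load is None:
        r_eff = r_bottom
    else:
        r_eff = r_bottom * r_load / (r_bottom + r_load)
    return v_in * r_eff / (r_top + r_eff)


if __name__ == "__main__":
    # Assumed values: 5 V in, 1.7k over 3.3k gives 3.3 V with nothing attached.
    print(divider_output(5.0, 1700.0, 3300.0))
    # Attach a 330 ohm load (about 10 mA at 3.3 V) and the output collapses.
    print(divider_output(5.0, 1700.0, 3300.0, r_load=330.0))
```

With these assumed values the unloaded divider gives 3.3V, but the 330 ohm load pulls it down to 0.75V. The divider has no way to notice the sag or correct it.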
Steve the Beaver@beaversteever·
why do we call it a GPU if it's just a software defined math device
Raquaza@avilash1396·
@BrianRoemmele 8/ Lastly, the gain cell array is beneficial only with large parallelism. Since you're trying to build it, try finding an IC vendor who provides RRAM memories. Making this on a PCB, which I am guessing you are since you're running SPICE sims, loses on power, performance and area. Good luck.
Raquaza@avilash1396·
@BrianRoemmele 7/ Will Nvidia try to make something that saves them 30-50% of the energy? Unlikely. But there are AI chip startups that are targeting in-memory compute. They are doing good work. Some even have products out.
Brian Roemmele@BrianRoemmele·
BOOM! MAJOR AI SPEEDUP! Hot Rod AI: 100 times faster inference, 100,000 times less power!

Reviving Analog Circuits: A Leap Toward Ultra-Efficient AI with In-Memory Attention

I got my start in analog electronics when I was a kid and always thought analog computers would make a comeback. The analog neural networks of the 1960s used voltage-based circuits rather than binary clocks.

Analog is Faster Than Digital

At the core of large language models lies the transformer architecture, where self-attention mechanisms sift through vast sequences of data to predict the next word or token. On conventional GPUs, shuttling data between memory caches and processing units devours time and energy, bottlenecking the entire system. They require a clock cycle to precisely move bits in and out of memory and registers, and this is >90% of the time and energy overhead.

But now a groundbreaking study proposes a custom in-memory computing setup that could slash these inefficiencies, potentially reshaping how we deploy generative AI. The innovation centers on "gain cells": emerging charge-based analog memories that double as both storage and computation engines. Unlike digital GPUs, which laboriously load token projections from cache into SRAM for each generation step, this architecture keeps data where the math happens: right ON THE CHIP! With a clock speed near THE SPEED OF LIGHT, because it is never on/off like in digital binary. By leveraging parallel analog dot-product operations, the design computes self-attention natively, sidestepping the data movement that plagues GPU hardware.

To bridge the gap between ideal digital models and the noisy realities of analog circuits, the researchers devised a clever initialization algorithm. This method adapts pre-trained LLMs, such as GPT-2, without the need for full retraining, ensuring seamless performance parity despite non-idealities like voltage drifts or precision limits.

The results are nothing short of staggering! Simulations show the system making attention 100 times faster for token generation while cutting energy use by a jaw-dropping five orders of magnitude, or 100,000 times less than GPU baselines. For context, this could mean running a full LLM on a device no larger than a card deck, without the thermal throttling or grid-straining demands of today's data centers. The approach targets the attention block specifically, the transformer's energy hog, but also points to broader integration with other in-memory techniques to turbocharge the entire model pipeline.

Analog tech isn't pie-in-the-sky quantum wizardry; it's grounded in old, mature electronics theory, with gain cells already prototyped in labs. The only engineering issues, and they are simple: tolerances for noise, scaling arrays of cells, and fabricating at microchip densities. Existing CMOS processes need tweaks for analog fidelity. From there, full ecosystem integration, including software stacks for model adaptation, could happen in a year, disrupting GPU dominance sooner than skeptics predict. Risks are low, but hybrid digital-analog interfaces could introduce unforeseen bugs. However, this can be rapidly iterated on and addressed.

This isn't just hardware tinkering; it's a philosophical pivot back to AI's analog origins, where computation flows continuously rather than ticking in discrete cycles. This in-memory attention could democratize AI power, making low-power, lightning-fast AI not a luxury but an inevitability for even the smallest devices.

Most have no idea how big this is: it is the biggest shift in AI since the invention of LLMs. The world will struggle to find truly experienced analog engineers; most are gone.

In my garage I will have test analog CMOS gain cells built from off-the-shelf parts in the next few days; if Radio Shack were still around I would have done it today. I suspect I can scale to a proto AI model in a few weeks.

PAPER: arxiv.org/abs/2409.19315
Raquaza@avilash1396·
@bubbleboi Too late imo, most companies have already shifted large stacks of physical design to India, and it will continue. Architecture and design are still being done in the US, and currently digital design is slowly being outsourced.
Raquaza@avilash1396·
@bubbleboi Fast chips hit the TDP limit, so there could only be a single core that runs that fast. Parallelism is more important now. Also, I think you're referring to only computer architecture when you mention those schools. Not the full picture.
bubble boi@bubbleboi·
Stanford & Berkeley are like the only universities left in the world that can design fast chips. There is also like this one professor at Yale who can too but it’s just him he’s the whole ECE department. It’s a real problem in my eyes because we need to train the next generation of computer engineers to make a ton of chips. And outside of Broadcom, Intel, & Marvell very few are capable of the physical design skills necessary to make fast chips. In the next twenty years it’s possible that physical design will be so automated but still mediocre that we are just stuck with mid basic ass chips. It keeps me up at night I’ll tell you.
Raquaza@avilash1396·
@george__mack This is incorrect. The outcome heavily relies on the actions taken. One can be a pessimist yet still take the correct actions to succeed.
George Mack@george__mack·
Peter Thiel on Pessimism. Gold.