TSMCCruz

3.4K posts

@TSMCCruz

Reality is yours to build. Opinions my own.

US TOGETHER · Joined February 2024
1.5K Following · 28 Followers
Pinned Tweet
TSMCCruz @TSMCCruz
Just a reminder...
[image]
Replies 0 · Reposts 0 · Likes 0 · Views 321

Chris 🇨🇦 @llm_wizard
Over the last week, we had to say goodbye to the little orange menace we affectionately referred to as "The Boy". Hug your pets just a little tighter - they're too good for us.
[image]
Replies 10 · Reposts 0 · Likes 30 · Views 1.3K

TSMCCruz retweeted
DAIR.AI @dair_ai
Are your benchmarks actually measuring the capability you think they measure? A new paper says they probably are not. Coining the term "The Evaluation Trap", it provides a vocabulary for auditing whether your eval discriminates the underlying capability or just proxies behaviors that happen to correlate. Most benchmarks bake in an implicit theory that nobody states explicitly, then evaluate as if the theory were neutral. Research indicates that most agent leaderboards are not measuring what we collectively think they are. Great read on evals, especially for those making decisions on model selection.
Paper: arxiv.org/abs/2605.14167
Learn to build effective AI agents in our academy: academy.dair.ai
[image]
Replies 4 · Reposts 8 · Likes 51 · Views 5.4K

TSMCCruz retweeted
NVIDIA AI PC @NVIDIA_AI_PC
Run @NousResearch's Hermes Agent fully locally on DGX Spark. 🚀 Our newest playbook shows you how to get set up via @Ollama step by step. 👇
[image]
Replies 87 · Reposts 131 · Likes 1.3K · Views 234.5K

TSMCCruz retweeted
Gauri Tripathi @Gauri_the_great
dude just doesn't stop
[image]
Replies 1 · Reposts 11 · Likes 305 · Views 5.8K

TSMCCruz retweeted
elie @eliebakouch
Arthur Mensch answers French representatives: "our (mistral) models are capable of finding all the vulnerabilities found by mythos"
"There are obviously people asking if they can buy us. We answer [no] because that's not our mission, and our mission is to be independent, [...] If you succeed, you don't get acquired. If you get acquired, in a way, you've failed"
some numbers:
> 1B R&D spend at Mistral this year
> at Mistral 10% of salary mass is spent on tokens
> estimates that 1 employee (in general, not at mistral) will consume on average ~1kW in tokens per year, which is ~10k$
> 1GW datacenter is $50B capex over 5 years. you can expect to make 2x revenue. electricity captures ~10% of value.
> revenue is 30% in France, rest of Europe is ~45%. public sector share is 20% with 10% in France.
> a bit less than 30% of Mistral capital is held by US VCs
> Mistral's goal is 1GW in 2029
> they train/will train bigger models internally and distill them to serve to customers
> Mistral plays only a small part in the 35B investment (by MGX from UAE) in France, in the "campus AI" project announced at the AI summit earlier this year
some of their current clusters:
> 40MW in France
> 25MW in Sweden
> 80MW in France (next year)
> they train models on "10s of MW", mention that they need the GPUs to be co-located to train models
> insists on the fact that the EU/France advantage for building datacenters is nuclear power, which leads to a smaller carbon footprint
[image]
Replies 19 · Reposts 17 · Likes 251 · Views 121.7K

TSMCCruz retweeted
Sebastian Raschka
New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4. I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings, layer-wise attention budgets, compressed attention, and mHC. Link: magazine.sebastianraschka.com/p/recent-devel…
[image]
Replies 26 · Reposts 257 · Likes 1.4K · Views 60.4K

Chubby♨️ @kimmonismus
Claude is lazy, but has taste and context (not talking about 4.7 tho). Codex is eager, but still lacks some taste and context. Once Codex gets both, it’s over.
Replies 97 · Reposts 22 · Likes 759 · Views 37.6K

TSMCCruz @TSMCCruz
@jondipietronh @mcuban With the correct policy approach of the admin, I think Jon is actually correct on all points.
Replies 0 · Reposts 0 · Likes 0 · Views 8

Jon DiPietro @jondipietronh
@mcuban
1. A free market is all the incentive they need to optimize
2. They're already incentivized to reduce energy cost
3. More money to the govt is bad
4. The govt will never use additional revenue to pay down debt - they'll spend it
Replies 21 · Reposts 0 · Likes 847 · Views 31.5K

Mark Cuban @mcuban
We should federally tax Tokens at the Provider level. Not a lot. Less than 50c per million tokens. It will accomplish 4 things (at least):
1. It will push the big AI players to optimize tokenization, caching, routing and localization. Which will
2. Reduce energy usage, saving them more in energy costs than what they paid in tax and reducing the strain created by the growth in energy consumption. Which will
3. Generate maybe 10 billion dollars a year to start, but over the next ten years could grow 30x to 100x. Which will
4. Create a source of funding to pay down the federal debt or to deploy in response to the things AI brings that we don’t expect or don’t like.
At some point the models will pass it on to customers. Of course. That’s ok. Customers will have the ability to choose between providers. Or to do everything using open source models locally. Thoughts?
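As a rough sanity check on the figures in the post above, the sketch below works out how many tokens per year would have to be taxed to raise the stated starting revenue, and what the stated growth multiples would imply. It assumes the tax sits at the 50c-per-million cap (the post only says "less than 50c"), and the function and constant names are illustrative, not from the post.

```python
# Rough check of the numbers in the post; every input is quoted from the post itself.
TAX_USD_PER_MILLION_TOKENS = 0.50  # assumed upper bound: "less than 50c per million tokens"
STARTING_REVENUE_USD = 10e9        # "maybe 10 billion dollars a year to start"
GROWTH_RANGE = (30, 100)           # "30x to 100x" over the next ten years


def implied_tokens_per_year(revenue_usd, tax_usd_per_million_tokens):
    """Tokens that must be taxed per year to raise `revenue_usd` at the given rate."""
    return revenue_usd / tax_usd_per_million_tokens * 1_000_000


def projected_revenue_range(revenue_usd, growth_range):
    """Annual revenue after applying the low and high growth multiples."""
    low_multiple, high_multiple = growth_range
    return revenue_usd * low_multiple, revenue_usd * high_multiple


tokens = implied_tokens_per_year(STARTING_REVENUE_USD, TAX_USD_PER_MILLION_TOKENS)
low, high = projected_revenue_range(STARTING_REVENUE_USD, GROWTH_RANGE)

print(f"Implied taxed volume: {tokens:.1e} tokens/year")
print(f"Revenue after growth: ${low:,.0f} to ${high:,.0f} per year")
```

At the 50c cap, 10 billion dollars a year corresponds to 2e16 taxed tokens per year; any lower rate would need proportionally more tokens to raise the same amount.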
Replies 2.1K · Reposts 250 · Likes 3.9K · Views 973K

TSMCCruz retweeted
Dwarkesh Patel @dwarkesh_sp
In 2022, I cold emailed @bryan_caplan about how much I liked his books. I was a college sophomore. COVID had hit - I was stuck at home and very bored. I wasn't expecting him to respond, but he wrote back an encouraging message. That gave me enough temerity to ask him to be the inaugural guest of a podcast I was thinking of starting. And he gave a total rando kid a good amount of his time. Even more generously, that summer, he allowed me to join him for lunch nearly every day, and from these daily debates with him, I learned a lot about how to ask interesting questions and evaluate different viewpoints. Grateful to call Bryan a friend and a mentor.
[image]
Bryan Caplan @bryan_caplan

"What a privilege it is to know that I helped the great Dwarkesh get his start! He probably would have done just as well without me, but conceivably I really was the marginal factor. Regardless, I’ve made a wonderful friend." betonit.ai/p/my-dwarkesh-…

Replies 22 · Reposts 58 · Likes 2K · Views 137.7K

Brett Adcock @adcock_brett
You asked, we delivered: Figure livestream merch is now live. Limited quantity, get one before they sell out: figure-ai.myshopify.com
p.s. we run 24/7, they're $24.07 😂
[images]
Replies 78 · Reposts 46 · Likes 564 · Views 159.5K

Brett Adcock @adcock_brett
We're live for Day 3! Watch our humanoid robots running 24/7 with full autonomy. We will be running until robot failure. x.com/i/broadcasts/1…
Replies 139 · Reposts 176 · Likes 1.6K · Views 255.1K

TSMCCruz retweeted
Ahmad @TheAhmadOsman
Friends,

Know this. I do everything I do because I care. From the bottom of my heart, that is what motivates me. Local AI, opensource and technology as a whole are all areas I am deeply passionate about. If you've heard me speak, you can hear it in my voice. I love helping others just as much as I love learning.

This is why I also feel it's crucial to preserve the integrity of this community. If you found me putting people on blast a bit abrasive, I want you to appreciate that I am trying to protect this community. I do not want it to become like the shadier parts of crypto that many of these people came from and are now dressing up as AI experts without the requisite experience.

I'm all for coming up together and lifting others. I want to help as many people as I can. That is what drives me. But in order to do that we have to be honest with each other. We have to want to provide meaningful, accurate, and actionable advice. We can't just make stuff up as we go. And we certainly can't be dishonest about our motives / getting paid to sell people on ideas that are theoretical or outright do not work. We have to keep each other in check and stay true to each other.

I'm going to keep doing what I do because I love it and I want opensource and you all to thrive with me.

Sincerely,
Ahmad
Replies 32 · Reposts 4 · Likes 213 · Views 7.8K

TSMCCruz retweeted
Elon Musk @elonmusk
ZXX
Replies 22.4K · Reposts 45.8K · Likes 443.3K · Views 80.5M

TSMCCruz retweeted
Sebastian Raschka
Meta observation: DeepSeek is still king of the active-parameter ratio
[image]
Replies 20 · Reposts 36 · Likes 317 · Views 54.5K

TSMCCruz retweeted
Nous Research @NousResearch
Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data. During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining. Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE. The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.
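The bag-of-tokens phase described above can be sketched roughly as follows. This is a minimal illustration, not the released TST implementation: the post does not spell out the "modified cross-entropy", so the sketch assumes a cross-entropy against a uniform target over the tokens of the next bag, and the names `bag_inputs` and `bag_cross_entropy` are hypothetical.

```python
import math


def bag_inputs(token_ids, embedding, bag_size):
    """Split tokens into contiguous bags and average each bag's embeddings.

    Trailing tokens that do not fill a whole bag are dropped.
    """
    bags = [token_ids[i:i + bag_size]
            for i in range(0, len(token_ids) - bag_size + 1, bag_size)]
    dim = len(embedding[0])
    averaged = [
        [sum(embedding[t][d] for t in bag) / bag_size for d in range(dim)]
        for bag in bags
    ]
    return bags, averaged


def bag_cross_entropy(logits, target_bag):
    """ASSUMED loss: cross-entropy against a uniform target over the next bag's tokens."""
    m = max(logits)
    # Numerically stable log of the softmax normalizer.
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(logits[t] - log_z for t in target_bag) / len(target_bag)


# Toy usage: vocabulary of 4 tokens, 2-dimensional embeddings, bags of 2 tokens.
embedding = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [2.0, 4.0]]
bags, averaged = bag_inputs([1, 2, 3, 0], embedding, bag_size=2)
# The model would read averaged[0] and be scored on predicting the tokens in bags[1].
loss = bag_cross_entropy([0.0, 0.0, 0.0, 0.0], bags[1])
```

With a bag size of 1, the assumed loss reduces to ordinary next-token cross-entropy, which is consistent with the post's statement that the rest of the run trains normally on next-token prediction.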
[image]
Replies 149 · Reposts 418 · Likes 3.7K · Views 431.2K

TSMCCruz retweeted
Chubby♨️ @kimmonismus
The US has cleared roughly 10 Chinese firms to buy Nvidia's H200. Alibaba, Tencent, ByteDance, JD. So far not a single chip has shipped.

Until the chips actually move, the licenses work as a bargaining position rather than a finished deal. Washington keeps the H200 in reserve and can redeem it only if Beijing gives something back, on rare earths, on trade, on the tone toward Taiwan.

The staging points the same way. Huang wasn't on the original delegation list. Trump invited him and picked him up in Alaska on the way to meet Xi. The CEO of the most important chipmaker is traveling as part of the leverage, not as a guest.

The more interesting possibility is that the bottleneck sits in Beijing, not Washington. China has spent months pushing its champions toward domestic hardware, Huawei Ascend, homegrown clusters. Ordering 75,000 H200s would rebuild the same US dependency those firms are supposed to be shedding. The licenses may already be in hand while the Chinese buyers hold off on their own.

That would explain why the limbo suits both governments. US hawks don't actually want the chips in China, and Beijing wants self-sufficiency. An approval that never gets redeemed looks like progress and commits no one to anything.

The number worth watching is deliveries, not approved firms. While it stays at zero, this is diplomacy dressed as commerce.
[image]
Replies 22 · Reposts 17 · Likes 203 · Views 17.7K

TSMCCruz retweeted
Anjney Midha @AnjneyMidha
a lot of VCs have lost touch with the youths and it's starting to show
i mean i'm lame and uncool too but gosh the tone deafness is next level
Replies 38 · Reposts 12 · Likes 411 · Views 42.8K

TSMCCruz retweeted
Ahmad @TheAhmadOsman
Ahmad, why are you beefing?
- I am not beefing, I am protecting Local AI from turning into a scam like Crypto
Alex Finn used to be called NFT King
I am not letting that cancer grow in this space that I so very much care about
PERIOD
Ahmad @TheAhmadOsman

This is Alex Finn
He’s cost so many people their hard-earned money during his Mac mini grift
Now he’ll reuse the same script for the DGX Spark
He doesn’t know how to highlight any hardware strengths/weaknesses
Zero substance or knowledge of local AI
Go grift something else

Replies 52 · Reposts 14 · Likes 374 · Views 18.3K

TSMCCruz retweeted
Anjney Midha @AnjneyMidha
hi team - what is a realistic budget needed for this? pls lmk - @amppublic is happy to sponsor a US wide program for frontier robotics in high schools
Steadtler @SteadtlerA58435

@AnjneyMidha Any serious US high school robotics team could build this if they had the budget. Much less a university.

Replies 28 · Reposts 7 · Likes 164 · Views 40.4K