Gulaid

2.3K posts

@guled_29

Researcher

SF · Joined December 2008
2.2K Following · 1.2K Followers
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Why don’t LLMs just tell you when you are asking a question / doing something that is out of distribution?
313
61
2K
229.6K
Gulaid retweeted
Sergey Nazarov
Sergey Nazarov@sergeynazarovx·
We used to go to a special website, ask strangers for help with programming, and get humiliated in return
304
3.5K
39.5K
873.6K
Gulaid
Gulaid@guled_29·
@NBA__Courtside This guy cares more about his lil podcast than about not making the playoffs.
0
0
3
3.6K
NBA Courtside
NBA Courtside@NBA__Courtside·
Draymond Green on him and Devin Booker getting ejected: “So, obviously Devin Booker and I got ejected. Devin Booker told Scott Foster that I punched him in the stomach, which is a lie. Number one, a punch is with a close fist. Number two, it’s very obvious I was taking a foul. So, Scott Foster goes to review it, realizes I didn’t punch him in the stomach, and he call an away from the play foul. Guy’s passing the ball. So I’m not sure how it’s away from the play, but I think that’s Scott Foster’s way of well, there’s not really much here for me to call a flagrant foul or a technical foul, but I feel I need to do something. And so he calls it away from the play foul, which to me made absolutely no sense. However, I was standing on the sideline and Book was shooting a free throw and I was saying, “Yo, Book, I punched you.” And I just kept saying, “So, I punched you, you know, and he didn’t want to answer.” And so I kept asking him and then finally he was like, “Uh, man, I said I moved on, Dray. I ain’t know. I said I moved on.” And I asked again, so I punched you, you know, he got a little testy and started making some comments. So, I made some comments back to him, but make no mistake about it, man. Book, that’s my young fella. Always have been. That relationship stems back to Book being in high school. And ain’t nothing going to change that relationship. We had that moment. What really pissed me off was we got ejected. We weren’t cursing at each other or anything. We were talking back and forth. And as you can see by AD prank show, I don’t do well with like shut up techs, you know, like I’m going to give you a tech. So you guys have to stop talking. That sh*t don’t work for me.” (Via @DraymondShow)
NBA Courtside@NBA__Courtside

Draymond Green on his future with the Warriors: “I hope it’s not the end… I have a player option coming up this summer and I don’t know what’s going to happen… this year was a little bit different from me even going through the trade talks at the deadline. Not quite knowing whether I’ll get traded or not. I’ve never gone through that before…. Going into the summer, I don’t know. We’ll see. Ultimately I think I’d love to finish in the place I started, but there must be things held up on both ends” (Via @DraymondShow)

126
114
4K
968K
Gulaid
Gulaid@guled_29·
@firstadopter maybe you should start a podcast too... insightful Substack!
0
0
1
1.3K
tae kim
tae kim@firstadopter·
Everyone should read what's below. This is why actually knowing your stuff instead of naively regurgitating a particular startup's marketing propaganda bullet points is important. I've also included a screenshot of my Substack writeup of Nvidia's Bill Dally and Google's Jeff Dean GTC session that confirms Gavin's analysis.
Gavin Baker@GavinSBaker

Much of Dwarkesh's argument hinges on this statement which *was* accurate but will be increasingly inaccurate on a go forward basis imo:
“American labs port across accelerators constantly. Anthropic's models are run on GPUs, they're run on Trainium, they're run on TPUs. There are so many things you can do, from distilling to a model that's well fit for your chips.”
As system level architectures diverge (torus vs. switched scale-up topologies, memory hierarchies, networking primitives), true portability is eroding. The Mi300 and Mi325 had roughly the same scale-up domain size as Hopper while Blackwell’s scale-up domain is 9x larger than the Mi355 scale-up domain, etc. Many frontier models are now being explicitly co-designed for inference on specific hardware like GB300 racks. Codex on Cerebras is another example. Those models run less efficiently on other systems and the performance differentials will only widen. A model that runs well on Google’s torus topology will run less efficiently on Nvidia’s switched scale-up topology and vice versa - the data traffic is fundamentally different as a byproduct of the models being parallelized across the different topologies. Google’s internal teams - and increasingly the Anthropic teams as they become the most important customer of almost every cloud - have the luxury of operating across the stack (models, chips, networking) - but that is not the case for the rest of the market and other prospective users. Anthropic is the exception, not the rule. To wit, Anthropic and Google allegedly have a mutual understanding where Anthropic can hire the TPU engineers they need every year to ensure that they can continue to get the most out of the TPU. Given the overwhelming importance of cost per token to the economics of the labs, models will be run where they run best. Most extremely large MoE models will run best on GB300s given the importance of having a switched scale-up network like NVLink for MoE inference.
When training was the dominant cost for labs and power was broadly available, labs were optimizing to minimize capex dollars. Model portability was a way to create leverage over suppliers. I think that drove a lot of the focus on portability. Today, inference costs as measured by tokens per watt per dollar are everything. Inference is way more important than training costs (inference is effectively now part of training via RL). Labs are therefore now optimizing for inference. This means increasing co-design and higher go-forward switching costs for individual models between systems. I do think this explains why Anthropic and Nvidia came together: Anthropic needed Blackwells and Rubins to inference at least *some* of their models economically. And Mythos might just end up being released coincident with the availability of Rubins for inference. TLDR: as labs shift their focus from training to inference, the costs of portability and the upside of co-design to maximize tokens per watt per dollar both rise. Portability is likely to begin decreasing as a result.
I think what I might have respectfully added to Jensen’s answer is that systems evolve under local selective pressures. The evolutionary pressure in America is a shortage of watts so it makes sense for Nvidia to optimize, as an American company, for power efficiency and tokens per watt and stay on copper as long as possible. China has a surfeit of watts. Chinese AI systems are already taking advantage of this with the Huawei Cloudmatrix 384 and Atlas SuperPoD having an optical scale-up domain that is much larger than anything offered by Nvidia today at the cost of *much* higher power consumption and much lower tokens per watt. The networking primitives for this Huawei system are very different than those for Nvidia’s systems and a model that runs well on Nvidia will not run well on that system and vice versa.
This means that if a Chinese ecosystem gets momentum, Chinese models might stop running well on American hardware. And when Chinese models run best on American hardware, America is in a better position as this gives America a degree of leverage and control over Chinese AI that it risks losing to an all-Chinese alternative ecosystem.
This architectural fork makes porting and distillation less effective and strengthens the pro-American national security case for selling China deprecated GPUs imo. Also I will attest that I did not wake up a loser this morning.

12
54
489
310.3K
Gulaid
Gulaid@guled_29·
@RobBfromDerby We better listen to him, man! He's just crazy enough to do it!
0
0
0
306
Rob B
Rob B@RobBfromDerby·
“Open the Strait of Hormuz or I’m closing the Strait of Hormuz”
741
17.7K
146.8K
2.4M
safia aidid
safia aidid@safiyaaaay·
some news 💕
66
49
605
13.9K
Joe Weisenthal
Joe Weisenthal@TheStalwart·
The critics who talk about LLM's hallucination problem are, IME, completely correct. Other than for coding, I just use the chatbots as glorified search engines, and I don't think you should trust a word they say if you can't find a link to back it up.
139
126
2.3K
334.2K
Gulaid
Gulaid@guled_29·
@TheStalwart I hope this is an experiment to measure meaningful productivity and wasted tokens. Otherwise, it does not make sense.
0
0
0
116
Austrolibertaria
Austrolibertaria@AustroLibertari·
Coming to Persian movie theaters in 2028:
3
12
64
5.7K
Gulaid retweeted
NASA
NASA@NASA·
It’s not a straight shot to the far side of the Moon! 🌕 Over approximately 10 days, the Artemis II astronauts will orbit Earth twice before looping around the far side of the Moon in a figure eight and returning home.
3.4K
20.9K
144.3K
10.2M
Dylan Abruscato
Dylan Abruscato@DylanAbruscato·
TBPN will remain an independent platform for founders to share news, launch products, and interact with the most engaged audience in tech. We’ll maintain full editorial control while working with OpenAI to scale the show’s reach and production. See you at 11a PT, every weekday.
33
9
230
110K
Dylan Abruscato
Dylan Abruscato@DylanAbruscato·
OpenAI is acquiring TBPN This has been a dream job and the show only gets better from here
165
72
1.2K
185K
John Coogan
John Coogan@johncoogan·
TBPN has been acquired by OpenAI! The show is staying the same and we’ll continue to go live at 11am pacific every weekday. This is a full circle moment for me as I’ve worked with @sama for well over a decade. He funded my first company in 2013. Then helped us fix a serious logjam during a critical funding round a few years later. When I took my second company through YC, he was president at the time, and then when I joined Founders Fund, the first deal I saw in motion was the post-ChatGPT round in late 2022. And as we started growing TBPN last year, he was the very first lab lead to join the show. Thank you to everyone that has been a part of TBPN until now. The last year has been the most fun and rewarding part of my career and we’re excited to have more resources than ever going forward.
1.3K
414
8.8K
3.1M
Gulaid
Gulaid@guled_29·
@TheStalwart @Havelock_AI He should have done a TikTok video instead. Most Americans read at a 7th-8th grade level according to the National Literacy Institute.
0
0
5
545
Joe Weisenthal
Joe Weisenthal@TheStalwart·
Just 31%. Iranian President Masoud Pezeshkian communicates to the American people in a very literate register, according to @Havelock_AI
20
23
287
105.6K
Gulaid
Gulaid@guled_29·
@TechNerd4everr They are playing catch-up. Perplexity is only a 400-member team, and they're not super profitable financially. It makes sense.
0
0
0
20