Arin





To new beginnings



India's top GitHub committer's commit history 🤡 How big a pay package do you think he'll land? Write in the comments



Our MIT class “6.S184: Introduction to Flow Matching and Diffusion Models” is now available on YouTube! We teach state-of-the-art generative AI algorithms for images, videos, proteins, etc., together with the mathematical tools to understand them. diffusion.csail.mit.edu (1/4)

unofficial guide to nyc’s creative underground, aka places i wish i knew sooner
- nyc resistor: og hacker space for soldering to fiber arts
- hex house: immersive art + tech playground
- telos.haus: creative clubhouse w/ workshops & exhibitions
- heart 442: artist-run soho spot for internet culture
- fractal nyc: communal living, self-run courses and psychedelic basement parties


why did R1's RL suddenly start working, when previous attempts to do similar things failed? theory: we've basically spent the last few years running a massive acausally distributed chain-of-thought data annotation program on the pretraining dataset.

deepseek's approach with R1 is a pretty obvious method. they are far from the first lab to try "slap a verifier on it and roll out CoTs," but it didn't use to work that well. all of a sudden, though, it did start working. and reproductions of R1, even ones using slightly different methods, are just working too--it's not some super-finicky method that deepseek lucked out finding. all of a sudden, the basic, obvious techniques are... just working, much better than they used to.

in the last couple of years, chains of thought have been posted all over the internet (LLM outputs leaking into pretraining like this is usually called "pretraining contamination"). and not just CoTs--outputs posted on the internet are usually accompanied by linguistic markers of whether they're correct or not ("holy shit it's right", "LOL wrong"). this isn't just true for easily verifiable problems like math, but also for fuzzy ones like writing.

those CoTs in the V3 training set gave GRPO enough of a starting point to start converging, and furthermore to generalize from verifiable domains to non-verifiable ones, using the bridge established by the pretraining data contamination.

and now, R1's visible chains of thought are going to lead to *another* massive enrichment of human-labeled reasoning on the internet, but on a far larger scale... the next round of base models post-R1 will be *even better* bases for reasoning models.
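
(for anyone who hasn't seen the recipe the post is referring to: below is a minimal, purely illustrative sketch of "slap a verifier on it and roll out CoTs" with GRPO-style group-relative scoring. `sample_cots`, the `answer: <value>` extraction convention, and the toy verifier are hypothetical stand-ins, not deepseek's actual code; the only piece borrowed from GRPO is normalizing each rollout's verifier reward against the mean and std of its own group instead of using a learned value model.)

```python
# minimal sketch: verify a group of sampled CoTs and compute group-relative
# advantages. names like `sample_cots` and the "answer: <value>" convention
# are hypothetical stand-ins, not any lab's real implementation.

import re
from typing import Callable, List


def extract_answer(cot: str) -> str:
    """Assumed convention: the model ends its CoT with 'answer: <value>'."""
    match = re.search(r"answer:\s*(.+)$", cot.strip(), re.IGNORECASE)
    return match.group(1).strip() if match else ""


def verifier_reward(cot: str, reference: str) -> float:
    """Binary verifier for an easily checkable domain (e.g. math):
    1.0 if the extracted final answer matches the reference, else 0.0."""
    return 1.0 if extract_answer(cot) == reference else 0.0


def group_relative_advantages(rewards: List[float]) -> List[float]:
    """GRPO-style advantage: score each rollout relative to the mean and std
    of its own group of rollouts, with no learned value model."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0  # avoid division by zero when all rewards are equal
    return [(r - mean) / std for r in rewards]


def score_rollouts(prompt: str,
                   reference: str,
                   sample_cots: Callable[[str, int], List[str]],
                   group_size: int = 8):
    """Roll out a group of CoTs for one prompt, verify them, and return
    (cot, advantage) pairs that a policy-gradient update would weight by."""
    cots = sample_cots(prompt, group_size)  # hypothetical model call
    rewards = [verifier_reward(c, reference) for c in cots]
    advantages = group_relative_advantages(rewards)
    return list(zip(cots, advantages))


if __name__ == "__main__":
    # toy stand-in for the policy: pretend the model produced these rollouts
    fake_sampler = lambda prompt, n: [
        "2 + 2... carry the... answer: 4",
        "hmm, 2 + 2 is like 22. answer: 22",
    ] * (n // 2)
    for cot, adv in score_rollouts("what is 2 + 2?", "4", fake_sampler, 4):
        print(f"advantage {adv:+.2f}  |  {cot}")
```

a real pipeline would feed those (cot, advantage) pairs into a clipped policy-gradient update with a KL penalty to the reference model; the sketch stops at the reward/advantage step since that's the part the post is about.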











