Jaskaran Singh

2.6K posts


@jasksing

Humanity is Awesome! | Engineer

Joined June 2021
1.2K Following · 178 Followers
Jaskaran Singh reposted
Jaynit
Jaynit@jaynitx·
Terence Tao: "Previously, you needed a PhD to contribute to math research. Now a high school student can."

Dwarkesh asks the world's most famous mathematician: what's your advice for someone considering a career in math, especially in light of AI progress?

Tao is honest about uncertainty: "We live in a time of change. A particularly unpredictable era. Things that we've taken for granted for centuries may not hold anymore. The way we do everything... not just mathematics... will change."

He admits his preference: "In many ways, I would prefer a much more boring, quiet era where things are much the same as they were 10 or 20 years ago. But one just has to embrace this. There's going to be a lot of change. The things you study... some of them may become obsolete or revolutionized. But some things will be retained."

On new opportunities: "Previously, you had to go through years and years of education and get a math PhD before you could contribute to the frontier of math research. But now it's quite possible at the high school level that you could get involved in a math project and actually make a real contribution... because of all these AI tools and Lean and everything else."

His advice: "There will be a lot of non-traditional opportunities to learn. You need a very adaptable mindset. It'll be worth pursuing things just for curiosity and for playing around. Still go through traditional education and learn math and science the old-fashioned way for a while... credentials will still be important. But you should also be open to very, very different ways of doing science. Some of which don't exist yet."

He concludes: "It's a scary time. But also very exciting."
19 replies · 130 reposts · 677 likes · 76.6K views
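For a concrete taste of the "Lean and everything else" Tao mentions: a newcomer's contribution to a formalization project is often just a short machine-checked proof. A minimal, illustrative Lean 4 example (the theorem name is ours, not from the interview):

```lean
-- A tiny machine-checked proof in Lean 4: commutativity of addition
-- on natural numbers, discharged by a lemma from the core library.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Projects like mathlib accept small, self-contained lemmas at roughly this granularity, which is part of why the entry barrier Tao describes has dropped.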
Jaskaran Singh reposted
Reyaa
Reyaa@snr_boost·
@kingofknowwhere Dumb person's idea of a smart person. That doesn't mean he's not smart; he probably is, given his credentials. But it's not his job to know the nitty-gritty of GANs or PCPs
1 reply · 1 repost · 16 likes · 1.5K views
Jaskaran Singh reposted
Ravid Shwartz Ziv
Ravid Shwartz Ziv@ziv_ravid·
New episode of The Information Bottleneck is out, this time with @liuzhuang1234 (Princeton). We talked about ConvNeXt and whether architecture still matters; dataset bias and what "good data" actually looks like; ImageBind and why vision is the natural bridge across modalities; CLIP's blind spots; memory as the real bottleneck behind the agent hype; whether LLMs have world models; and Transformers Without Normalization.

For years, the vision community debated what actually matters: architecture, inductive bias, self-attention vs convolution. After a lot of back-and-forth, we ended up in a funny place: ViT and ConvNet give roughly the same performance once you tune the details. What I find interesting is that once you reach a certain performance level, it becomes much easier to swap and tweak components without really changing the outcome.

Talking to Zhuang on this episode, I kept wondering whether the same is now true for LLMs. If you spent serious time on an alternative architecture today, would you actually get a meaningfully different model, or just land on the same Pareto curve with extra steps? I'm starting to suspect it's the latter. Architecture matters less than we think. Data, compute, and a handful of pillars do most of the work.
5 replies · 14 reposts · 58 likes · 25.5K views
wesley hsieh
wesley hsieh@chengyenhsieh·
This developer has reproduced many classic works, including ViT, AlphaFold3, DDPM, Imagen, and DALL·E. Whenever I want to cross-check the details of a paper with code, I often end up looking at his implementation. On one hand, his work is incredibly impressive from an educational perspective. On the other hand, I rarely see someone who has done so much work yet remains so silent on social media. Lucidrains: github.com/lucidrains
19 replies · 59 reposts · 990 likes · 44.4K views
Paras Chopra
Paras Chopra@paraschopra·
@justalexoki because nothing is impossible to exist (but the real question is why something so specific rather than something else so specific)
16 replies · 0 reposts · 41 likes · 4.9K views
taoki
taoki@justalexoki·
actually why is there something rather than nothing
425 replies · 31 reposts · 778 likes · 80.2K views
Paras Chopra
Paras Chopra@paraschopra·
i'm actually surprised by the replies from people who believe math/physics/cs will never be automated. even current systems are at the level of a grad student, but the anons commenting on human supremacy seem to be living in a cave.
Paras Chopra@paraschopra

What advice should one give to kids to prepare for the future? I used to think mastering basics of physics, math, cs is the way to go but now I’ve updated my belief as these fields will get automated soon. What we need kids to learn is personality traits like grit, resourcefulness, optimism, resilience, etc.

41 replies · 7 reposts · 279 likes · 22.3K views
Paras Chopra
Paras Chopra@paraschopra·
Steal this idea. Here's a semi-research project that's been on my mind for a while. Take the frontier sovereign models of each country and subject them to multiplayer war games, to come up with an Elo-based leaderboard. This should be a wake-up call for nations.
18 replies · 4 reposts · 238 likes · 14.6K views
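For the leaderboard mechanics, the standard Elo update would suffice. A minimal sketch in Python (the K-factor and function names are illustrative assumptions, not part of the original idea):

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a loss.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b

# e.g. two sovereign models start at 1500; model A wins one war game
print(elo_update(1500.0, 1500.0, score_a=1.0))  # -> (1516.0, 1484.0)
```

Repeated pairwise games fed through updates like this yield the ranking; true multiplayer war games would need an extension, such as scoring each game as a set of pairwise results.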
Paras Chopra
Paras Chopra@paraschopra·
What advice should one give to kids to prepare for the future? I used to think mastering basics of physics, math, cs is the way to go but now I’ve updated my belief as these fields will get automated soon. What we need kids to learn is personality traits like grit, resourcefulness, optimism, resilience, etc.
158 replies · 58 reposts · 896 likes · 89.8K views
Jaskaran Singh
Jaskaran Singh@jasksing·
Open data, open weights, with a detailed technical report... in this economy??
Kangwook Lee@Kangwook_Lee

My team has been cooking nonstop for a while... and I'm so excited to finally share what we've been building!!! Today, we're releasing four open models, many of which are the best models of their size 🥳!!!

tldr;
1) Raon-Speech: 9B SOTA speech LLM
2) Raon-SpeechChat: 9B full-duplex model
3) Raon-OpenTTS: 0.3B/1B open-data-open-weight SOTA TTS
4) Raon-VisionEncoder: 0.4B vision encoder trained only with public data
huggingface.co/collections/KR…

===

1) Raon-Speech (9B)
Raon-Speech is a speech LLM (LLM + speech understanding + speech generation). It's a bilingual model (English/Korean), and it's ranked #1 on both leaderboards 😎 tldr; it's the best open-model alternative to ChatGPT voice mode.
Model: huggingface.co/KRAFTON/Raon-S…
Tech report: huggingface.co/KRAFTON/Raon-S…
Web demo: raon.krafton.ai ("Speech Chat" menu here. "auto" is a bit unstable, so use "manual" and choose the language!)

2) Raon-SpeechChat (9B)
While a speech LLM is useful, it's kind of like a walkie-talkie. A full-duplex model is more like a phone, so it is even more useful in many applications. That's why we also built and are releasing Raon-SpeechChat. Again, on several quantitative evaluation metrics, Raon-SpeechChat scored the best on average.
Model: huggingface.co/KRAFTON/Raon-S…
Tech report: huggingface.co/KRAFTON/Raon-S…
Web demo: raon.krafton.ai ("Full Duplex" menu here.)

3) Raon-OpenTTS (0.3B, 1B)
We're also releasing Raon-OpenTTS, a state-of-the-art open-data, open-weight TTS model.
Model + data: huggingface.co/KRAFTON/Raon-O…
The 1B model and a detailed tech report are coming soon!

4) Raon-VisionEncoder (0.4B)
Last but not least, we're releasing Raon-VisionEncoder, a vision encoder trained from scratch using only public data. It closely matches SOTA vision encoder quality too!
Model: huggingface.co/KRAFTON/Raon-V…
Tech blog: krafton.ai/blog/posts/202…

===

That's it! I'm incredibly proud of what my team has built! My AI research team at KRAFTON (@Krafton_AI), which is undoubtedly the most cracked team in Korea, has been cooking nonstop for a while for this 😅... This is just the beginning of our planned model releases, so stay tuned!

ps1/ Ah, by the way, you may ask why "Raon"? "Raon" is an old Korean word meaning happy. And, well, we're kRAftON :-)

ps2/ KRAFTON is one of the four teams participating in Korea's national frontier-model project, together with SK Telecom. We're training something very exciting together... and more to come soon!

0 replies · 0 reposts · 2 likes · 69 views
Peter Pike
Peter Pike@AuthorPeterPike·
@DrChrisCombs The line is above the dog's head, and we don't have anything saying the pole or the dog is the same size on both sides, so strictly speaking the answer is unknowable. Assuming the gaps don't matter and the objects are the same size, the dog is 50 cm.
4 replies · 4 reposts · 102 likes · 21.6K views
Jaskaran Singh reposted
Tanmay
Tanmay@imnottanmay·
@HarveenChadha Pradhan Mantri har ghar backprop yojna
3 replies · 3 reposts · 40 likes · 1.1K views
Jaskaran Singh
Jaskaran Singh@jasksing·
was waiting for JEPA to come to audio. Clearly, working in latent space proves to be effective!
0 replies · 0 reposts · 1 like · 74 views
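The core of the latent-space idea, as a rough sketch in PyTorch (architecture, names, and shapes are illustrative assumptions; this is the generic JEPA recipe, not the specific audio paper): encode a masked context, predict the latents of the full input, and take the loss in latent space instead of reconstructing the waveform.

```python
import torch
import torch.nn as nn

class TinyAudioJEPA(nn.Module):
    """Toy JEPA-style model: regress target latents, never raw audio."""

    def __init__(self, dim: int = 256, frame: int = 400):
        super().__init__()
        # context and target encoders map raw audio frames to latents
        self.context_encoder = nn.Sequential(
            nn.Linear(frame, dim), nn.GELU(), nn.Linear(dim, dim))
        self.target_encoder = nn.Sequential(
            nn.Linear(frame, dim), nn.GELU(), nn.Linear(dim, dim))
        self.predictor = nn.Linear(dim, dim)

    def loss(self, frames: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # frames: (batch, n_frames, frame_len); mask: (batch, n_frames) bool
        with torch.no_grad():  # target encoder provides fixed regression targets
            targets = self.target_encoder(frames)
        # zero out masked frames, encode the visible context, predict latents
        context = self.context_encoder(frames * (~mask).unsqueeze(-1))
        pred = self.predictor(context)
        # regression in latent space, only on masked positions
        return ((pred - targets) ** 2)[mask].mean()
```

In full I-JEPA/V-JEPA training the target encoder is an EMA copy of the context encoder; the point here is just that no decoder back to audio samples is ever needed.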
Ishaan
Ishaan@auto_grad_·
@KathuriaAyoosh everything in DL is theoretical, there's no point in picking up DL if you're afraid to put a bit of pressure on your brain to understand stuff
2 replies · 0 reposts · 5 likes · 861 views
Ishaan
Ishaan@auto_grad_·
if you want to learn the beautiful domain of LLM-RL, this is the basic path i would suggest:
> go through david silver's playlist (with sutton's book alongside you; it has stuff david didn't cover in classes)
> go through the policy gradient blog by karpathy
> try to formulate the MDP for applying RL on LLMs without any external help (think hard on this)
> once you get it, implement ppo via pytorch and play with hyperparams (see the sketch below)
9 replies · 35 reposts · 478 likes · 19.2K views
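A minimal sketch of that last step, the PPO clipped surrogate loss in PyTorch (names and the 0.2 clip value are the usual conventions, assumed rather than taken from the thread):

```python
import torch

def ppo_clip_loss(logp_new: torch.Tensor,
                  logp_old: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped surrogate objective from the PPO paper, as a loss to minimize.

    All tensors are per-token (or per-action) and the same shape:
    log-probs under the current and the rollout policy, plus advantages.
    """
    ratio = torch.exp(logp_new - logp_old)          # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the pessimistic (min) objective; negate for SGD
    return -torch.min(unclipped, clipped).mean()
```

In the LLM setting the "actions" are sampled tokens, so logp_new/logp_old come from gathering per-token log-probs out of the model's logits, and advantages come from a value model or a group baseline (as in GRPO).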
W
W@voughtboy·
somehow cleared GATE. might do masters from IIT.
68 replies · 8 reposts · 1.1K likes · 46.1K views
Jaskaran Singh reposted
Pedro Domingos
Pedro Domingos@pmddomingos·
Geoff Hinton set out to figure out how the brain works and failed. Andrew Ng set out to build a complete robot and failed. Demis Hassabis set out to achieve AGI using deep RL and failed. Yet they all succeeded.
32 replies · 32 reposts · 566 likes · 39.4K views