yalishandi

23 posts

yalishandi
@yalishandi

chief executive ic

Joined September 2024
202 Following · 1 Follower
yalishandi (@yalishandi):
@MoonL88537 Looool at the 9.1 = 9.10 > 9.11. These model explanations remind me of the demons in Frieren (animals that evolved the ability to speak purely as a means to deceive humans, with no moral implications)
0 replies · 0 reposts · 0 likes · 28 views
Moon (@MoonL88537):
did i mention that this is totally nuts?
[image]
184 replies · 423 reposts · 6.1K likes · 674.3K views
Jamon (@jamonholmgren):
Do you really think precise location will matter for this query, Google?
[image]
156 replies · 377 reposts · 13K likes · 279.7K views
yalishandi (@yalishandi):
Subway Surfers is an adversarial example for humans, discovered with black-box optimization
0 replies · 0 reposts · 0 likes · 33 views
yalishandi (@yalishandi):
@kalomaze "Human beings fight not because they are different, but because they are the same, and in their attempts to distinguish themselves have made themselves into enemy twins, human doubles in reciprocal violence." -- Girard
0 replies · 0 reposts · 2 likes · 373 views
kalomaze (@kalomaze):
the Cold War-esque dynamic between vllm and sglang needs to be studied
4 replies · 1 repost · 116 likes · 23.5K views
yalishandi (@yalishandi):
@jadechoghari @huggingface Build our product to "get on the radar for Hugging Face internships." The job market is what it is, respect the grift 🤣🤷
0 replies · 0 reposts · 0 likes · 142 views
jade (@jadechoghari):
🚨 Announcing: @huggingface @ Waterloo 🤗
A direct path for Waterloo students to go from learning ML → shipping code with the HF team → contributing to state-of-the-art open-source models → path to internship and Fellowship, all while building a standout resume.
📝 Applications open now, only 5 to 10 spots per term! 👉 tally.so/r/wdgG5K
🔍 Learn more: hfwaterloo.notion.site
👩‍💻 Two tracks:
1️⃣ Junior ML Scholar (1st/2nd yr): build @Gradio demos, explore models, and engage with the HF community.
2️⃣ Senior ML Scholar (3rd/4th yr): work with HF engineers to contribute models, test SOTA, and improve core libraries.
[image]
12 replies · 21 reposts · 162 likes · 36.1K views
Ashwinee Panda @ICLR2026 (@PandaAshwinee):
@ZainHasan6 cool. so how is it made exactly? it's typically the case that anything at the end of the CoT has much higher confidence because the model has conditioned on so much stuff.
2 replies · 0 reposts · 1 like · 99 views
Zain (@ZainHasan6):
mildly interesting: average <think> token probabilities are always lower than answer token probs for DeepSeek-R1. Sample set: AIME 2024, 30 problems.
[image]
1 reply · 0 reposts · 5 likes · 752 views
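Zain's measurement can be sketched in a few lines: split a sampled reasoning trace at the `</think>` marker and compare the mean model-assigned probability on each side. Everything below (the tokenization, the toy probability values) is invented for illustration; the real measurement would use DeepSeek-R1's actual per-token probabilities.

```python
# Toy sketch: compare mean token probability inside <think>...</think>
# against the answer tokens that follow. Values are made up.

def mean_segment_probs(tokens, probs, close_tag="</think>"):
    """Return (mean prob of think tokens, mean prob of answer tokens)."""
    split = tokens.index(close_tag)           # position of the closing tag
    think, answer = probs[:split], probs[split + 1:]
    return sum(think) / len(think), sum(answer) / len(answer)

tokens = ["<think>", "let", "me", "check", "</think>", "the", "answer", "is", "42"]
probs  = [0.30, 0.42, 0.55, 0.38, 0.61, 0.83, 0.91, 0.88, 0.95]
think_p, ans_p = mean_segment_probs(tokens, probs)
assert think_p < ans_p  # the pattern reported for R1 on AIME
```

With these toy numbers the think tokens average 0.4125 and the answer tokens 0.8925, matching the direction of the reported gap.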
Tianle Cai (@tianle_cai):
As an RL newbie I came across a very similar idea recently and was shocked to see that this natural idea (using the loss reduction over the ground-truth answer when adding CoT before it) was (to my limited RL knowledge) only covered by the following and another very recent paper (arxiv.org/abs/2503.19618) lol. Did I miss any literature on this, or does it simply not work well?
↪ Quoting Nathan Lambert (@natolambert):
Underrated paper and idea on using RL losses on non-verifiable domains, in this case the perplexity of the next chapter of a book.
12 replies · 11 reposts · 149 likes · 35.7K views
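The idea Tianle describes (scoring a chain of thought by how much loss reduction it buys on the ground-truth answer) can be sketched as follows. The `nll` function here is a hypothetical stand-in for a real language model's negative log-likelihood; only the reward arithmetic is the point.

```python
# Sketch: reward a CoT by the NLL reduction it yields on the gold answer.
# `nll` is a toy stand-in for a real LM's negative log-likelihood.

def nll(answer, context):
    # Pretend the model is more confident when the context contains
    # the key fact the answer depends on.
    return 5.0 - (2.5 if "key fact" in context else 0.0)

def cot_reward(question, cot, answer):
    """Loss reduction on the answer from prepending the CoT (higher = better)."""
    return nll(answer, question) - nll(answer, question + " " + cot)

good = cot_reward("Q?", "reasoning that surfaces the key fact", "A")
bad = cot_reward("Q?", "irrelevant rambling", "A")
assert good > bad
```

A CoT that actually surfaces useful information earns positive reward (2.5 here) while an unhelpful one earns zero, which is the signal an RL loop would then optimize.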
yalishandi (@yalishandi):
And then they came for the S&P 500 -- and there was no one left to speak out
0 replies · 0 reposts · 0 likes · 41 views
yalishandi (@yalishandi):
High schooler first time reviewing for #icml2025. I made sure to hit "acknowledge rebuttal" the instant it hit my inbox to make sure authors know I'm engaged with the process 😊
0 replies · 0 reposts · 1 like · 1.2K views
yalishandi (@yalishandi):
@StartupArchive_ The answer here -- as machine learners will tell you -- is not *no evals*, but a train-test split. Find several metrics that correlate with downstream goodness. Hold some of them out, so that teams must try to improve the underlying "g factor" of these metrics.
0 replies · 0 reposts · 0 likes · 95 views
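The train-test-split proposal above can be made concrete. All metric names and deltas below are invented for illustration; the point is the holdout check, not the specifics.

```python
# Sketch: treat proxy metrics like a dataset. Teams optimize against a
# visible "train" subset; a held-out subset detects Goodharting.
import random

metrics = ["activation", "retention", "nps", "support_tickets", "latency", "revenue"]
rng = random.Random(0)                 # fixed seed for a reproducible split
rng.shuffle(metrics)
train_metrics, holdout_metrics = metrics[:4], metrics[4:]

def goodharted(train_delta, holdout_delta, tol=0.0):
    """Flag a change that moves the visible metrics but not the held-out ones."""
    return train_delta > 0 and holdout_delta <= tol

# A change that only games the visible metrics fails the holdout check.
assert goodharted(train_delta=0.3, holdout_delta=-0.1)
assert not goodharted(train_delta=0.3, holdout_delta=0.2)
```

A change that lifts both subsets plausibly improved the underlying "g factor"; one that lifts only the visible subset was likely optimizing the measure rather than the goal.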
Startup Archive (@StartupArchive_):
Shopify CEO Tobi Lutke explains Goodhart's law and why he doesn't like KPIs or OKRs.

"Goodhart's law is real. The moment a metric becomes a goal, it's no longer a useful metric… No metric by itself is a complete heuristic for a complex business. There's a million different tensions in a company, and you can't keep all of them in harmony by optimizing for one thing."

For this reason, Shopify doesn't use KPIs or OKRs. But as Tobi explains, this doesn't mean they don't value data and metrics.

"We are extremely data informed. We have invested enormous amounts of money and time into systems that give us basically everything at our fingertips… But what Shopify attempts to do is just not over-fit for what's quantifiable."

People love optimizing for highly quantifiable things because there's immediate gratification that comes from seeing a number go up. But Tobi thinks that the most important aspects of a product are rarely quantifiable:

"The overlap of the most valuable things you can do with a product and the things that happen to be fully quantifiable are like maybe 20%. Which leaves 80% of a value space unaddressable by the people who only look at quantifiable things."

He continues: "Shopify is comfortable with unquantifiable things like taste, quality, passion, love, hate… The sort of deep satisfaction that a craftsperson feels when they've done a job well is actually a better proxy if you allow it to be."

They then have robust analytics systems that tell the company if something's wrong or a new rollout breaks something:

"We think about it as a cockpit for a pilot. The decisions are still made by pilots, and we think this leads to better results… I think there needs to be more acceptance in business of unquantifiable things… And then metrics take a support function."

Video source: @lennysan (Feb 2025)
23 replies · 171 reposts · 1.4K likes · 171.1K views
yalishandi (@yalishandi):
@g_k_swamy Hi Gokul, great paper! I'm studying it carefully. My question: under the generator-verifier gap explanation, should we expect training with a verifier (when there is such a gap) on off-policy data to match online methods? What is the natural way to do so?
0 replies · 0 reposts · 0 likes · 181 views
Gokul Swamy (@g_k_swamy):
1.5 yrs ago, we set out to answer a seemingly simple question: what are we *actually* getting out of RL in fine-tuning? I'm thrilled to share a pearl we found on the deepest dive of my PhD: the value of RL in RLHF seems to come from *generation-verification gaps*. Get ready to 🤿!
[image]
24 replies · 227 reposts · 1.8K likes · 290.3K views
yalishandi (@yalishandi):
@nrehiew_ If you have a decent prior, you need much stronger evidence than this to believe the guy is "in the trenches"
0 replies · 0 reposts · 1 like · 608 views
wh (@nrehiew_):
The CEO is quite literally credited as one of three people who came up with DualPipe. This isn't just some overseeing/managerial role. He is quite literally in the trenches.
[image]
↪ Quoting DeepSeek (@deepseek_ai):
🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies
✅ DualPipe: a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. 🔗 github.com/deepseek-ai/Du…
✅ EPLB: an expert-parallel load balancer for V3/R1. 🔗 github.com/deepseek-ai/ep…
📊 Analyze computation-communication overlap in V3/R1. 🔗 github.com/deepseek-ai/pr…
15 replies · 34 reposts · 760 likes · 47.1K views
Josh Clymer (@joshua_clymer):
some people tell me: "i'm an ai risk skeptic, there just isn't much evidence that there's a real risk" well I'm an ai risk skeptic too then. when will superintelligence exist? will it be easy to align? no one knows. that's the problem
10 replies · 4 reposts · 107 likes · 5.3K views
yalishandi (@yalishandi):
@thinkymachines Dug up the grave of Character.ai, ransacked it, and buried it again
0 replies · 0 reposts · 0 likes · 98 views
Thinking Machines (@thinkymachines):
Today, we are excited to announce Thinking Machines Lab (thinkingmachines.ai), an artificial intelligence research and product company. We are scientists, engineers, and builders behind some of the most widely used AI products and libraries, including ChatGPT, Character.ai, PyTorch, and Mistral.

Our mission is to make artificial intelligence work for you by building a future where everyone has access to the knowledge and tools to make AI serve their unique needs. We are committed to open science through publications and code releases, while focusing on human-AI collaboration that serves diverse domains. Our approach embraces co-design of research and products to enable learning from real-world deployment and rapid iteration.

This work requires three core foundations: state-of-the-art model intelligence, high-quality infrastructure, and advanced multimodal capabilities. We are committed to building models at the frontier of capabilities to deliver on this promise.

If you're interested in joining our team, consider applying here: 6wajk07p.paperform.co
296 replies · 516 reposts · 5K likes · 2.1M views
Yong Lin (@Yong18850571):
🚀 Introducing Goedel-Prover: a 7B LLM achieving SOTA open-source performance in automated theorem proving! 🔥
✅ Improving +7% over previous open-source SOTA on miniF2F
🏆 Ranking 1st on the PutnamBench leaderboard
🤖 Solving 1.9x as many total problems as prior works on Lean Workbook
[1/n]
Website: goedel-lm.github.io
Hugging Face: huggingface.co/Goedel-LM/Goed…
GitHub: github.com/Goedel-LM/Goed…
Amazing collaborators: @sangertang1999 (co-first author) @Lyubh22 @wujiayun12 @hongzhou__lin @KaiyuYang4 @JiaLi52524397 @xiamengzhou @danqi_chen @prfsanjeevarora @chijinML
[image]
13 replies · 65 reposts · 272 likes · 87.2K views
yalishandi (@yalishandi):
above the Chinese restaurant is cute and all, but imagine the roaches
0 replies · 0 reposts · 0 likes · 92 views
yalishandi (@yalishandi):
Are markets efficient when they take 4 years to price in Switch Transformers (2021)?
0 replies · 0 reposts · 0 likes · 85 views
yalishandi (@yalishandi):
Intelligence is simply compound intuition
0 replies · 0 reposts · 0 likes · 72 views