will ye
@will__ye
1.6K posts

member of whimsical staff, ex-ramp applied ai
nyc → sf · Joined May 2016
473 Following · 5K Followers
Framework @FrameworkPuter
@vmfunc Hi, we don't have a dedicated phone line for support but please create a ticket here so we can investigate: frame.work/support
7 replies · 0 reposts · 55 likes · 7.4K views
Thomas Maxwell @tomaxwell
What do you mean SF doesn’t have a party scene? Can’t you tell this leetcode party was crazy 🤪
[image]
26 replies · 2 reposts · 70 likes · 64.2K views
will ye @will__ye
@threebarebears My sister said you named the protagonist after them, is that true or are they capping
0 replies · 0 reposts · 3 likes · 871 views
Daniel Chong @threebarebears
EVERYONE! If you're in SF this Sunday, come hear an awesome behind-the-scenes talk about the making of #hoppers, featuring some talented key folks from the movie!! (i'll be in the back cheering them on, come say hi) 💛

The Walt Disney Family Museum @WDFMuseum
“Hop” over to our theater on Sunday, March 22 to learn how story and animation inform each other to create the world of Disney and Pixar’s latest feature film, "Hoppers" (2026), with Pixar’s Margaret Spencer, John Cody Kim, James W. Brown, and Cody Lyon. bit.ly/4l6kXaC

19 replies · 182 reposts · 2.8K likes · 176.1K views
Julia Turc @juliarturc
@_avichawla I’m glad you found my video helpful, since 80% of this post is repurposing it. Next time you borrow my content, just tag me; we can cross-promote. youtu.be/jrJKRYAdh7I?si…
[YouTube video]
3 replies · 5 reposts · 151 likes · 8.6K views
Avi Chawla @_avichawla
You're in a Research Scientist interview at DeepMind. The interviewer asks: "Our investors want us to contribute to open-source. Gemini crushed benchmarks. But we'll lose our competitive edge by open-sourcing it. What do we do?"

You: "Release a research paper."

Here's what you missed: LLMs today don't just learn from raw text; they also learn from each other. For example:
- Llama 4 Scout & Maverick were trained using Llama 4 Behemoth.
- Gemma 2 and 3 were trained using Gemini.

Distillation helps us do so, and the visual explains 3 popular techniques.

1️⃣ Soft-label distillation
Generate token-level softmax probabilities over the entire corpus using:
- A frozen pre-trained Teacher LLM.
- An untrained Student LLM.
Train the Student LLM to match the Teacher's probabilities.

In soft-label distillation, access to the Teacher's probabilities gives maximum knowledge transfer. But you must have access to the Teacher’s weights to get the probability distribution. Even if you have access, there's another problem: say your vocab has 100k tokens and your data has 5 trillion tokens. Storing softmax probabilities over the entire vocab for each input token needs 500M GBs of memory under fp8 precision. The second technique solves this.

2️⃣ Hard-label distillation
- Use the Teacher LLM to get the output token.
- Get the softmax probs. from the Student LLM.
- Train the Student to match the Teacher's output.
DeepSeek-R1 was distilled into Qwen & Llama using this technique.

3️⃣ Co-distillation
- Start with an untrained Teacher and Student LLM.
- Generate softmax probs over the current batch from both models.
- Train the Teacher LLM on the hard labels.
- Train the Student LLM to match the softmax probs of the Teacher.
Meta used co-distillation to train Llama 4 Scout and Maverick from Llama 4 Behemoth.

Of course, during the initial stages, the soft labels of the Teacher LLM won't be accurate. That is why the Student LLM is trained using both soft labels + ground-truth hard labels.

____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.

[image]
10 replies · 24 reposts · 174 likes · 15.8K views
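The soft-label technique described above boils down to minimizing a KL divergence between the frozen Teacher's softmax distribution and the Student's. A minimal numpy sketch (function names and the temperature value are my own illustration, not from the post):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def soft_label_distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the vocab, averaged across tokens.

    The teacher is frozen: its probabilities serve only as soft targets.
    """
    p = softmax(teacher_logits, temperature)   # teacher's soft labels
    q = softmax(student_logits, temperature)   # student's predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(kl.mean())

# The storage claim in the post checks out:
# 5e12 tokens x 1e5 vocab x 1 byte (fp8) = 5e17 bytes = 5e8 GB = 500M GB.
```

In a real training loop this loss would be backpropagated through the Student only; the hard-label variant replaces the full distribution `p` with a one-hot vector on the Teacher's sampled token, which is what makes it so much cheaper to store.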
will ye @will__ye
@Cmillet77 @im_roy_lee @amasad Not true. People of all ages and professions would rather not have to put in headphones to watch something on twitter in public. And deaf people exist?
1 reply · 0 reposts · 20 likes · 593 views
Amjad Masad @amasad
We’ve raised $400M at a $9B valuation. Investors include Georgian, G Squared, Prysm, 1789, YC, Coatue, a16z, Craft, and QIA, with strategic investments from Accenture, Databricks, Okta, and Tether. We’re also lucky to have incredible individuals backing us, including Shaq and Jared Leto.

This funding will help us scale our ambition and expand beyond coding into AI systems that center human creativity. Replit is now used at 85% of the Fortune 500. We have an opportunity to help shape the future of work. One where AI abstracts away the boring parts and humans shine as creative directors.

We’re also investing more globally, particularly in Europe, Asia, and the Middle East. Innovation can come from anywhere in the world, and we want to help unlock it.

520 replies · 724 reposts · 8.3K likes · 2.4M views
will ye @will__ye
@dunik_7 This is pretty classical ML, nothing here was invented in the last decade. Worth doing for fun though, especially if you’re a beginner
0 replies · 0 reposts · 42 likes · 4K views
dunik @dunik_7
a student took the Elo rating system from chess, ran it through 95,491 tennis matches over 43 years, and trained an XGBoost model that predicts winners with 85% accuracy.

he tested it on the Australian Open 2025, completely outside the training data: 99 out of 116 matches correct. called every single Sinner win through the entire tournament. the champion, before the first ball was hit.

no team, no funding, a laptop and free CSVs from the internet. this is the best breakdown of a real sports prediction model I've seen. study it or feed it to your AI agent.

Phosphen @phosphenq
x.com/i/article/2031…

121 replies · 310 reposts · 6.8K likes · 1.1M views
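The Elo machinery the pipeline starts from fits in a few lines. A sketch of the standard expected-score and update formulas (the K-factor of 32 and the function names are my assumptions; the post gives no parameters):

```python
def elo_expected(r_a, r_b):
    """Elo model: probability that player A beats player B."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, a_won, k=32.0):
    """Return both players' updated ratings after one match."""
    e_a = elo_expected(r_a, r_b)          # pre-match win probability for A
    s_a = 1.0 if a_won else 0.0           # actual score for A
    r_a_new = r_a + k * (s_a - e_a)       # winner gains what the loser sheds
    r_b_new = r_b + k * ((1.0 - s_a) - (1.0 - e_a))
    return r_a_new, r_b_new
```

Running this over the historical matches yields per-player ratings whose gap (or `elo_expected` value) becomes a feature a gradient-boosted model like XGBoost can learn from, alongside whatever surface or head-to-head features the student added.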
Alberto @mrvo5
@YIMBYLAND YAS! Displacement of Brown and Black people. YAS! Gentrification masked as urbanization. YAS! Even more congestion. YAS! More UNAFFORDABLE housing.
6 replies · 0 reposts · 9 likes · 3.3K views
will ye @will__ye
@courtne This article is 10x longer than it needs to be, it’s the same stuff rehashed. And as other ppl pointed out, calvin is not the best example lol. A much better one is @AndrewSteinborn, who dropped out of university of west georgia and programmed Minecraft servers
0 replies · 0 reposts · 19 likes · 1K views
will ye @will__ye
@jxmnop “The method turned out to be too expensive to run in an academic lab” as soon as I read that I knew there would be LLM labeling somewhere lol
0 replies · 0 reposts · 23 likes · 1.7K views
will ye @will__ye
@NWischoff Would feel cozier at a different venue! Looks like a meeting room, especially with the bright lights and office chairs.
0 replies · 0 reposts · 3 likes · 641 views
Nichole Wischoff @NWischoff
Hosted everyone in SF that drinks wine last night. 20 people attended. Was honestly a blast. Will be doing these regularly with different wine makers. Come join our crew!
[image]
78 replies · 2 reposts · 394 likes · 213K views
will ye @will__ye
@NYSocialBee @calder_mchugh why is it corny? seems reasonable that one should know what a neighborhood is called to judge it. “mind your business”? he’s from new york, this IS his business
0 replies · 0 reposts · 73 likes · 1.4K views
Social Bee @NYSocialBee
@calder_mchugh I think this is very corny and I think if it doesn’t apply to you then you should probably just mind your business :) sorry 😇
8 replies · 0 reposts · 76 likes · 25.7K views
Pall Melsted @pmelsted
Excited to share this preprint describing my latest work on using GPUs to accelerate processing of RNA-seq data. The title says it all: "RNA-seq analysis in seconds using GPUs", now on biorxiv: biorxiv.org/content/10.648… Figure 1 shows the key result
[image]
15 replies · 124 reposts · 481 likes · 87.2K views
𝕐 @nomad421
Dear @AnthropicAI, my lab builds a lot of OSS for genomics (github.com/COMBINE-lab). While we lack the widespread OSS market of popular NPM packages, pieces of our software are critical in biomedical research. Please consider extending your Claude Max offer to such labs!
4 replies · 13 reposts · 130 likes · 31.2K views
lele @CherrilynnZ
hardware should feel more playful
[4 images]
14 replies · 26 reposts · 352 likes · 12.1K views
will ye @will__ye
@theo I have a friend who does this and $500/problem is very very low
0 replies · 0 reposts · 5 likes · 279 views
Theo - t3.gg @theo
I would like to purchase a handful of code problems that modern LLMs can’t solve.

Requirements:
- programmatically verifiable (can be tested without human interaction)
- “before” state (repo before the commit that implements the solution)
- example code that actually solves the problem

I am willing to pay up to $500 per problem that I can easily test locally and confirm current models (gpt-5.3-codex, opus 4.6) are unable to solve. If you can’t tell, I’m running out of “too hard for LLM” code tasks 🙃🙃🙃

283 replies · 29 reposts · 1.9K likes · 711.4K views
tern @tern_et
i made a new bag !!
[2 images]
123 replies · 3.3K reposts · 39K likes · 2.3M views
will ye @will__ye
me and bro if we were ham
[image]
3 replies · 1 repost · 36 likes · 1.8K views
will ye @will__ye
@pumfleet @calcom “I can’t keep milking recurring revenue from ppl who don’t find our subscription worth it”
0 replies · 0 reposts · 16 likes · 601 views