Victor Quach

202 posts

@Varal7

Researcher | Machine Learning PhD from @MIT_CSAIL | @Polytechnique alum (X2014)

Joined December 2009
357 Following, 379 Followers
Victor Quach retweeted
Gappy (Giuseppe Paleologo) @__paleologo
A minor lesson I learned. Hudson River Trading has large cafeterias in all of its centers. And an abnormal amount of communal spaces (alcoves, booths, meeting rooms). You sit for lunch and talk to strangers or old friends. I estimate that premier real estate space (and the chef-served lunches) to cost $20-30m/yr. It is a 100x return/yr. Eating together is how you get people to lower their defenses, talk, trust others, collaborate, create alliances since the Pleistocene. And the lesson is that you can win at technology by going back to the very simplest, ancestral things.
Dr. Dominic Ng @DrDominicNg

Sharing 8+ meals a week with people has the same happiness boost as DOUBLING your income. And yet we eat alone 53% more than we did 20 years ago.

27
99
2.2K
325.3K
Victor Quach retweeted
Hudson River Trading @WeAreHRT
HRT AI Labs (HAIL) is building some of the fastest and most predictive deep learning models in trading. There’s only one way to see these systems in action: Join, and help us build them. hudsonrivertrading.com/machine-learni…
0
6
53
7.1K
Victor Quach retweeted
Iain Dunning @iaindunning
Are you a researcher at OAI/Anthropic/etc and tired of overhiring, the orgchart chaos, the lowered talent bar, want to move to NYC, or just want to do something different? Email me, DM me, mail a postcard. We've got a new datacenter full of B200s, tight team, and very successful.
27
14
754
551.2K
Victor Quach retweeted
Hudson River Trading @WeAreHRT
There are few places in the world where you can train deep learning models at scale. Last week at @Stanford, HRT AI Labs (HAIL) senior researcher Marc discussed training foundation models robust to market regime shifts on massive datasets and under low-latency demands.
Hudson River Trading tweet media
3
11
136
25.5K
Victor Quach @Varal7
@DrJimFan The idea of repeatedly sampling at inference time is not novel arxiv.org/pdf/2306.10193. What’s hard is: (for academics) quantifying the improvement and deriving the scaling laws, and (for the industry) productionizing it correctly
0
0
2
97
Jim Fan @DrJimFan
OpenAI Strawberry (o1) is out! We are finally seeing the paradigm of inference-time scaling popularized and deployed in production. As Sutton said in the Bitter Lesson, there're only 2 techniques that scale indefinitely with compute: learning & search. It's time to shift focus to the latter.
1. You don't need a huge model to perform reasoning. Lots of parameters are dedicated to memorizing facts, in order to perform well in benchmarks like trivia QA. It is possible to factor out reasoning from knowledge, i.e. a small "reasoning core" that knows how to call tools like browser and code verifier. Pre-training compute may be decreased.
2. A huge amount of compute is shifted to serving inference instead of pre/post-training. LLMs are text-based simulators. By rolling out many possible strategies and scenarios in the simulator, the model will eventually converge to good solutions. The process is a well-studied problem like AlphaGo's monte carlo tree search (MCTS).
3. OpenAI must have figured out the inference scaling law a long time ago, which academia is just recently discovering. Two papers came out on Arxiv a week apart last month:
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling. Brown et al. finds that DeepSeek-Coder increases from 15.9% with one sample to 56% with 250 samples on SWE-Bench, beating Sonnet-3.5.
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. Snell et al. finds that PaLM 2-S beats a 14x larger model on MATH with test-time search.
4. Productionizing o1 is much harder than nailing the academic benchmarks. For reasoning problems in the wild, how to decide when to stop searching? What's the reward function? Success criterion? When to call tools like code interpreter in the loop? How to factor in the compute cost of those CPU processes? Their research post didn't share much.
5. Strawberry easily becomes a data flywheel. If the answer is correct, the entire search trace becomes a mini dataset of training examples, which contain both positive and negative rewards. This in turn improves the reasoning core for future versions of GPT, similar to how AlphaGo’s value network — used to evaluate quality of each board position — improves as MCTS generates more and more refined training data.
Jim Fan tweet media
135
1.1K
6.1K
799.6K
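The repeated-sampling idea discussed in the thread above can be illustrated with a toy calculation. This is only a sketch under a strong assumption that is not in the tweets: every sample solves a given task independently with the same probability `p`. The function name `coverage_at_k` is hypothetical and does not come from either paper.

```python
def coverage_at_k(p: float, k: int) -> float:
    """Probability that at least one of k samples is correct.

    Assumes (for illustration only) that each sample succeeds
    independently with the same probability p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    # All k samples fail with probability (1 - p) ** k.
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    for k in (1, 10, 100, 250):
        print(k, coverage_at_k(0.159, k))
```

Under this toy model, a 15.9% single-sample rate would saturate near 100% well before 250 samples, far above the 56% quoted in the tweet. That gap suggests a single shared `p` is not a faithful model: success rates presumably vary widely across problems, which is part of why quantifying the improvement is hard.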
Rona Wang @ronawang
my friend just got a quant trading offer & showed me his $500k/year entry-level offer … i chose the wrong industry 🥲
158
169
4.4K
1.5M
Victor Quach retweeted
Adam Fisch @adamjfisch
Excited to share a new pre-print on "Conformal Language Modeling". LMs sample generations from an unbounded output space. We extend conformal prediction to handle this sort of prediction process, and give it rigorous performance guarantees. Paper: arxiv.org/abs/2306.10193
3
49
195
44.5K
Stephen Mayhew @mayhewsw
when everyone in NLP is having an existential crisis because of GPT-4, but you lived through ELMo/BERT in 2019:
Stephen Mayhew tweet media
15
91
957
148.9K
Victor Quach @Varal7
@JamesADiao To be fair, black’s moves are plausible answers to white’s moves if you only consider the last few moves. In fact, ChatGPT originally plays 27. … Rg2# (pastebin.com/X6kBRTa9)
1
0
0
69
James Diao @JamesADiao
ChatGPT playing 4D chess!
1
0
3
828
Victor Quach retweeted
YujiaBao @yujia_bao
Perplexed to announce that Learning to Split with 6-6-8 reviews has been rejected #NeurIPS2022 😂 Nonetheless, I am excited to present our manuscript + code: learn to split any dataset with one line of code to break your model’s generalization. (1/3) github.com/YujiaBao/ls
2
19
120
0
Victor Quach retweeted
Adam Fisch @adamjfisch
Very excited to share our work on Conformal Risk Control. We give a calibration algorithm that is not only remarkably simple, but also comes with elegant and powerful theoretical guarantees on its ability to control any monotone risk—not just coverage (like CP). Check it out!
Anastasios Nikolas Angelopoulos @ml_angelopoulos

I’m thrilled to announce Conformal Risk Control: a way to bound quantities other than coverage with conformal prediction. arxiv.org/abs/2208.02814 Check out the worked examples in CV and NLP! The best part is: it’s exactly the same algorithm as split conformal prediction🤯🧵1/5

0
7
24
0
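Since the quoted tweet says Conformal Risk Control uses the same algorithm as split conformal prediction, a minimal sketch of that calibration step may help. It is not the paper's code: it assumes a regression setting with absolute-residual scores, and the names `conformal_quantile` and `prediction_interval` are hypothetical.

```python
import math
from typing import List, Tuple


def conformal_quantile(cal_scores: List[float], alpha: float) -> float:
    """Split conformal threshold from held-out calibration scores.

    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or
    infinity when the calibration set is too small for that rank.
    """
    n = len(cal_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(cal_scores)[rank - 1]


def prediction_interval(point_pred: float, q: float) -> Tuple[float, float]:
    """Interval around a point prediction, assuming scores are |y - y_hat|."""
    return (point_pred - q, point_pred + q)


if __name__ == "__main__":
    # Hypothetical absolute residuals on a calibration set.
    scores = [3.0, 1.0, 2.0, 5.0, 4.0, 9.0, 7.0, 8.0, 6.0]
    q_hat = conformal_quantile(scores, alpha=0.25)
    print(q_hat, prediction_interval(10.0, q_hat))
```

With exchangeable calibration and test points, intervals built this way cover the true value with probability at least 1 - alpha. The tweets describe Conformal Risk Control as extending this kind of guarantee from coverage to any monotone risk.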
Miles Cranmer @MilesCranmer
Today I learned you can write numbers like this in Python (!!) Makes it easier to read long numbers by separating digits into groups, just like 1,000,000. It’s so esoteric that Google Colab doesn’t even color it correctly!
Miles Cranmer tweet media
30
113
1.2K
0
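The image from the tweet above is not preserved here. Assuming it shows underscore digit grouping in numeric literals (available since Python 3.6), a short sketch of that syntax follows. The variable names are illustrative only.

```python
# Underscores between digits are ignored by the parser (Python 3.6+).
population = 1_000_000
mask = 0xFF_FF          # also allowed in hex, octal, and binary literals
ratio = 1_000.000_1     # and in float literals

assert population == 1000000
assert mask == 65535
assert ratio == 1000.0001

# The same grouping is available when formatting and parsing.
grouped_commas = f"{population:,}"       # '1,000,000'
grouped_underscores = f"{population:_}"  # '1_000_000'
parsed = int("1_000")                    # 1000

print(grouped_commas, grouped_underscores, parsed)
```

The underscores are purely visual: `1_000_000` and `1000000` produce the same integer, so the grouping can follow whatever convention reads best.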
Victor Quach retweeted
MIT Jameel Clinic for AI & Health
Congratulations to Regina Barzilay, Jameel Clinic AI Faculty Lead, for being elected by @aimbe to its College of Fellows. Dr. Barzilay was nominated for her breakthrough contributions in machine learning for early cancer diagnosis and drug discovery. aimbe.org/college-of-fel…
MIT Jameel Clinic for AI & Health tweet media
0
12
16
0
Victor Quach retweeted
MIT Jameel Clinic for AI & Health
Adam Yala (@YalaTweets) and his team at Jameel Clinic have published their new paper "Robust Mammography-based Models for Breast Cancer Risk" in @ScienceTM. "Mirai," the mammography-based model, achieves consistent accuracy across diverse populations: youtu.be/pCGnRDf0Fmo
0
15
31
0
Victor Quach retweeted
Papers with Code @paperswithcode
🎉 Papers with Code partners with arXiv! Code links are now shown on arXiv articles, and authors can submit code through arXiv. Read more: medium.com/paperswithcode…
Papers with Code tweet media
42
1.7K
6.1K
0
Victor Quach retweeted
MIT Jameel Clinic for AI & Health
We are excited to announce the AI Cures Conference: Data-driven Clinical Solutions for COVID-19, which will take place online on September 29th. Learn more and register here, spots are limited: aicures.mit.edu/conference
MIT Jameel Clinic for AI & Health tweet media
0
32
44
0
Victor Quach retweeted
MIT Jameel Clinic for AI & Health
On Friday, August 21 at 8pm ET, Greater Boston’s beloved Coolidge Corner Theater (@thecoolidge) is hosting a virtual premiere of From Controversy to Cure, a documentary film chronicling the biotech boom in Cambridge. Learn more and watch it here: bit.ly/31fqWCU
MIT Jameel Clinic for AI & Health tweet media
1
14
19
0