Sam Ching
@samcwl
516 posts
🤖
SF · Joined March 2017
962 Following · 1.6K Followers
swyx @swyx
ok life update: i'll be joining @Cognition!
• Cog just went from 0 to $10b in 2 years
• Net burn $20m in company history
• Avg successful Devin impl sees >5x growth; a $1.5m/yr customer expanded >10x in 8 months (not a typo)
• Windsurf x Cog cross-sell going great, looking fwd to cross-build. Lighthouse customers in every vertical.
• Windsurf's polish + Wave 13/14 are looking v HYPE! more work to do on Windsurf Tab

Here are the 5 decision points I had to cross to "buy" Cognition's rise as an Agent Lab with the potential to be as dominant in the Decade of Agents:
• Short Code timelines, Long AGI timelines
• The rise of Agent Labs
• Owning the Sync/Async spectrum
• Start high, go low
• Cracked Engineers + Ramped GTM

This is probably TMI, but I like to record thoughts in public so I can learn when I am wrong, and for others to follow along. Happy reading, in the reply tweet.

also: Most of @smol_ai's capital base is rolling over to Cognition's round today. I'll be returning the rest. AINews and SmolTalk will remain my passion projects. I will continue to operate @aiDotEngineer and @latentspacepod ferociously independent of Cognition. All Cognition competitors were given ample heads up + red carpet to AIE CODE. Ben and I are BEYOND excited to have Lia step up as the new General Manager of AIE!
[3 images]
Cognition @cognition

We’ve raised over $400M at a $10.2B post-money valuation to advance the frontier of AI coding agents. The round was led by Founders Fund with other existing investors including Lux, 8VC, Neo, Elad Gil, Definition Capital, and Swish VC all doubling down. We’re also joined by new investors including Bain Capital Ventures and D1 Capital. Two of our early investors, Christian Lawless of Conversion Capital and Emily Cohen of Neo, have even joined our team full-time.

355 replies · 92 reposts · 2.4K likes · 585.9K views
Noam Brown @polynoamial
I’ll be in Singapore 🇸🇬 for @iclr_conf this week! Looking forward to catching up with friends and meeting new folks in AI.
9 replies · 6 reposts · 322 likes · 27K views
Sam Ching retweeted
Daniel Ching @danielchingwq
1/n: A thread for local eats + transport in 🇸🇬 for those coming to @ICLR! For those coming to Singapore for the first time, a huge welcome :D

ICLR will be held at the Expo -- closest MRT stations: CG1 (Green East-West Line, 1 stop from Changi Airport), DT35 (Blue Downtown Line)
[1 image]
1 reply · 4 reposts · 14 likes · 2.6K views
Aiden Bai @aidenybai
after spending $5k+ running automated browsers:
- hosting browsers is annoying asf
- existing hosting providers are EXPENSIVE
- non-deterministic memory usage leads to over/under provisioning
- sometimes just randomly crashes???
- time to write a blog post
[1 image]
101 replies · 20 reposts · 1.3K likes · 233.5K views
Sam Ching @samcwl
@justLV @DrOnwude @sesame Very cool. Kudos on the launch! Also interested if you’ll have a finetuning API or guide down the line.
0 replies · 0 reposts · 2 likes · 370 views
Justin Alvey @justLV
@DrOnwude @sesame Thank you! 1-2 weeks. The demo is a fine-tuned version of the base model on the talent's voice that we can't release, but the base model is still extremely capable - you can get a preview of capabilities on the research blog post.
10 replies · 2 reposts · 67 likes · 4.9K views
Justin Alvey @justLV
Excited to share a peek of what I've been working on

We @sesame believe voice is key to unlocking a future where computers are lifelike

Here's an early preview you can try! 👇

We'll be open sourcing a model, and yes… we're building hardware! 🧵
185 replies · 251 reposts · 2.2K likes · 450.9K views
Rach @rachpradhan
6/ Why Rust? Python is great, but:
❌ Async isn't enough for low-latency optimization
❌ Managing memory efficiently is hard
❌ Handling multiple concurrent requests gets messy
✅ Rust (via PyO3) lets Bhumi run fast under the hood while keeping a Python-friendly interface.
1 reply · 0 reposts · 2 likes · 146 views
Ross Taylor @rosstaylor90
This project was a short one-month sprint following the “R1 moment” 🐋 - let me know your feedback! More is cooking depending on how this lands 😇.
2 replies · 0 reposts · 19 likes · 887 views
Ross Taylor @rosstaylor90
🎉 Excited to release General Reasoning: a new community resource for building open reasoning models. We’re looking to make personal, open reasoners a reality. Starting with a small step in that direction today! Read the thread in the quote tweet for details, or my personal analysis below!
9 replies · 37 reposts · 278 likes · 36.5K views
Sam Ching retweeted
Brendan (can/do) @BrendanFoody
When we started the company at 19, we had grand ambitions, but I never imagined how fast it would happen. I'm incredibly grateful for the team we've built and everything they've accomplished. Labor allocation is the most important problem in the world and we're only scratching the surface.
Mercor @mercor_ai

Mercor is solving talent allocation in the AI economy. The difference between greatness and failure is the right person being in the right place at the right time. Putting them there is the hardest unsolved problem in capitalism. We’re excited to announce our $100M Series B at a $2B valuation from @felicis, @generalcatalyst, @benchmark, DST and @MenloVentures.

26 replies · 17 reposts · 234 likes · 40.6K views
Sam Ching retweeted
Michael Poli @MichaelPoli6
[1/7] Introducing Evo 2, a new foundation model for biology. 🚀

Evo 2 is the largest-scale, fully open-source AI model ever released: 40 billion parameters, over 9 trillion tokens, and a 1 million context length. All the details are public: weights, data, training infrastructure, and inference infrastructure.

⚡ Evo 2 is built on a new model architecture: convolutional multi-hybrids (StripedHyena 2). StripedHyena 2 excels at modeling byte-tokenized data, providing faster training and lower perplexity compared to both Transformers and previous-generation hybrids based on state-space models.

I am grateful for the team behind Evo 2; working with you was one of the proudest moments of my career (the core pretraining team was fewer than five people; you can just do things).

📚 Today, we release two papers (yes, plural), as well as weights, data, training, and inference codebases. Enjoy!
[3 images]
25 replies · 86 reposts · 477 likes · 74.5K views
Sam Ching @samcwl
@n0riskn0r3ward Great thread! Tks for sharing. Would love to hear more about 6c - what are the org incentives that shape the odds this way?
0 replies · 0 reposts · 1 like · 471 views
search founder @n0riskn0r3ward
6/ Conclusions: After all this I simultaneously believe:
a) in the long run, it's still likely that many successful gen ai startups will be built on top of models they iteratively fine-tune. It's not easy but you 100% can actually build a moat this way
b) no one should start with fine-tuning
c) the odds are stacked against most larger incumbents ever succeeding at getting their data/eval house in such tight shape that the best use of their marginal hour is fine-tuning their own models (7/7)
12 replies · 5 reposts · 179 likes · 17.7K views
search founder @n0riskn0r3ward
I spent the last ~6 months fine-tuning models at Arcee AI for a wide variety of clients, ranging from Fortune 500 enterprises to 2-person gen ai native startups.

Short version of the 🌶️ takes in this thread: No one fine-tuning models for clients is a "machine learning engineer" most of the week. If they're fine-tuning good models for third parties, they're doing it by being a skilled and tireless data janitor and eval architect.

Longer thread with some 🌶️ takes on the practical challenges and sometimes painful realities of fine-tuning for clients (1/7):
43 replies · 142 reposts · 1.7K likes · 294.2K views
Sam Ching @samcwl
@eddy_data3 Gotcha, totally makes sense! This looks promising though - excited to see more work in this area (bootstrapping environments from web-scale data). Kudos on the work!
0 replies · 0 reposts · 1 like · 21 views
eddy @eddy_data3
@samcwl Hi Sam! Thanks for your comments. Yes, we definitely could have added an ablation for filtered WebInstruct data in Table 2. At that time we were more focused on assessing the impact of noise in the SFT data, so we preserved the noise rather than filtering for correctness.
1 reply · 0 reposts · 0 likes · 14 views
Sam Ching @samcwl
One qn:
> 5.1: In contrast, for data from WebInstruct without fully reliable supervision signals but with a much larger scale, we sample one response per prompt from the teacher model without filtration.
-> why not use the filtered subset and do rejection sampling for SFT?
1 reply · 0 reposts · 1 like · 128 views
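The rejection sampling Sam asks about works like this: draw several candidate responses per prompt from the teacher, keep only those a supervision signal accepts, and fine-tune on the survivors. A minimal sketch, where `teacher_sample` and `verify` are hypothetical stand-ins for a real teacher-model call and answer checker (the noisy-supervision case from the paper is exactly when `verify` itself is unreliable):

```python
def teacher_sample(prompt: str, k: int) -> list[str]:
    # Stand-in for k sampled completions from the teacher model.
    return [f"{prompt} -> draft {i}" for i in range(k)]

def verify(prompt: str, response: str) -> bool:
    # Stand-in supervision signal (e.g. exact-match answer checking).
    # Here we pretend only the first draft is correct.
    return response.endswith("draft 0")

def rejection_sample_sft(prompts: list[str], k: int = 4) -> list[tuple[str, str]]:
    """Keep one verified (prompt, response) pair per prompt for SFT."""
    kept = []
    for p in prompts:
        for r in teacher_sample(p, k):
            if verify(p, r):
                kept.append((p, r))
                break  # accept at most one response per prompt
    return kept

pairs = rejection_sample_sft(["2+2=?", "3*3=?"])
```

The trade-off in the quoted passage is that on WebInstruct-scale data there is no reliable `verify`, so filtering would either shrink the set drastically or pass through noise anyway.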
ivan @IvanVendrov
anyone have a hosted instance of deepseek-v3-base that I can query?
10 replies · 2 reposts · 18 likes · 2.7K views
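On querying a hosted base model like deepseek-v3-base: most hosting providers expose an OpenAI-compatible API, and a base model (no chat template) is hit via the raw-text `/v1/completions` route rather than `/v1/chat/completions`. A minimal sketch that builds, but does not send, such a request; the base URL, key, and model name are placeholders, not any specific provider's documented values:

```python
import json
import urllib.request

def build_completion_request(base_url: str, api_key: str, model: str,
                             prompt: str, max_tokens: int = 64):
    """Build (without sending) an OpenAI-style raw-completions request."""
    body = {
        "model": model,
        "prompt": prompt,        # raw text: base models take no chat messages
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_completion_request("https://example-host.invalid", "sk-...",
                               "deepseek-v3-base", "The capital of France is")
# urllib.request.urlopen(req) would send it; omitted here.
```

Sending the request and reading `choices[0]["text"]` from the JSON response is the usual next step, assuming the host follows the OpenAI completions schema.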