OneNine

1.5K posts

@OneNine_19X

OneNine, we build the data AI is missing…

Sweden · Joined January 2022
64 Following · 283 Followers
OneNine retweeted
Sam Altman @sama
The coolest meeting I had this week was with Paul, who used ChatGPT and other LLMs to create an mRNA vaccine protocol to save his dog Rosie. It is an amazing story.

"The chat bots empowered me as an individual to act with the power of a research institute - planning, education, troubleshooting, compliance, and yes, real scientific design work in converting genomic data to a vaccine prescription and designing the treatment protocol around it. But they worked alongside humans at every step. The combination is what made it possible."

It immediately got me thinking "this should be a company". Also, Paul is an extraordinary guy. This should be easy to do, but it is not yet.
Paul S. Conyngham @paul_conyngham

x.com/i/article/2036…

OneNine retweeted
Doudou BA 🐙 @doudou_19X
Did you know that training LLMs or Transformers in Wolof is cheaper than English? Here's why:

How a transformer processes text: text → tokens → embeddings → attention → FFN → repeat ×N layers.

The bottleneck: attention is O(n²). Every token attends to every other token. Double the tokens = 4x the compute.

Now look at Wolof vs English:
"I didn't hear you" → 5 tokens; "Dégguma" → 1 token
"What are you doing?" → 5 tokens; "Looy deff?" → 2 tokens
"Where is your father?" → 5 tokens; "Ana sa baay?" → 3 tokens

~3x fewer tokens. Same semantics. 3² = 9x cheaper attention. Per layer. Per head.

Wolof packs into one word what takes English a full sentence, and the transformer architecture rewards this directly.

The language was never the problem. The dataset was. We're building it → onenine.dev
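A minimal sketch of that arithmetic in Python, assuming the tiktoken library with its cl100k_base encoding as a stand-in tokenizer (an assumption, not the tweet's setup). Exact counts vary by tokenizer, and an English-heavy tokenizer may not give Wolof these counts at all, which is exactly the dataset gap the tweet points at:

```python
import tiktoken

# English/Wolof pairs quoted in the tweet above.
PAIRS = [
    ("I didn't hear you", "Dégguma"),
    ("What are you doing?", "Looy deff?"),
    ("Where is your father?", "Ana sa baay?"),
]

enc = tiktoken.get_encoding("cl100k_base")  # stand-in tokenizer (assumption)

for english, wolof in PAIRS:
    n_en = len(enc.encode(english))
    n_wo = len(enc.encode(wolof))
    # Attention is O(n^2) in sequence length, so the compute ratio for
    # the attention matrices scales with the square of the token ratio.
    print(f"{english!r}: {n_en} tok | {wolof!r}: {n_wo} tok | "
          f"attention cost ratio ≈ {(n_en / max(n_wo, 1)) ** 2:.1f}x")
```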
OneNine retweeted
Doudou BA 🐙 @doudou_19X
⭐️⭐️⭐️ Did Machine Learning just die? Or is it a resurrection?

I spent the last 2 weeks building a customer-daily forecast model: revenue, cost, weight, volume. A classic segmented-forecasting problem. If you've worked on this type of problem you already know how hard it is: extreme volatility, event-driven.

I tried everything:
• complex feature engineering
• a large pipeline
• gradient boosting / LightGBM / Prophet, even deep learning 😅

Hours and hours of compute, an overheating laptop (Mac M2), slow training… the laptop ended up crashing, and accuracy was mediocre. Sometimes a model worked for one customer but failed completely for another.

I'm usually an addict: if I can't solve something it irritates me, so I spent my weekends, days, and nights computing. But yesterday I decided to give up and asked my boss: «I'll try again this week; if it doesn't work, I park it?» He said «Not an option Dou» 😅

Then I thought, ok, let me try @karpathy's «autoresearch» setup. I gave the repo to Opus 4.6 with a little bit of context and it built the .md, the .py, everything. Since I needed compute power I was running on @Microsoft #Fabric #Notebook + Azure Databricks #PySpark and called @claudeai.

It came up with:
1. The exact features I needed
2. A multi-model setup
3. Evaluation using rolling backtests
4. Pipeline updates
5. Iteration, again and again

All automated.

The interesting part: the system did not choose the models I expected. Instead it genuinely worked out which exact models would fit this problem and picked them:
• HistGradientBoosting
• RandomForest
• ElasticNet fallback
with rolling CV + lag features + seasonal signals.

The system automatically iterated until the models stopped improving, then updated the .md, which updates the .py 🤩 so smart!

The pipeline ended up doing:
• strict time-based train/test splits
• lag features (1, 7, 14, 28 days) ⭐️⭐️
• rolling medians for robustness
• calendar features (day-of-week patterns)
• model selection via rolling backtests

The result:
Model A: very high backtest accuracy (~97%). Likely slightly overfit but extremely precise; a little leakage from a ratio on an unknown predictor.
Model B: more conservative (~90% accuracy), with better generalization across customer segments.

What surprised me most is not the accuracy, even though I'm very happy 😆, but the model choice: HistGradientBoosting…? I wouldn't even have thought of it 🤔

I'll publish my setup on GitHub : ) 🫰🏿 ⭐️⭐️⭐️
Doudou BA 🐙 tweet media
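A minimal sketch of the kind of pipeline described above, not the author's actual code (which they say will go on GitHub). The column names date, customer, revenue and the fold arithmetic are illustrative assumptions; the lag windows (1, 7, 14, 28), rolling median, day-of-week feature, strict time-based splits, and HistGradientBoostingRegressor come from the post:

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Lag, rolling-median, and calendar features, built per customer."""
    df = df.sort_values(["customer", "date"]).copy()
    for lag in (1, 7, 14, 28):  # the lag windows from the post
        df[f"lag_{lag}"] = df.groupby("customer")["revenue"].shift(lag)
    # Rolling median of the previous 7 days, for robustness to spikes.
    df["roll_med_7"] = df.groupby("customer")["revenue"].transform(
        lambda s: s.shift(1).rolling(7).median()
    )
    df["dow"] = df["date"].dt.dayofweek  # day-of-week calendar feature
    return df.dropna()

def rolling_backtest(df: pd.DataFrame, n_folds: int = 4) -> float:
    """Strict time-based splits: train on the past, test on the next window."""
    feats = [c for c in df.columns if c.startswith(("lag_", "roll_", "dow"))]
    dates = pd.Series(sorted(df["date"].unique()))
    fold = len(dates) // (n_folds + 1)
    errs = []
    for k in range(1, n_folds + 1):
        lo = dates[k * fold]
        hi = dates[min((k + 1) * fold, len(dates) - 1)]
        train = df[df["date"] < lo]
        test = df[(df["date"] >= lo) & (df["date"] < hi)]
        model = HistGradientBoostingRegressor()  # the model the agent settled on
        model.fit(train[feats], train["revenue"])
        errs.append(mean_absolute_percentage_error(
            test["revenue"], model.predict(test[feats])))
    return sum(errs) / len(errs)  # mean MAPE across the rolling folds
```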
Andrej Karpathy @karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This has been the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger things e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
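In outline, the workflow described here is a greedy propose-train-keep cycle. A hypothetical sketch, not Karpathy's actual autoresearch code: propose_change and train_and_eval are stand-ins for the LLM proposal step and a real nanochat training run:

```python
from copy import deepcopy

def autoresearch(config, propose_change, train_and_eval, budget=700):
    """Greedy tuning loop: keep a change only if validation loss improves."""
    best_loss = train_and_eval(config)      # baseline experiment
    history = []                            # results the agent plans from
    for _ in range(budget):
        candidate = propose_change(deepcopy(config), history)  # LLM proposes an edit
        loss = train_and_eval(candidate)    # run the experiment
        history.append((candidate, loss))   # result feeds the next proposal
        if loss < best_loss:                # real improvements stack up
            config, best_loss = candidate, loss
    return config, best_loss
```

The key property, per the post, is that history feeds every new proposal, so the agent plans its next experiments from the sequence of earlier results rather than sampling changes blindly.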

OneNine @OneNine_19X
You don’t need Attention. All you need is OneNine : )
OneNine tweet media
OneNine @OneNine_19X
.@OneNine_19X We are building the data AI is missing.
Doudou BA 🐙 @doudou_19X

Imagine what all these dudes from @fdotinc #Artifact26 will become in 5 or 10 years. Unicorns? Many exits? 2- or 3-time founders already? Some will come back and become @fdotinc partners? We will be called alumni. Kids will be like «Oh, this guy did #Artifact». We are writing history! Happy to be here in this ecosystem enabling us to become the founders we always dreamt of being! Let's go OneNine.dev #FoundersInc #Artifact26 #SanFrancisco

OneNine @OneNine_19X
OneNine 19X
OneNine tweet media
OneNine retweeted
Doudou BA 🐙 @doudou_19X
One of the best parts of Founders Inc. is the people. I met my two brothers: thoughtful, driven, and genuinely good humans. This is why community matters. Thank you @nourzahzah and @yousefhll for your time and for sharing.
Doudou BA 🐙 tweet media
OneNine retweeted
Doudou BA 🐙 @doudou_19X
Just recorded a podcast at Founders Inc #Artifact in SF 🎙️ @fdotinc We’re building @OneNine_19X: the data infrastructure enabling AI to understand low- and mid-resource languages. AI shouldn’t work for only a fraction of the world. This is how we fix that.
OneNine retweeted
Doudou BA 🐙 @doudou_19X
Can’t believe I just met YC Partner Gustaf Alstromer @gustaf! I’m a big fan : ) @ycombinator
Doudou BA 🐙 tweet media