OneNine

1.5K posts

@OneNine_19X

OneNine, we build the data AI is missing…

Sweden · Joined January 2022
64 Following · 283 Followers
OneNine retweeted
Sam Altman @sama
The coolest meeting I had this week was with Paul, who used ChatGPT and other LLMs to create an mRNA vaccine protocol to save his dog Rosie. It is an amazing story.

"The chat bots empowered me as an individual to act with the power of a research institute - planning, education, troubleshooting, compliance, and yes, real scientific design work in converting genomic data to a vaccine prescription and designing the treatment protocol around it. But they worked alongside humans at every step. The combination is what made it possible."

It immediately got me thinking "this should be a company". Also, Paul is an extraordinary guy. This should be easy to do, but it is not yet.
Paul S. Conyngham @paul_conyngham

x.com/i/article/2036…

OneNine retweeted
Doudou BA 🐙 @doudou_19X
Did you know that training LLMs or Transformers in Wolof is cheaper than English? Here's why:

How a transformer processes text: text → tokens → embeddings → attention → FFN → repeat ×N layers.

The bottleneck: attention is O(n²). Every token attends to every other token. Double the tokens = 4x the compute.

Now look at Wolof vs English:
"I didn't hear you" → 5 tokens; "Dégguma" → 1 token
"What are you doing?" → 5 tokens; "Looy deff?" → 2 tokens
"Where is your father?" → 5 tokens; "Ana sa baay?" → 3 tokens

~3x fewer tokens. Same semantics. 3² = 9x cheaper attention. Per layer. Per head.

Wolof packs into one word what takes English a full sentence, and the transformer architecture rewards this directly.

The language was never the problem. The dataset was. We're building it → onenine.dev
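A minimal sketch of that arithmetic in Python, assuming the tiktoken library with its cl100k_base encoding as a stand-in tokenizer (an assumption, not the tweet's setup). Exact counts vary by tokenizer, and an English-heavy tokenizer may not give Wolof these counts at all, which is exactly the dataset gap the tweet points at:

```python
import tiktoken

# English/Wolof pairs quoted in the tweet above.
PAIRS = [
    ("I didn't hear you", "Dégguma"),
    ("What are you doing?", "Looy deff?"),
    ("Where is your father?", "Ana sa baay?"),
]

enc = tiktoken.get_encoding("cl100k_base")  # stand-in tokenizer (assumption)

for english, wolof in PAIRS:
    n_en = len(enc.encode(english))
    n_wo = len(enc.encode(wolof))
    # Attention is O(n^2) in sequence length, so the compute ratio for
    # the attention matrices scales with the square of the token ratio.
    print(f"{english!r}: {n_en} tok | {wolof!r}: {n_wo} tok | "
          f"attention cost ratio ≈ {(n_en / max(n_wo, 1)) ** 2:.1f}x")
```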
OneNine retweeted
Doudou BA 🐙 @doudou_19X
⭐️⭐️⭐️ Did Machine Learning just die? Or is it a resurrection?

I spent the last 2 weeks building a customer-daily forecast model: revenue, cost, weight, volume. A classic segmented-forecasting problem. If you've worked on this type of problem you already know how hard it is: extreme volatility, event-driven.

I tried everything:
• complex feature engineering
• a large pipeline
• gradient boosting / LightGBM / Prophet, even deep learning 😅

Hours and hours of compute, an overheating laptop (Mac M2), slow training… the laptop ended up crashing, and accuracy was mediocre. Sometimes a model worked for one customer but failed completely for another.

I'm usually an addict: if I can't solve something it irritates me, so I spent my weekends, days, and nights computing. But yesterday I decided to give up and asked my boss: «I'll try again this week; if it doesn't work, I park it?» He said «Not an option Dou» 😅

Then I thought, ok, let me try @karpathy's «autoresearch» setup. I gave the repo to Opus 4.6 with a little bit of context and it built the .md, the .py, everything. Since I needed compute power I was running on @Microsoft #Fabric #Notebook + Azure Databricks #PySpark and called @claudeai.

It came up with:
1. The exact features I needed
2. A multi-model setup
3. Evaluation using rolling backtests
4. Pipeline updates
5. Iteration, again and again

All automated.

The interesting part: the system did not choose the models I expected. Instead it genuinely worked out which exact models would fit this problem and picked them:
• HistGradientBoosting
• RandomForest
• ElasticNet fallback
with rolling CV + lag features + seasonal signals.

The system automatically iterated until the models stopped improving, then updated the .md, which updates the .py 🤩 so smart!

The pipeline ended up doing:
• strict time-based train/test splits
• lag features (1, 7, 14, 28 days) ⭐️⭐️
• rolling medians for robustness
• calendar features (day-of-week patterns)
• model selection via rolling backtests

The result:
Model A: very high backtest accuracy (~97%). Likely slightly overfit but extremely precise; a little leakage from a ratio on an unknown predictor.
Model B: more conservative (~90% accuracy), with better generalization across customer segments.

What surprised me most is not the accuracy, even though I'm very happy 😆, but the model choice: HistGradientBoosting…? I wouldn't even have thought of it 🤔

I'll publish my setup on GitHub : ) 🫰🏿 ⭐️⭐️⭐️
Doudou BA 🐙 tweet media
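A minimal sketch of the kind of pipeline described above, not the author's actual code (which they say will go on GitHub). The column names date, customer, revenue and the fold arithmetic are illustrative assumptions; the lag windows (1, 7, 14, 28), rolling median, day-of-week feature, strict time-based splits, and HistGradientBoostingRegressor come from the post:

```python
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.metrics import mean_absolute_percentage_error

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Lag, rolling-median, and calendar features, built per customer."""
    df = df.sort_values(["customer", "date"]).copy()
    for lag in (1, 7, 14, 28):  # the lag windows from the post
        df[f"lag_{lag}"] = df.groupby("customer")["revenue"].shift(lag)
    # Rolling median of the previous 7 days, for robustness to spikes.
    df["roll_med_7"] = df.groupby("customer")["revenue"].transform(
        lambda s: s.shift(1).rolling(7).median()
    )
    df["dow"] = df["date"].dt.dayofweek  # day-of-week calendar feature
    return df.dropna()

def rolling_backtest(df: pd.DataFrame, n_folds: int = 4) -> float:
    """Strict time-based splits: train on the past, test on the next window."""
    feats = [c for c in df.columns if c.startswith(("lag_", "roll_", "dow"))]
    dates = pd.Series(sorted(df["date"].unique()))
    fold = len(dates) // (n_folds + 1)
    errs = []
    for k in range(1, n_folds + 1):
        lo = dates[k * fold]
        hi = dates[min((k + 1) * fold, len(dates) - 1)]
        train = df[df["date"] < lo]
        test = df[(df["date"] >= lo) & (df["date"] < hi)]
        model = HistGradientBoostingRegressor()  # the model the agent settled on
        model.fit(train[feats], train["revenue"])
        errs.append(mean_absolute_percentage_error(
            test["revenue"], model.predict(test[feats])))
    return sum(errs) / len(errs)  # mean MAPE across the rolling folds
```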
Andrej Karpathy @karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This has been the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger things e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
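In outline, the workflow described here is a greedy propose-train-keep cycle. A hypothetical sketch, not Karpathy's actual autoresearch code: propose_change and train_and_eval are stand-ins for the LLM proposal step and a real nanochat training run:

```python
from copy import deepcopy

def autoresearch(config, propose_change, train_and_eval, budget=700):
    """Greedy tuning loop: keep a change only if validation loss improves."""
    best_loss = train_and_eval(config)      # baseline experiment
    history = []                            # results the agent plans from
    for _ in range(budget):
        candidate = propose_change(deepcopy(config), history)  # LLM proposes an edit
        loss = train_and_eval(candidate)    # run the experiment
        history.append((candidate, loss))   # result feeds the next proposal
        if loss < best_loss:                # real improvements stack up
            config, best_loss = candidate, loss
    return config, best_loss
```

The key property, per the post, is that history feeds every new proposal, so the agent plans its next experiments from the sequence of earlier results rather than sampling changes blindly.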

OneNine @OneNine_19X
You don’t need Attention. All you need is OneNine : )
OneNine tweet media
OneNine @OneNine_19X
.@OneNine_19X We are building the data AI is missing.
Doudou BA 🐙 @doudou_19X

Imagine what all these dudes from @fdotinc #Artifact26 will become in 5 or 10 years. Unicorns? Many exits? 2- or 3-time founders already? Some will come back and become @fdotinc partners? We will be called alumni. Kids will be like «Oh, this guy did #Artifact». We are writing history! Happy to be here in this ecosystem enabling us to become the founders we always dreamt of being! Let's go OneNine.dev #FoundersInc #Artifact26 #SanFrancisco

OneNine @OneNine_19X
OneNine 19X
OneNine tweet media
OneNine retweeted
Doudou BA 🐙 @doudou_19X
One of the best parts of Founders Inc. is the people. I met my two brothers: thoughtful, driven, and genuinely good humans. This is why community matters. Thank you @nourzahzah and @yousefhll for your time and for sharing.
Doudou BA 🐙 tweet media
OneNine retweeted
Doudou BA 🐙 @doudou_19X
Just recorded a podcast at Founders Inc #Artifact in SF 🎙️ @fdotinc We’re building @OneNine_19X: the data infrastructure enabling AI to understand low- and mid-resource languages. AI shouldn’t work for only a fraction of the world. This is how we fix that.
OneNine retweeted
Doudou BA 🐙 @doudou_19X
Can’t believe I just met YC Partner Gustaf Alstromer @gustaf! I’m a big fan : ) @ycombinator
Doudou BA 🐙 tweet media