Anil Thomas

19 posts

@anlthms

Time to build...

Joined May 2013
29 Following · 75 Followers
Anil Thomas
Anil Thomas@anlthms·
@omkizzy We used <suf_fim>{suffix}<pre_fim>{prefix}<mid_fim>{middle} (a.k.a. SPM variant 1) in pre-training. You should still be able to use <pre_fim><suf_fim>{suffix}<mid_fim>{prefix}{middle} during inference, as it is just PSM with an empty prefix.
1
0
1
39
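The formats named in this exchange can be sketched as prompt builders. This is a minimal illustration, not Essential AI's code: the sentinel strings are copied from the tweets, the PSM layout (`<pre_fim>{prefix}<suf_fim>{suffix}<mid_fim>`) is an assumption inferred from the "PSM with an empty prefix" remark, and the function names are hypothetical. A real tokenizer would likely treat the sentinels as special tokens rather than plain text.

```python
# Sentinel strings as written in the thread.
PRE, SUF, MID = "<pre_fim>", "<suf_fim>", "<mid_fim>"


def psm_prompt(prefix: str, suffix: str) -> str:
    """Prefix-Suffix-Middle (assumed layout): the model generates the middle."""
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"


def spm_v1_prompt(prefix: str, suffix: str) -> str:
    """SPM variant 1, the pre-training layout quoted in the thread."""
    return f"{SUF}{suffix}{PRE}{prefix}{MID}"


def spm_v2_prompt(prefix: str, suffix: str) -> str:
    """SPM variant 2: the prefix follows <mid_fim>, so the model simply
    continues the prefix with the middle."""
    return f"{PRE}{SUF}{suffix}{MID}{prefix}"


prefix = "def add(a, b):\n    return "
suffix = "\n\nprint(add(1, 2))\n"

# Variant 2 is a PSM prompt with an empty prefix, followed by the real prefix.
assert spm_v2_prompt(prefix, suffix) == psm_prompt("", suffix) + prefix

print(repr(spm_v2_prompt(prefix, suffix)))
```

Under these assumptions, the assertion spells out why the variant 2 prompt stays within what a PSM-capable model has seen: from the model's point of view the "middle" it is asked to produce is the prefix plus the true middle.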
omkaar
omkaar@omkizzy·
@anlthms awesome! It worked well in my simple vibe tests, so I just wanted to confirm: SPM is <pre_fim><suf_fim>{suffix}<mid_fim>{prefix}{middle}, right?
1
0
0
40
omkaar
omkaar@omkizzy·
Hi @_saurabh @_jainyash, rnj-1 came out really well. I see PSM working for code FIM; has there been SPM-format pre-training as well? SPM is better for KV caching.
1
1
0
368
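The KV-caching remark can be illustrated with a small sketch. It assumes the PSM layout `<pre_fim>{prefix}<suf_fim>{suffix}<mid_fim>` and the SPM layout confirmed in the reply, and it uses characters as a stand-in for tokens, so the numbers are only indicative. The helper names are hypothetical. The idea: when the user types more code before the cursor, only the part of the new prompt that differs from the old one needs fresh attention keys and values.

```python
import os

PRE, SUF, MID = "<pre_fim>", "<suf_fim>", "<mid_fim>"


def psm_prompt(prefix: str, suffix: str) -> str:
    # Assumed PSM layout: prefix first, then suffix.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"


def spm_prompt(prefix: str, suffix: str) -> str:
    # SPM layout from the thread: suffix first, prefix last.
    return f"{PRE}{SUF}{suffix}{MID}{prefix}"


def reusable_chars(old: str, new: str) -> int:
    """Length of the shared leading part, i.e. what a cache could keep."""
    return len(os.path.commonprefix([old, new]))


suffix = "\n    return total\n"
typed_before = "total = sum("
typed_after = "total = sum(values"  # the user typed six more characters

psm_old, psm_new = psm_prompt(typed_before, suffix), psm_prompt(typed_after, suffix)
spm_old, spm_new = spm_prompt(typed_before, suffix), spm_prompt(typed_after, suffix)

psm_reuse = reusable_chars(psm_old, psm_new)
spm_reuse = reusable_chars(spm_old, spm_new)

# PSM: the edit sits before the suffix, so the suffix must be recomputed.
print(f"PSM reuses {psm_reuse} of {len(psm_old)} cached characters")
# SPM: the new prompt only appends to the old one, so everything is reused.
print(f"SPM reuses {spm_reuse} of {len(spm_old)} cached characters")
```

In this sketch the SPM prompt grows by appending, so the old prompt is a strict leading part of the new one, while the PSM prompt diverges as soon as the prefix changes.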
Anil Thomas retweeted
Essential AI
Essential AI@essential_ai·
[1/2] We at Essential are driven by a mission to advance fundamental research guided by first principles, rigor, and sharing research openly.
1
10
31
5.4K
Anil Thomas retweeted
Essential AI
Essential AI@essential_ai·
Why run the same race when we can pioneer our own path? That's how we approach AI: by taking big bets and pushing on the foundations of AI 💥 Check out @ashVaswani's recent interview with @EconomicTimes
Essential AI tweet media
5
7
79
27.6K
Anil Thomas retweeted
Ashish Vaswani
Ashish Vaswani@ashVaswani·
Check out our latest research on data. We're releasing 24T tokens of richly labelled web data. We found it very useful for our internal data curation efforts. Excited to see what you build using Essential-Web v1.0!
Essential AI@essential_ai

[1/5] 🚀 Meet Essential-Web v1.0, a 24-trillion-token pre-training dataset with rich metadata built to effortlessly curate high-performing datasets across domains and use cases!

24
82
653
145.3K
Anil Thomas retweeted
Essential AI
Essential AI@essential_ai·
🗞️ We just launched our new landing page and dropped a fresh blog post on how LLMs learn to reflect and revise their thinking: In order to advance reasoning, it's vital to measure and understand its constituents, such as reflection. More to come - essential.ai
0
2
33
9.1K
Anil Thomas retweeted
Ashish Vaswani
Ashish Vaswani@ashVaswani·
Reinforcement learning has shown success in eliciting reflection from LLMs, but what if this capability actually manifests earlier in pre-training? We investigated this question and our results are surprising 👇 [1/4]
Ashish Vaswani tweet media
13
100
806
137.7K
Anil Thomas
Anil Thomas@anlthms·
@sama see you on the leaderboard next year
GIF
0
0
4
642
will depue
will depue@willdepue·
scaling has hit a wall and that wall is 100% eval saturation.
58
64
2K
355.1K
317070
317070@317070·
Did you know that you can build a virtual machine inside ChatGPT? And that you can use this machine to create files, program, and even browse the internet? engraved.blog/building-a-vir…
217
2.1K
7.8K
0
Anil Thomas
Anil Thomas@anlthms·
@317070 The output seems incorrect for non-trivial commands, but it does a pretty good job of hallucinating what the output might look like.
Anil Thomas tweet media
1
0
3
0
Anil Thomas retweeted
Luminide
Luminide@LuminideInc·
@karpathy You'd probably also like Luminide. It's just as easy to "get a GPU in the cloud", but also includes AI model dev features like Experiment Tracking and Hyperparameter Tuning. And it's available to individuals today! luminide.com/features
3
11
221
0