Jon Tow
@jonbtow
31 posts
Joined January 2016
242 Following · 190 Followers
Jon Tow retweeted
Nous Research @NousResearch
Reinforcement learning in the era of LLMs requires scalable, distributed systems to push the boundaries of reasoning and alignment. Today we release Atropos, our RL environments framework: github.com/NousResearch/A…

Atropos is a rollout framework for reinforcement learning with foundation models that supports complex and diverse environments for advancing the capabilities of foundation models.

In Greek mythology, Atropos was the eldest of the three Fates. While her sisters spun and measured the threads of mortal lives, Atropos alone held the shears that would cut these threads, determining the final destiny of each soul. Just as Atropos guided souls to their ultimate fate, this system guides language models toward their optimal potential through reinforcement learning.

The work on Atropos was led by @dmayhem93 and built alongside @teknium, @rogershijin, @max_paperclips, @nullvaluetensor, @JSupa15, @artemsya and @karan4d.
Replies 49 · Reposts 148 · Likes 869 · Views 606.3K
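The tweet describes Atropos as a rollout framework pairing foundation models with scored environments. The repo's actual API is not shown in the thread, so the following is only a minimal hypothetical sketch of the core idea (the `Rollout`, `Environment`, and `reward_fn` names are invented for illustration): an environment hands prompts to a policy and scores the completions it gets back.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Rollout:
    """One scored trajectory: prompt, the policy's completion, and a reward."""
    prompt: str
    completion: str
    reward: float

@dataclass
class Environment:
    """Hypothetical rollout environment: a prompt set plus a scoring rule."""
    prompts: List[str]
    reward_fn: Callable[[str, str], float]

    def collect(self, policy: Callable[[str], str]) -> List[Rollout]:
        rollouts = []
        for prompt in self.prompts:
            completion = policy(prompt)  # query the model being trained
            rollouts.append(
                Rollout(prompt, completion, self.reward_fn(prompt, completion))
            )
        return rollouts

# Toy usage: reward completions that contain the prompt's last word.
env = Environment(
    prompts=["say hello", "say goodbye"],
    reward_fn=lambda p, c: 1.0 if p.split()[-1] in c else 0.0,
)
rollouts = env.collect(policy=lambda p: p.split()[-1] + "!")
```

A real framework would batch these rollouts and stream them to a trainer; the sketch only shows the environment-side contract.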
Jon Tow retweeted
dmayhem93 @dmayhem93
Glad to see async LLM RL getting more recognition! Our Towards System 2 paper highlights its importance under infrastructure: arxiv.org/abs/2501.04682. Also, NeoX already supports this (github.com/EleutherAI/gpt…) with shared memory, letting the same memory serve both training and inference to save VRAM, which makes in-flight weight updates instant.
🇺🇦 Dzmitry Bahdanau@DBahdanau

I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: github.com/ServiceNow/Pip… Blog: huggingface.co/blog/ServiceNo…

Replies 0 · Reposts 9 · Likes 55 · Views 6.1K
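Both tweets hinge on in-flight weight updates: the trainer publishes new weights while sequences are still decoding, instead of letting GPUs idle until every rollout finishes. A toy sketch of the idea; the `SharedModel` interface and the version tagging are illustrative inventions, not PipelineRL's or NeoX's actual API.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SharedModel:
    """Toy stand-in for weights shared between trainer and inference workers."""
    version: int = 0

    def next_token(self, prefix: List[str]) -> str:
        # A real model would run a forward pass; here we tag each token with
        # the weight version so the mid-sequence update is visible.
        return f"tok_v{self.version}"

def generate(model: SharedModel, steps: int, update_at: int) -> List[str]:
    """Decode `steps` tokens while the trainer pushes new weights mid-sequence.

    Synchronous RL would drain all in-flight sequences before updating;
    in-flight updates let later tokens come from the fresher policy.
    """
    out: List[str] = []
    for t in range(steps):
        if t == update_at:
            model.version += 1  # trainer publishes new weights, no restart
        out.append(model.next_token(out))
    return out

tokens = generate(SharedModel(), steps=4, update_at=2)
# The first two tokens come from version 0 weights, the rest from version 1.
```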
Jon Tow @jonbtow
6ND matches reality until n_ctx ~= 32K. Brought to you by "flex-attention breaks FlopCounterMode."
Replies 1 · Reposts 0 · Likes 5 · Views 527
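The 6ND rule of thumb (training FLOPs ≈ 6 × parameters × tokens) drops the attention term, which grows linearly with context length, so the approximation drifts once n_ctx gets large. A sketch of that accounting, using the common per-token estimate of roughly 2N + 2·n_layer·n_ctx·d_model forward FLOPs with backward costing about twice the forward; the 7B-style config below is a made-up example, not any specific model.

```python
def flops_per_token(n_params: float, n_layer: int, d_model: int,
                    n_ctx: int) -> dict:
    """Training FLOPs per token: the 6N shortcut vs. 6N plus attention."""
    dense = 6 * n_params                     # the 6ND approximation
    attention = 6 * n_layer * n_ctx * d_model  # grows linearly with context
    return {
        "6ND": dense,
        "exact": dense + attention,
        "attn_share": attention / (dense + attention),
    }

# Hypothetical ~7B-parameter config: 32 layers, d_model = 4096.
ctx4k = flops_per_token(7e9, n_layer=32, d_model=4096, n_ctx=4_096)
ctx32k = flops_per_token(7e9, n_layer=32, d_model=4096, n_ctx=32_768)
# At 4K context attention is a few percent of the total; at 32K it is
# over a third, which is where 6ND stops "matching reality."
```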
Jon Tow @jonbtow
god bless torch.testing._internal.common_distributed.spawn_threads_and_init_comms
Replies 0 · Reposts 0 · Likes 2 · Views 278
Jon Tow retweeted
Aran Komatsuzaki @arankomatsuzaki
Generative Reward Models: presents GenRM, an iterative algorithm that trains an LLM on self-generated reasoning traces, leading to synthetic preference labels that match human preference judgments. arxiv.org/abs/2410.12832
Replies 10 · Reposts 80 · Likes 427 · Views 41.2K
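The abstract describes an iterative loop: the model judges comparisons with self-generated reasoning traces, and those judgments become synthetic preference labels for the next training round. A hedged sketch of one such round; the majority-vote aggregation and the `judge` interface are my assumptions for illustration, not necessarily the paper's exact recipe.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]      # a (response_a, response_b) comparison
Judgment = Tuple[str, str]  # (reasoning_trace, label: "a" or "b")

def genrm_round(judge: Callable[[Pair], Judgment],
                pairs: List[Pair],
                samples: int = 4) -> List[Tuple[Pair, str]]:
    """Sample several reasoning traces per comparison and keep the
    majority-vote label as a synthetic preference; the labeled pairs
    would then fine-tune the judge before the next iteration."""
    labeled = []
    for pair in pairs:
        votes = [judge(pair)[1] for _ in range(samples)]
        label = max(set(votes), key=votes.count)  # majority vote
        labeled.append((pair, label))
    return labeled

# Toy judge that always prefers the longer response.
toy_judge = lambda pair: ("<trace>", "a" if len(pair[0]) >= len(pair[1]) else "b")
labels = genrm_round(toy_judge, [("a long, detailed answer", "meh")])
```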
Jon Tow retweeted
Hailey Schoelkopf @haileysch__
My favorite bit in this paper: @bbrabbasi and I wrote an appendix formalizing what is done when evaluating models with loglikelihood multiple choice and perplexity evals. AFAIK, none of this has been written up in one place; in most papers it has just been tacitly assumed!
EleutherAI@AiEleuther

Excited to share our new paper, Lessons From The Trenches on Reproducible Evaluation of Language Models! In it, we discuss common challenges we’ve faced evaluating LMs, and how our library the Evaluation Harness is designed to mitigate them 🧵 arxiv.org/abs/2405.14782

Replies 9 · Reposts 35 · Likes 239 · Views 36K
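For context, loglikelihood multiple choice scores each candidate answer by the log-probability the model assigns to it as a continuation of the prompt. A sketch of the two common variants, an unnormalized sum and a length-normalized mean; the stub scorer below stands in for a real model, and the exact normalization a given harness uses may differ.

```python
from typing import Callable, Dict, List

def pick_choice(loglik: Callable[[str, str], List[float]],
                prompt: str, choices: List[str]) -> Dict[str, str]:
    """Score each choice by its token log-probs given the prompt.
    'raw' sums them; 'normed' divides by token count so longer answers
    aren't penalized just for having more tokens."""
    per_token = {c: loglik(prompt, c) for c in choices}
    raw = {c: sum(lp) for c, lp in per_token.items()}
    normed = {c: sum(lp) / len(lp) for c, lp in per_token.items()}
    return {"raw_pick": max(raw, key=raw.get),
            "normed_pick": max(normed, key=normed.get)}

# Stub scorer with hand-written per-token log-probs instead of a model.
fake = {"Paris": [-0.2, -0.1], "the city of Lyon": [-0.5, -0.6, -0.7, -0.8]}
result = pick_choice(lambda q, c: fake[c],
                     "Q: What is the capital of France? A:",
                     ["Paris", "the city of Lyon"])
```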
Jon Tow @jonbtow
@abacaj No. Pure NL. The way Terry Winograd would like it.
Replies 0 · Reposts 1 · Likes 1 · Views 358
Jon Tow retweeted
Enrico Shippole @EnricoShippole
We are releasing trillions of high-quality, copyright-free, permissively licensed tokens and multimodal data. Be sure to follow our releases @TeraflopAI.
Replies 1 · Reposts 7 · Likes 44 · Views 5.1K
Jon Tow retweeted
Stability AI @StabilityAI
Stable LM 2 12B is a pair of powerful 12-billion-parameter language models trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch, featuring a base and an instruction-tuned model. You can now try the model here: huggingface.co/stabilityai/st… (1/3)
Replies 14 · Reposts 97 · Likes 520 · Views 225.5K
Jon Tow retweeted
Reshinth @reshinth_
How do we define diversity in the context of code LMs and programming languages? 1. Diversity is positively correlated with performance in solving a problem. 2. Shortcomings of diversity in small code LMs. 3. Code embedding models don't capture semantics. reshinthadithyan.github.io/blog/2023/code…
Replies 1 · Reposts 9 · Likes 24 · Views 4.5K
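Point 1 ties sample diversity to solve rate; the standard way to quantify that for code models is the pass@k estimator from the Codex paper (Chen et al., 2021). With n samples per problem of which c are correct, more diverse sampling tends to raise c and hence pass@k:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn (without replacement) from n generations, c of which
    are correct, solves the problem."""
    if n - c < k:
        return 1.0  # not enough incorrect samples to fill all k draws
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 100 samples and 5 correct: pass@1 = 0.05, pass@10 ≈ 0.416,
# i.e. drawing more (diverse) candidates sharply raises the solve rate.
p1 = pass_at_k(100, 5, 1)
p10 = pass_at_k(100, 5, 10)
```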
Jon Tow retweeted
Enrico Shippole @EnricoShippole
Introducing three new open-source PaLM models trained at a context length of 8k on C4. Open-sourcing LLMs is a necessity for the fair and equitable democratization of AI. The models of sizes 150m, 410m, and 1b are available to download and use here: github.com/conceptofmind/…
Replies 6 · Reposts 103 · Likes 519 · Views 164.5K
Jon Tow retweeted
Stella Biderman @BlancheMinerva
#ML #AI #NLProc Researchers, especially academics: I’m writing a paper about accessible finetuning of transformer models. If you’re looking to finetune a LLM, where on this list of compute access do you fall? If you don’t fall on this list, what resources are accessible to you?
Replies 63 · Reposts 43 · Likes 288 · Views 75.1K