Jon Tow
@jonbtow
31 posts
Joined January 2016
242 Following · 190 Followers
Jon Tow retweeted
Nous Research @NousResearch
Reinforcement learning in the era of LLMs requires scalable, distributed systems to push the boundaries of reasoning and alignment. Today we release Atropos, our RL environments framework: github.com/NousResearch/A…

Atropos is a rollout framework for reinforcement learning with foundation models that supports complex and diverse environments for advancing the capabilities of foundation models.

In Greek mythology, Atropos was the eldest of the three Fates. While her sisters spun and measured the threads of mortal lives, Atropos alone held the shears that would cut these threads, determining the final destiny of each soul. Just as Atropos guided souls to their ultimate fate, this system guides language models toward their optimal potential through reinforcement learning.

The work on Atropos was led by @dmayhem93 and built alongside @teknium, @rogershijin, @max_paperclips, @nullvaluetensor, @JSupa15, @artemsya and @karan4d.
Replies 49 · Reposts 148 · Likes 869 · Views 606.3K
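The tweet describes Atropos as a rollout framework pairing foundation models with scored environments. The repo's actual API is not shown in the thread, so the following is only a minimal hypothetical sketch of the core idea (the `Rollout`, `Environment`, and `reward_fn` names are invented for illustration): an environment hands prompts to a policy and scores the completions it gets back.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Rollout:
    """One scored trajectory: prompt, the policy's completion, and a reward."""
    prompt: str
    completion: str
    reward: float

@dataclass
class Environment:
    """Hypothetical rollout environment: a prompt set plus a scoring rule."""
    prompts: List[str]
    reward_fn: Callable[[str, str], float]

    def collect(self, policy: Callable[[str], str]) -> List[Rollout]:
        rollouts = []
        for prompt in self.prompts:
            completion = policy(prompt)  # query the model being trained
            rollouts.append(
                Rollout(prompt, completion, self.reward_fn(prompt, completion))
            )
        return rollouts

# Toy usage: reward completions that contain the prompt's last word.
env = Environment(
    prompts=["say hello", "say goodbye"],
    reward_fn=lambda p, c: 1.0 if p.split()[-1] in c else 0.0,
)
rollouts = env.collect(policy=lambda p: p.split()[-1] + "!")
```

A real framework would batch these rollouts and stream them to a trainer; the sketch only shows the environment-side contract.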
Jon Tow retweeted
dmayhem93 @dmayhem93
Glad to see async LLM RL getting more recognition! Our Towards System 2 paper highlights its importance under infrastructure: arxiv.org/abs/2501.04682. Also, NeoX already supports this (github.com/EleutherAI/gpt…) with shared memory, letting the same memory serve both training and inference to save VRAM, which makes in-flight weight updates instant.
🇺🇦 Dzmitry Bahdanau@DBahdanau

I am excited to open-source PipelineRL - a scalable async RL implementation with in-flight weight updates. Why wait until your bored GPUs finish all sequences? Just update the weights and continue inference! Code: github.com/ServiceNow/Pip… Blog: huggingface.co/blog/ServiceNo…

Replies 0 · Reposts 9 · Likes 55 · Views 6.1K
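Both tweets hinge on in-flight weight updates: the trainer publishes new weights while sequences are still decoding, instead of letting GPUs idle until every rollout finishes. A toy sketch of the idea; the `SharedModel` interface and the version tagging are illustrative inventions, not PipelineRL's or NeoX's actual API.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SharedModel:
    """Toy stand-in for weights shared between trainer and inference workers."""
    version: int = 0

    def next_token(self, prefix: List[str]) -> str:
        # A real model would run a forward pass; here we tag each token with
        # the weight version so the mid-sequence update is visible.
        return f"tok_v{self.version}"

def generate(model: SharedModel, steps: int, update_at: int) -> List[str]:
    """Decode `steps` tokens while the trainer pushes new weights mid-sequence.

    Synchronous RL would drain all in-flight sequences before updating;
    in-flight updates let later tokens come from the fresher policy.
    """
    out: List[str] = []
    for t in range(steps):
        if t == update_at:
            model.version += 1  # trainer publishes new weights, no restart
        out.append(model.next_token(out))
    return out

tokens = generate(SharedModel(), steps=4, update_at=2)
# The first two tokens come from version 0 weights, the rest from version 1.
```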
Jon Tow @jonbtow
6ND matches reality until n_ctx ~= 32K. Brought to you by "flex-attention breaks FlopCounterMode."
Replies 1 · Reposts 0 · Likes 5 · Views 527
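The 6ND rule of thumb (training FLOPs ≈ 6 × parameters × tokens) drops the attention term, which grows linearly with context length, so the approximation drifts once n_ctx gets large. A sketch of that accounting, using the common per-token estimate of roughly 2N + 2·n_layer·n_ctx·d_model forward FLOPs with backward costing about twice the forward; the 7B-style config below is a made-up example, not any specific model.

```python
def flops_per_token(n_params: float, n_layer: int, d_model: int,
                    n_ctx: int) -> dict:
    """Training FLOPs per token: the 6N shortcut vs. 6N plus attention."""
    dense = 6 * n_params                     # the 6ND approximation
    attention = 6 * n_layer * n_ctx * d_model  # grows linearly with context
    return {
        "6ND": dense,
        "exact": dense + attention,
        "attn_share": attention / (dense + attention),
    }

# Hypothetical ~7B-parameter config: 32 layers, d_model = 4096.
ctx4k = flops_per_token(7e9, n_layer=32, d_model=4096, n_ctx=4_096)
ctx32k = flops_per_token(7e9, n_layer=32, d_model=4096, n_ctx=32_768)
# At 4K context attention is a few percent of the total; at 32K it is
# over a third, which is where 6ND stops "matching reality."
```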
Jon Tow @jonbtow
god bless torch.testing._internal.common_distributed.spawn_threads_and_init_comms
Replies 0 · Reposts 0 · Likes 2 · Views 278
Jon Tow retweeted
Aran Komatsuzaki @arankomatsuzaki
Generative Reward Models: presents GenRM, an iterative algorithm that trains an LLM on self-generated reasoning traces, leading to synthetic preference labels that match human preference judgments. arxiv.org/abs/2410.12832
Replies 10 · Reposts 80 · Likes 427 · Views 41.2K
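The abstract describes an iterative loop: the model judges comparisons with self-generated reasoning traces, and those judgments become synthetic preference labels for the next training round. A hedged sketch of one such round; the majority-vote aggregation and the `judge` interface are my assumptions for illustration, not necessarily the paper's exact recipe.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]      # a (response_a, response_b) comparison
Judgment = Tuple[str, str]  # (reasoning_trace, label: "a" or "b")

def genrm_round(judge: Callable[[Pair], Judgment],
                pairs: List[Pair],
                samples: int = 4) -> List[Tuple[Pair, str]]:
    """Sample several reasoning traces per comparison and keep the
    majority-vote label as a synthetic preference; the labeled pairs
    would then fine-tune the judge before the next iteration."""
    labeled = []
    for pair in pairs:
        votes = [judge(pair)[1] for _ in range(samples)]
        label = max(set(votes), key=votes.count)  # majority vote
        labeled.append((pair, label))
    return labeled

# Toy judge that always prefers the longer response.
toy_judge = lambda pair: ("<trace>", "a" if len(pair[0]) >= len(pair[1]) else "b")
labels = genrm_round(toy_judge, [("a long, detailed answer", "meh")])
```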
Jon Tow retweeted
Hailey Schoelkopf @haileysch__
My favorite bit in this paper: @bbrabbasi and I wrote an appendix formalizing what is done when evaluating models with loglikelihood multiple choice and perplexity evals. AFAIK, none of this has been written up in one place; in most papers it has just been tacitly assumed!
EleutherAI@AiEleuther

Excited to share our new paper, Lessons From The Trenches on Reproducible Evaluation of Language Models! In it, we discuss common challenges we’ve faced evaluating LMs, and how our library the Evaluation Harness is designed to mitigate them 🧵 arxiv.org/abs/2405.14782

Replies 9 · Reposts 35 · Likes 239 · Views 36K
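For context, loglikelihood multiple choice scores each candidate answer by the log-probability the model assigns to it as a continuation of the prompt. A sketch of the two common variants, an unnormalized sum and a length-normalized mean; the stub scorer below stands in for a real model, and the exact normalization a given harness uses may differ.

```python
from typing import Callable, Dict, List

def pick_choice(loglik: Callable[[str, str], List[float]],
                prompt: str, choices: List[str]) -> Dict[str, str]:
    """Score each choice by its token log-probs given the prompt.
    'raw' sums them; 'normed' divides by token count so longer answers
    aren't penalized just for having more tokens."""
    per_token = {c: loglik(prompt, c) for c in choices}
    raw = {c: sum(lp) for c, lp in per_token.items()}
    normed = {c: sum(lp) / len(lp) for c, lp in per_token.items()}
    return {"raw_pick": max(raw, key=raw.get),
            "normed_pick": max(normed, key=normed.get)}

# Stub scorer with hand-written per-token log-probs instead of a model.
fake = {"Paris": [-0.2, -0.1], "the city of Lyon": [-0.5, -0.6, -0.7, -0.8]}
result = pick_choice(lambda q, c: fake[c],
                     "Q: What is the capital of France? A:",
                     ["Paris", "the city of Lyon"])
```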
Jon Tow @jonbtow
@abacaj No. Pure NL. The way Terry Winograd would like it.
Replies 0 · Reposts 1 · Likes 1 · Views 358
Jon Tow retweeted
Enrico Shippole @EnricoShippole
We are releasing trillions of high-quality, copyright-free, permissively licensed tokens and multimodal data. Be sure to follow our releases @TeraflopAI.
Replies 1 · Reposts 7 · Likes 44 · Views 5.1K
Jon Tow retweeted
Stability AI @StabilityAI
Stable LM 2 12B is a pair of powerful 12-billion-parameter language models trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch, featuring a base and an instruction-tuned model. You can now try the model here: huggingface.co/stabilityai/st… (1/3)
Replies 14 · Reposts 97 · Likes 520 · Views 225.5K
Jon Tow retweeted
Reshinth @reshinth_
How do we define diversity in the context of code LMs and programming languages? 1. Diversity is positively correlated with performance in solving a problem. 2. Shortcomings of diversity in small code LMs. 3. Code embedding models don't capture semantics. reshinthadithyan.github.io/blog/2023/code…
Replies 1 · Reposts 9 · Likes 24 · Views 4.5K
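Point 1 ties sample diversity to solve rate; the standard way to quantify that for code models is the pass@k estimator from the Codex paper (Chen et al., 2021). With n samples per problem of which c are correct, more diverse sampling tends to raise c and hence pass@k:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn (without replacement) from n generations, c of which
    are correct, solves the problem."""
    if n - c < k:
        return 1.0  # not enough incorrect samples to fill all k draws
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 100 samples and 5 correct: pass@1 = 0.05, pass@10 ≈ 0.416,
# i.e. drawing more (diverse) candidates sharply raises the solve rate.
p1 = pass_at_k(100, 5, 1)
p10 = pass_at_k(100, 5, 10)
```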
Jon Tow retweeted
Enrico Shippole @EnricoShippole
Introducing three new open-source PaLM models trained at a context length of 8k on C4. Open-sourcing LLMs is a necessity for the fair and equitable democratization of AI. The models of sizes 150m, 410m, and 1b are available to download and use here: github.com/conceptofmind/…
Replies 6 · Reposts 103 · Likes 519 · Views 164.5K
Jon Tow retweeted
Stella Biderman @BlancheMinerva
#ML #AI #NLProc Researchers, especially academics: I’m writing a paper about accessible finetuning of transformer models. If you’re looking to finetune a LLM, where on this list of compute access do you fall? If you don’t fall on this list, what resources are accessible to you?
Replies 63 · Reposts 43 · Likes 288 · Views 75.1K