NOTBAD AI
@notbadai

32 posts
Joined July 2024
25 Following · 103 Followers

Pinned Tweet
NOTBAD AI @notbadai
We've open-sourced our internal AI coding IDE. We built this IDE to help with coding and to experiment with custom AI workflows. It's based on a flexible extension system, making it easy to develop, test, and tweak new ideas quickly. Each extension is a Python script that runs locally. (Links in replies) 🧶👇
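The extension API itself lives in the open-sourced repo; as a purely illustrative sketch of the "each extension is a local Python script" idea, here is a standalone script that reads a chat request from stdin and forwards it to a local OpenAI-compatible endpoint. The stdin/stdout protocol, URL, and model name are all assumptions, not the IDE's actual interface.

```python
# Illustrative only: a standalone local Python script in the spirit of a
# "script per extension" design. The real extension interface is defined
# in the open-sourced repo; nothing below is its actual API.
import json
import sys
import urllib.request

def main():
    # Read the request (e.g. chat messages) from stdin as JSON.
    request = json.load(sys.stdin)

    # Forward it to any OpenAI-compatible endpoint (the URL and model
    # name here are placeholders).
    payload = json.dumps({
        "model": "placeholder-model",
        "messages": request["messages"],
    }).encode()
    req = urllib.request.Request(
        "http://localhost:8000/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)

    # Write the model's reply to stdout for the host editor to render.
    print(reply["choices"][0]["message"]["content"])

if __name__ == "__main__":
    main()
```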
NOTBAD AI @notbadai
3/ You can quickly switch between different chat extensions and models using a dropdown menu. Our default chat extension uses @morphllm. For complex code where chat doesn't work as well, you can use extensions that suggest inline (multiline) autocompletions; we use @Alibaba_Qwen coder models for these. Extensions can also have simple UIs, for instance a Git commit UI that auto-suggests commit messages matching our format. 👇
NOTBAD AI retweeted
Georges Harik @gharik
Because we started from Quiet-STaR, we had developed and have been using a slightly different version of GRPO. We believe it may give a lower-variance gradient and therefore possibly improve stability. We will write this up later but wanted to share it now so other people can use the idea. For each problem, we produce N1 chains of thought; then, for each chain of thought, we produce N2 answers, which for some domains is pretty cheap because answers are much shorter. We compute an advantage per answer for each thought, normalize it inside the answer group, and use that to reinforce the answerer. After that, we compute a CoT advantage per thought within a problem and use that to reinforce the thought. Increasing the number of answers yields better (lower-variance) advantage estimates for both the answers and the thoughts.
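A minimal sketch of that two-level advantage computation, assuming one scalar reward per answer and GRPO-style mean/std normalization within each group; the tweet does not pin down the exact estimator, so treat the normalization details as assumptions:

```python
import numpy as np

def two_level_advantages(rewards, eps=1e-8):
    # rewards: shape (N1, N2), the reward of answer j sampled from
    # chain-of-thought i for a single problem.
    rewards = np.asarray(rewards, dtype=np.float64)

    # Level 1: normalize each answer's reward within its thought's
    # answer group; this advantage reinforces the answer tokens.
    # (eps keeps all-equal groups at advantage 0 instead of dividing by 0.)
    answer_adv = (rewards - rewards.mean(axis=1, keepdims=True)) / (
        rewards.std(axis=1, keepdims=True) + eps)

    # Level 2: score each thought by the mean reward of its N2 answers,
    # then normalize across the N1 thoughts of the same problem; this
    # advantage reinforces the chain-of-thought tokens.
    thought_reward = rewards.mean(axis=1)
    thought_adv = (thought_reward - thought_reward.mean()) / (
        thought_reward.std() + eps)

    return answer_adv, thought_adv

# Example: 4 thoughts x 8 answers with 0/1 correctness rewards.
rng = np.random.default_rng(0)
answer_adv, thought_adv = two_level_advantages(rng.integers(0, 2, size=(4, 8)))
```

Averaging over the N2 answers is what drives the variance reduction: each thought's advantage is computed from a mean of rewards rather than from a single sampled answer.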
NOTBAD AI retweeted
vpj @vpj
The new training also improved GPQA from 64.2% to 67.3% and MMLU Pro from 64.2% to 67.3%. This model was trained with the same reasoning datasets we used for the v1.0 model. During SFT we mixed in more general instruction data, with answers sampled from the Mistral-Small-24B-Instruct-2501 model, to reduce the degradation on IFEval; this seems to have helped the reasoning generalize to non-math, non-coding problems. The datasets and the models are available on @huggingface. Follow @notbadai for updates.
Quoting NOTBAD AI @notbadai: We are releasing an updated reasoning model with an improved IFEval score (77.9%, up from 51.4% for our previous model). 👇 Links to try the model and to download the weights below
NOTBAD AI @notbadai
We are releasing an updated reasoning model with an improved IFEval score (77.9%, up from 51.4% for our previous model). 👇 Links to try the model and to download the weights below
NOTBAD AI retweeted
Lambda @LambdaAPI
Multi-node NVIDIA HGX B200-accelerated clusters are available NOW, on-demand through Lambda 1-Click Clusters.
NOTBAD AI retweeted
Lambda @LambdaAPI
Innovate faster with self-serve, on-demand access to multi-node NVIDIA HGX B200-accelerated clusters.
NOTBAD AI @notbadai
We just released a Python coding reasoning dataset with 200k samples on @huggingface. It was generated by our RL-based self-improved Mistral 24B 2501 model and was used to train Notbad v1.0 Mistral 24B. 🤗 Links in replies 👇
NOTBAD AI retweeted
vpj @vpj
Uploaded the dataset of 270k math reasoning samples that we used to finetune Notbad v1.0 Mistral 24B (MATH-500 = 77.52%, GSM8k Platinum = 97.55%) to @huggingface (link in reply). Follow @notbadai for updates.
NOTBAD AI @notbadai
We're open-sourcing a math reasoning dataset with 270k samples, generated by our RL-based self-improved Mistral 24B 2501 model and used to train Notbad v1.0 Mistral 24B. Available on Hugging Face: huggingface.co/datasets/notba…
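For anyone who wants to pull the data, the standard datasets workflow should apply. The repo id below is a placeholder, since the link in the tweet is truncated; substitute the actual id from the Hugging Face link above.

```python
# Hypothetical usage sketch: the repo id is a placeholder, not the
# dataset's confirmed name; take the real id from the link above.
from datasets import load_dataset

ds = load_dataset("notbadai/REPLACE_WITH_ACTUAL_DATASET_ID", split="train")
print(len(ds))   # ~270k math reasoning samples, per the tweet
print(ds[0])     # inspect one sample's fields
```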
NOTBAD AI @notbadai
Thanks to @LambdaAPI and @deepinfra for providing help with compute resources for our research and training this model.
NOTBAD AI @notbadai
📢 We are excited to announce Notbad v1.0 Mistral 24B, a new reasoning model trained on math and Python coding. It is built on @MistralAI Small 24B 2501 and has been further trained with reinforcement learning on math and coding.