Magic
@magicailabs

28 posts

Long-context, test-time compute, and e2e Reinforcement Learning to build a superhuman coding agent (that then builds the rest of AGI for us). Join us https://t.co/hGZKtUzsR3

San Francisco · Joined April 2022
0 Following · 15.8K Followers
Pinned Tweet
Magic @magicailabs
LTM-2-Mini is our first model with a 100 million token context window. That’s 10 million lines of code, or 750 novels. Full blog: magic.dev/blog/100m-toke… Evals, efficiency, and more ↓
171 replies · 426 retweets · 2.7K likes · 1.6M views
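The tweet's conversions can be sanity-checked with quick arithmetic. A minimal sketch, assuming roughly 10 tokens per line of code and ~133,000 tokens per novel (both rates are assumptions, not figures from the tweet):

```python
# Sanity-check the tweet's conversions for a 100M-token context window.
# Assumed rates (not from the tweet): ~10 tokens per line of code,
# ~133,000 tokens per novel (roughly a 100k-word book).
CONTEXT_TOKENS = 100_000_000
TOKENS_PER_LINE = 10        # assumption
TOKENS_PER_NOVEL = 133_000  # assumption

lines_of_code = CONTEXT_TOKENS // TOKENS_PER_LINE
novels = CONTEXT_TOKENS / TOKENS_PER_NOVEL
print(f"{lines_of_code:,} lines of code, ~{novels:.0f} novels")
```

Under those assumptions the window works out to 10 million lines of code and roughly 750 novels, matching the tweet's claim.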
Magic @magicailabs
Excited to announce we’re building an Applied Team focused on post-training. Come explore what's possible with our new (and still unreleased) LTM2 models and their 100M token context window. Apply here: magic.dev/careers/5652b4…
18 replies · 8 retweets · 113 likes · 42.5K views
Magic @magicailabs
Very excited to welcome @nvidia as Magic's latest investor! With their support, we’re looking forward to scaling long context and inference-time compute.
9 replies · 7 retweets · 155 likes · 44.2K views
Magic @magicailabs
With context solved, we now focus on unbounded inference-time compute as the next (and potentially last) breakthrough we believe is needed to build reliable AGI. Imagine if you could spend $100 and 10 minutes on one task and reliably get a great pull request for an entire feature. That's our goal.

We are 23 people (+ 8000 H100s) working on a single project: co-designing for long context, inference-time compute, and end-to-end RL to automate coding and research. Ben Chess (fmr. OpenAI supercomputing lead) just joined to help us scale, and we're hiring more engineers and researchers across ML, CUDA, infra, security, and more: magic.dev/careers
21 replies · 24 retweets · 453 likes · 51.3K views
Magic @magicailabs
Our LTM (Long Term Memory) mechanism needs >1,000x less compute and memory than Llama 3.1 405B's attention. Llama 3.1 would need 638 H100s *per user* to store a 100M token KV cache. LTM needs a small fraction of one.

SSMs, RNNs, and RAG all exploit weaknesses in evals like Needle In A Haystack, so we made a new eval, HashHop:
1) Incompressible
2) Multi-hop
3) No semantic hints
4) No recency bias
[image attached]
22 replies · 28 retweets · 400 likes · 54K views
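The 638-H100 figure can be approximated from Llama 3.1 405B's published attention shape (126 layers, 8 key/value heads via grouped-query attention, head dimension 128). The bf16 precision and 80 GB-per-GPU accounting below are assumptions, so this sketch lands near, not exactly on, the cited number:

```python
import math

# Rough KV-cache sizing for a 100M-token context with Llama 3.1 405B's
# attention shape. Precision and per-GPU memory are assumptions.
LAYERS = 126        # Llama 3.1 405B
KV_HEADS = 8        # grouped-query attention
HEAD_DIM = 128
BYTES = 2           # bf16 (assumption)
TOKENS = 100_000_000
H100_BYTES = 80e9   # 80 GB HBM per H100, assumed fully usable

per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES  # K and V planes
total = per_token * TOKENS
print(f"{total / 1e12:.1f} TB -> {math.ceil(total / H100_BYTES)} H100s")
```

Under these assumptions the cache is ~51.6 TB, on the order of 650 H100s' worth of HBM — the same ballpark as the tweet's 638 (the exact count depends on precision and memory-overhead accounting).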
Magic retweeted
Eric Steinberger @EricSteinb
Very excited to welcome @karpathy as Magic's latest investor!
39 replies · 46 retweets · 1.2K likes · 292.9K views
Magic retweeted
Hersh Desai @Hersh_Desai
The era of long context is upon us. The question is whether you want to be 1 of 1000 co-authors on the Gemini paper or 1 of <20 building at Magic.dev
Jeff Dean @JeffDean

Needle in a Haystack tests: The tech report also details a number of microbenchmark "needle in a haystack" tests (modeled after @GregKamradt's github.com/gkamradt/LLMTe…) that probe the model's ability to retrieve specific information from its context. For text, Gemini 1.5 Pro achieves 100% recall up to 530k tokens, 99.7% up to 1M tokens, and 99.2% accuracy up to 10M tokens.

3 replies · 6 retweets · 90 likes · 45K views
Magic retweeted
Hersh Desai @Hersh_Desai
I have been continuously in awe of the brilliance, tenacity, and kindness of @EricSteinb and the small but mighty team at Magic.dev. So much so that we've decided to invest $100m! If you're interested in building the future, please do reach out to me or the team!
[image attached]
Nat Friedman @natfriedman

Magic.dev has trained a groundbreaking model with many millions of tokens of context that performed far better in our evals than anything we've tried before. They're using it to build an advanced AI programmer that can reason over your entire codebase and the transitive closure of your dependency tree. If this sounds like magic... well, you get it.

Daniel and I were so impressed, we are investing $100M in the company today. The team is intensely smart and hard-working. Building an AI programmer is both self-evidently valuable and intrinsically self-improving. If this sounds interesting to you, consider joining them!

6 replies · 7 retweets · 70 likes · 30.7K views
Magic @magicailabs
@natfriedman Thank you for your support, Nat!! :)
2 replies · 0 retweets · 9 likes · 5.8K views
Nat Friedman @natfriedman
Magic.dev has trained a groundbreaking model with many millions of tokens of context that performed far better in our evals than anything we've tried before. They're using it to build an advanced AI programmer that can reason over your entire codebase and the transitive closure of your dependency tree. If this sounds like magic... well, you get it.

Daniel and I were so impressed, we are investing $100M in the company today. The team is intensely smart and hard-working. Building an AI programmer is both self-evidently valuable and intrinsically self-improving. If this sounds interesting to you, consider joining them!
[image attached]
77 replies · 182 retweets · 2.1K likes · 534.7K views
Magic retweeted
Eric Steinberger @EricSteinb
I love my team a lot and sometimes it’s stressful but life has never been so fulfilling. If you want to build AGI on a small team of people who care a lot with thousands of GPUs, please apply :) Magic.dev
Magic @magicailabs

We've raised $117M from @natfriedman and others to build an AI software engineer. Code generation is both a product and a path to AGI, requiring new algorithms, lots of CUDA, frontier-scale training, RL, and a new UI. We are hiring!

19 replies · 12 retweets · 180 likes · 87.4K views
Magic @magicailabs
If you want to solve very hard problems to build safe AGI on a small team with thousands of GPUs, come join us: Magic.dev!
4 replies · 5 retweets · 39 likes · 16.3K views
Magic @magicailabs
We've raised $117M from @natfriedman and others to build an AI software engineer. Code generation is both a product and a path to AGI, requiring new algorithms, lots of CUDA, frontier-scale training, RL, and a new UI. We are hiring!
[image attached]
44 replies · 85 retweets · 681 likes · 459.1K views
Magic retweeted
Riley Goodside @goodside
5M tokens of context. Let that sink in. Yes, there's caveats. But consider what's to come:
- Entire codebases in prompts
- Novel-length spec docs as instructions
- k-shots where k = 10K
- Few-shots where each "shot" is 50K LoC → diff

Those who declared the imminent death of prompt engineering before long-context models existed have betrayed a lack of imagination. We have not yet begun to prompt.
Magic @magicailabs

Meet LTM-1: LLM with *5,000,000 prompt tokens* That's ~500k lines of code or ~5k files, enough to fully cover most repositories. LTM-1 is a prototype of a neural network architecture we designed for giant context windows.

19 replies · 79 retweets · 617 likes · 120.1K views
Magic @magicailabs
Meet LTM-1: LLM with *5,000,000 prompt tokens* That's ~500k lines of code or ~5k files, enough to fully cover most repositories. LTM-1 is a prototype of a neural network architecture we designed for giant context windows.
52 replies · 176 retweets · 1K likes · 462.3K views
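The same quick arithmetic works for LTM-1's numbers, again assuming ~10 tokens per line of code and ~100 lines per file (both rates are assumptions, not figures from the tweet):

```python
# Sanity-check LTM-1's 5M-token prompt window against the tweet's
# ~500k lines / ~5k files. Both rates below are assumptions.
PROMPT_TOKENS = 5_000_000
TOKENS_PER_LINE = 10   # assumption
LINES_PER_FILE = 100   # assumption

lines = PROMPT_TOKENS // TOKENS_PER_LINE   # 500,000 lines
files = lines // LINES_PER_FILE            # 5,000 files
print(f"~{lines:,} lines across ~{files:,} files")
```

With those rates the window covers 500k lines in 5k files, consistent with the tweet's "~500k lines of code or ~5k files".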
Magic retweeted
Eric Steinberger @EricSteinb
AI with long-term memory! *A lot* of work left to do but happy to share a little more about what we've been up to. It's been incredibly fulfilling to work with a wonderful team and the trust of our backers towards this milestone. Thank you for the opportunity <3
Magic @magicailabs

Meet LTM-1: LLM with *5,000,000 prompt tokens* That's ~500k lines of code or ~5k files, enough to fully cover most repositories. LTM-1 is a prototype of a neural network architecture we designed for giant context windows.

13 replies · 7 retweets · 105 likes · 35.7K views
Magic @magicailabs
What’s next? More compute. LTM Nets see more context than GPTs, but LTM-1 has fewer parameters than today’s frontier models, making it less smart. Knowing how drastically model scale improves the performance of GPTs, we're excited to see how far we can take LTM Nets.
0 replies · 3 retweets · 57 likes · 12.5K views
Magic @magicailabs
How? We tried to scale standard GPT context windows but quickly got stuck. So, we designed a new approach: the Long-term Memory Network (LTM Net). Training and serving LTM Nets required a custom ML stack, from GPU kernels to how we distribute the model across a cluster.
2 replies · 7 retweets · 112 likes · 23.3K views