Matan Grinberg

1.8K posts

@matanSF

ceo @FactoryAI

SF · Joined January 2021
579 Following · 15.4K Followers
Matan Grinberg
Matan Grinberg@matanSF·
@leerob i think most interesting would be:
1. model performance, with harness held constant
2. harness performance, with model held constant
1
0
21
589
Lee Robinson
Lee Robinson@leerob·
@matanSF The numbers come from the leaderboard, I'm not sure it would make sense to include other harnesses outside Codex/CC. Open to ideas if there's a better way to represent things fairly!
Lee Robinson tweet media
1
0
7
911
Matan Grinberg
Matan Grinberg@matanSF·
excited to announce the latest scores of OurAgent on OurAgentBench:
1. OurAgent
2. YourAgent
arxiv paper in bio
11
5
189
13.4K
Matan Grinberg
Matan Grinberg@matanSF·
@leerob i know i know, but leading with OurAgentBench as the hero shot is a bit meh... also gpt5.4 and opus4.6 do way better than 75 and 58 respectively
1
0
15
1.5K
Lee Robinson
Lee Robinson@leerob·
@matanSF Okay fine I'll bite... We have Terminal-Bench and SWE-bench Multilingual in the blog post.
Lee Robinson tweet media
6
2
71
7.2K
Matan Grinberg
Matan Grinberg@matanSF·
a company unironically named "Droid Factory" is out in sf trying to raise money
on a completely unrelated, not-petty note, excited to share my most recent domain purchase: droidfactory.com
12
0
110
15.9K
Matan Grinberg retweeted
Factory
Factory@FactoryAI·
Factory was built from day one to deploy wherever your code already lives. On laptops, in CI pipelines, on VMs, inside Kubernetes clusters, and in networks with zero outbound internet connectivity.
We support three deployment patterns, and you can mix them across teams:
1/ Cloud-managed
2/ Hybrid
3/ Fully air-gapped
@nvidia, the US Government, and the world's largest financial institutions run Factory self-hosted today.
Factory tweet media
5
9
127
7.9K
Dave Font
Dave Font@davefontenot·
who's the next up-and-coming ramtin naimi
12
2
49
9.8K
Désirée Cachette
Désirée Cachette@DesireeCachette·
@matanSF I can’t disclose the startup but I did this once to a girl in my accelerator once she told me the “new name” they were going to use but forgot to lock down the domain and social handles
1
0
2
1.3K
Matan Grinberg retweeted
Ivan Fioravanti ᯅ
Ivan Fioravanti ᯅ@ivanfioravanti·
I'm becoming a real Droid fan now! Here we have Claude Code vs Droid in Lego San Andreas glm-5-turbo edition!
- 0-shot
- same prompt, same model, same reasoning
- stealing vehicle doesn't work in cc version
- time of the day simulated in Droid version!
Look at the image! 🤯
Ivan Fioravanti ᯅ tweet media
18
13
138
15.5K
Matan Grinberg retweeted
Ivan Fioravanti ᯅ
Ivan Fioravanti ᯅ@ivanfioravanti·
Droid is really powerful 👀 Where did I live so far? @FactoryAI what's the magic behind this beast?
9
3
70
7.4K
Matan Grinberg
Matan Grinberg@matanSF·
Dune 3 🤝 Factory banning useEffect
Matan Grinberg tweet media
2
0
20
1.3K
Matan Grinberg
Matan Grinberg@matanSF·
"There is nothing new to be discovered in physics now. All that remains is more and more precise measurement." - Lord Kelvin, 1900, a few years before general relativity and quantum mechanics
Yuchen Jin@Yuchenj_UW

Some people at frontier AI labs told me they believe startups are over. OpenAI, Anthropic, Google, xAI will absorb every industry as AGI nears. Coding today, science, medicine, and finance next. Then everything else. If they’re right, that’s a pretty boring end of the world.

1
2
30
2.7K
Matan Grinberg
Matan Grinberg@matanSF·
@typesfast Idk I think Alexander and Aristotle probably spent three years just going with the flow not thinking much
0
0
18
2.8K
Ryan Petersen
Ryan Petersen@typesfast·
Marc severely underestimates the amount of time Alexander the Great spent reflecting on his conquests. He conquered 30+ kingdoms in 16 years. That left an average of 4-5 months on horseback as he marched to the next kingdom he was compelled to defeat. Plenty of time for introspection.
Ryan Petersen tweet media
98
111
2.7K
369.2K
Matan Grinberg retweeted
0xSero
0xSero@0xSero·
How Droid & I do all frontend work these days, I don't want the bots to touch code until they have something to compare their work against. This seems to work much better than just letting them do whatever they think is right.
10
7
121
8.9K