Alex Gajewski

98 posts

@apagajewski

Building a preschool for robots @pantographPBC. Previously cofounder @sfcompute, @ExaAILabs

San Francisco · Joined June 2014
907 Following · 2.4K Followers
Alex Gajewski retweeted
Standard Intelligence @si_pbc
Computer use models shouldn't learn from screenshots. We built a new foundation model that learns from video like humans do. FDM-1 can construct a gear in Blender, find software bugs, and even drive a real car through San Francisco using arrow keys.
[GIF]
Alex Gajewski retweeted
Mox SF @moxspace
Weird opp??? Mox used to be a city records center, so we inherited a pretty legit server room.
- 120A @ 240V
- 5-ton cooling unit
- 100kW diesel genny w/ 1000-gal tank
- 2.5Gb sym fiber
Can get it live in ~1 month. Who needs serious on-prem infra in SF?
[image]
Alex Gajewski retweeted
Pantograph @pantographPBC
Merry Christmas from your favorite robots! 🎄
Alex Gajewski @apagajewski
@gwern What do you imagine such a process being applied to at that level of overhead? Even at 10x overhead I have a hard time coming up with applications
Alex Gajewski @apagajewski
I wonder what you would get if you trained something Cycle-GAN-like between images and music. Probably possible today with the quality of generative models we have!
Alex Gajewski @apagajewski
The new Google image model is quite good, except that it doesn't like to draw physicists:
[two images]
Alex Gajewski @apagajewski
This one seems like a good idea to me; increasingly I think datasets and RL environments are the limiting factor:
Y Combinator @ycombinator

Devtools for AI Agents @dessaigne

AI agents are the next wave: autonomous tools that reason, decide, and amplify human productivity. We're funding startups building devtools for agents, whether you're creating agent builders or building blocks to perform complex tasks.
Alex Gajewski @apagajewski
@distributionat Yeah, things in that direction. I think OpenAI is likely to be a bit too conservative with what they let Operator do.
toucan @distributionat
@apagajewski By computer control do you mean something like Operator?
Alex Gajewski @apagajewski
Feels like a good time to start a computer control startup. The methods are generally known (RL on top of base models), and it probably doesn't require that much compute, just thoughtful environment design. I would probably start with a text-only representation of websites.
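The "text-only representation of websites" idea above can be sketched concretely. Here is a minimal, hypothetical illustration using only the standard library: flatten a page into visible text plus a numbered menu of interactive elements an agent could act on. The class name and observation format are invented for this sketch, not any particular product's scheme.

```python
from html.parser import HTMLParser

class TextObservation(HTMLParser):
    """Flatten HTML into a text observation: visible text, then a
    numbered list of interactive elements (links, inputs, buttons)."""

    def __init__(self):
        super().__init__()
        self.lines, self.actions = [], []

    def handle_starttag(self, tag, attrs):
        if tag in ("a", "button", "input"):
            a = dict(attrs)
            label = a.get("value") or a.get("href", "")
            self.actions.append(f"[{len(self.actions)}] <{tag}> {label}".strip())

    def handle_data(self, data):
        if data.strip():
            self.lines.append(data.strip())

    def observation(self):
        return "\n".join(self.lines + ["-- actions --"] + self.actions)

page = '<h1>Search</h1><input value="query"><button>Go</button><a href="/help">help</a>'
obs = TextObservation()
obs.feed(page)
print(obs.observation())
```

An RL environment would then accept actions like "click [2]" or "type into [0]", keeping the whole loop in text, which is the appeal of the representation.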
Alex Gajewski @apagajewski
I hope that somebody starts a company to make an AI-native smartwatch. It feels to me like the ideal form factor for most of what I want a language model to do.
Alex Gajewski @apagajewski
Very excited for this new cluster. Big enough to train R1, but it's running our combinatorial auction, so the prices should be rational.
evan conrad @evanjconrad

Hey friends, we're excited to announce that an additional 2,000 H100s will be added to @sfcompute's on-demand market. It's the largest* interconnected cluster, from any provider (including hyperscalers), that you can get on a per-hour basis.

You're not locked in with San Francisco Compute. If DeepSeek can compete with OpenAI using 2,000 H800s, you too can train a state-of-the-art RL model without ever having to sign a long-term contract that you can't exit. You could have trained DeepSeek-v3 for $4.5m over 1.5 months on SFC, or $35m if you could only buy a 1-year contract off market.

This was the dream Alex & I have had since our audio model company (Junelark) died because it couldn't procure enough GPUs, and it's what we've been working towards for nearly two years. Long-term contracts are a trap; they make it so only the biggest of the big can compete in AI. They force startup founders to raise at massive valuations pre-revenue, which dilutes founders and employees and sets them up to fail when they can't raise their next round.

This cluster will roll out over the next few weeks as we scale our infrastructure. Soon you'll be able to access it via our managed Kubernetes service or by reaching out to set up a custom solution. We're also exploring other ways of partnering with service providers to let them offer GPU-based services, like workers and inference endpoints, without being forced into a long-term contract with a hyperscaler. You no longer need to bet your company on GPU prices to offer GPU-based services.

* We think! If you know of a larger one, please correct us!
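The $4.5m-vs-$35m comparison in the quoted tweet follows from simple GPU-hour arithmetic. A rough sketch, assuming a ~$2/GPU-hour rate back-solved from the quoted totals (the rate is an assumption, not a published price):

```python
# Reconstruct the quoted cost comparison from GPU-hours.
GPUS = 2_000
RATE_PER_GPU_HOUR = 2.00   # assumed; inferred from the quoted figures
HOURS_PER_MONTH = 730      # average month

# Buy exactly the run you need on an on-demand market (~1.5 months):
on_demand = GPUS * RATE_PER_GPU_HOUR * HOURS_PER_MONTH * 1.5

# Locked into a 1-year contract at the same rate, paying for idle months:
contract = GPUS * RATE_PER_GPU_HOUR * HOURS_PER_MONTH * 12

print(f"on-demand: ${on_demand / 1e6:.1f}M")  # ≈ $4.4M
print(f"1-yr term: ${contract / 1e6:.1f}M")   # ≈ $35.0M
```

The ~8x gap is just the ratio of a 1.5-month run to a 12-month commitment; the per-hour rate cancels out of the comparison.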
Alex Gajewski @apagajewski
very excited the weights of o1 finally arrived
[image]
Alex Gajewski @apagajewski
One part of SF Compute we haven’t talked about very much yet is that post-AGI (presumably soon), the models will want to train more models. (Really, people will ask the first models to train more models, or perhaps to solve tasks that would benefit from, say, some custom RL). It will probably be most natural for those models to buy compute from a liquid market, where they can get precisely the compute they need for each run they need to do.
Alex Gajewski @apagajewski
Has anyone tried “sub-token attention”? Artificially increase the sequence length by including K copies of each token next to each other (say, each linearly projected by a different map), and let the different copies attend to each other. True self-attention :P (And then at the output project back to a single token to combine)
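The "sub-token attention" idea above is concrete enough to sketch: expand each token into K adjacent copies, each through a different linear map, self-attend over the expanded sequence, then pool each token's copies back into one vector. A minimal numpy sketch with random (untrained) projections, purely to illustrate the expand-attend-combine shape bookkeeping; all names here are invented:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sub_token_attention(x, k=3, seed=0):
    """x: (seq, d) token embeddings. Expand each token into k
    linearly-projected copies placed next to each other, run plain
    softmax self-attention over the k*seq sequence, then mean-pool
    each token's copies back to a single vector."""
    rng = np.random.default_rng(seed)
    seq, d = x.shape
    P = rng.standard_normal((k, d, d)) / np.sqrt(d)  # one map per copy
    copies = np.einsum("sd,kde->ske", x, P)          # (seq, k, d)
    h = copies.reshape(seq * k, d)                   # copies are adjacent
    attn = softmax(h @ h.T / np.sqrt(d))             # copies attend to each other
    out = attn @ h                                   # (seq * k, d)
    return out.reshape(seq, k, d).mean(axis=1)       # combine back to one token

x = np.random.default_rng(1).standard_normal((4, 8))
y = sub_token_attention(x, k=3)
print(y.shape)  # (4, 8)
```

Mean-pooling stands in for the output projection mentioned in the tweet; a learned (k*d, d) map would be the natural trained version.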