Mike Bilodeau

2.1K posts

@mj_bilodeau

Time and Tide | Marketing @Basetenco

SF · Joined March 2015
606 Following · 757 Followers
AT @AliesTaha
@philipkiely what is inference? how does it work? @philipkiely can i come to learn (and also maybe get ice-cream)?
Philip Kiely @philipkiely
Ice cream and books were a hit yesterday. ICYMI we're doing another, this time at the Ferry Building. Thursday 4/2 from 2-4 PM: luma.com/khxc93ju
Mike Bilodeau @mj_bilodeau
the AI sdr urge to start off every email by congratulating someone for the exact dollar amount and valuation of their last fundraise
Mike Bilodeau reposted
Madison Kanna @Madisonkanna
What is AI inference engineering, why is it such an in-demand skill, and how do you break into the field? With the author of Inference Engineering, @philipkiely, and head of training at Baseten, @oneill_c.
0:00 What is inference?
2:47 History of inference
4:59 Downstream effects of AI research on inference
13:54 What you'll learn from Inference Engineering
16:14 Advice for engineers transitioning into AI
19:00 Open source models driving inference growth
20:55 Specialization vs. frontier closed models
23:51 "Big Token" and the importance of open source AI
27:18 Where to get Inference Engineering
Monica L @mon__lim
addicted to @baseten ice cream. hope they make this a staple. inference has never tasted so good
Evan Moore @evancharles
Would love to know which genius in growth at @baseten is responsible for sponsoring an ice cream flavor
Zed @zeddotdev
Your AI code completions in Zed show up in ~200ms. That's Zeta, our Edit Prediction model, running on @baseten. We love partnering with companies who keep the bar high — Baseten is one of them.
Rachel Rapp @rapprach
Had a little too much caffeine this morning. Come say hi at KubeCon! Booth 585 💚
Mike Bilodeau reposted
Baseten @baseten
We are thrilled to welcome Sameer Paranjpye to lead our engineering organization. Welcome, Sameer! baseten.co/blog/welcome-s…
Austen Allred @Austen
New logo wall for our website. What do you think?
Lan @ad0rnai
finally met my favorite tech egirl @netcapgirl
Mike Bilodeau @mj_bilodeau
@oneill_c dang and here i thought karpathy just oneshotted you guys out of a job
Charlie O'Neill @oneill_c
Thoughts on what makes autoresearch work, and where you shouldn't expect magic.

It works once you have a clearly defined metric and a way to normalize experiments, i.e. usually wall-clock time; you can't use steps or FLOPs or tokens or whatever. This matters because, no matter how the model decides to try and reward hack, every experiment is directly comparable. For example, if you fix the steps and our autoresearch agent tries increasing the size of the model, you get fewer gradient updates per second; same with tokens. So you need to know what to fix, i.e. with this hardware and this constraint, what's the best result we can get? (This is also why Karpathy chooses bits per byte instead of cross-entropy loss, as you can change CE just by changing the vocab size.)

It then seems like everything else is a degree of freedom, but really you've fixed the hardest part of research: not the steps, but the eval. 98% of good research is coming up with the right questions to ask in the first place.

Of course, hill-climbing a metric/eval is useful, but autoresearch in its current form is to me a more general hyperparameter optimizer, where you implicitly define hyperparameters that include things like architecture and design decisions, not just what you can specify a grid search over in ints or floats. The way I personally use these sorts of loops is to keep a running list of ideas I want to try or hypotheses to test. The former are optimizing for a certain metric; the latter are often trying to figure out the contours of the problem I'm working on. Models tend to degenerate/collapse into really small niche changes without intervention from a good human researcher, as seen even in Karpathy's results, and they lack the creativity to continually drive new ideas forward based on previous results. It's like their value function is too vanilla.

I also think you basically need to constrain the agent to a single file; it gets confused and creates a lot of mess if you don't. This is part of the reason why having truss train push (our Baseten training product) as a constraint, even though it seems trivially the same as SSHing into a node, is important: it creates focus for the agent.

Finally, most people I know who have been taking advantage of LLMs in their work and research already run some sort of autoresearch loop, and have been doing so for ages. Things tend to go viral when Karpathy posts them, and he has figured out the minimal abstractions to run this, but it also needs to not be overhyped, and should be interpreted in the context of previous prompt-based optimization loops like GEPA.
Andrej Karpathy @karpathy

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:
- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)
The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc. github.com/karpathy/autor… Part code, part sci-fi, and a pinch of psychosis :)
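The loop both tweets describe, fix a hard per-run budget, let an agent mutate the training setup, and keep a change only when the fixed eval improves, can be sketched roughly as below. Every name here is a hypothetical stand-in, not the actual autoresearch code: `propose_edit` would be an LLM agent editing the training script, and `train_and_eval` would be a real fixed-wall-clock training run returning validation loss (a toy quadratic objective is used in its place).

```python
import random

def propose_edit(config: dict) -> dict:
    """Stand-in for the LLM agent: perturb one hyperparameter."""
    candidate = dict(config)
    key = random.choice(list(candidate))
    candidate[key] *= random.choice([0.5, 2.0])
    return candidate

def train_and_eval(config: dict) -> float:
    """Stand-in for one fixed-budget training run; returns val loss.
    Toy bowl-shaped objective with its optimum at lr=0.01, width=256."""
    return (config["lr"] - 0.01) ** 2 + ((config["width"] - 256) / 256) ** 2

def autoresearch(budget_runs: int = 50) -> tuple:
    """Each iteration is one complete run; accept only improvements,
    analogous to committing to the feature branch on a better val loss."""
    config = {"lr": 0.1, "width": 64.0}
    best_loss = train_and_eval(config)
    for _ in range(budget_runs):
        candidate = propose_edit(config)
        loss = train_and_eval(candidate)
        if loss < best_loss:  # "git commit" only when the eval improves
            config, best_loss = candidate, loss
    return config, best_loss
```

Because the run budget (not steps or tokens) is what's fixed, every candidate is directly comparable and a "bigger model, fewer updates" edit can't game the metric; the acceptance test makes the best loss monotonically non-increasing.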

Mike Bilodeau @mj_bilodeau
the creativity required to get this type of gain is what has always made me love infra (and our infra team). end-users of applications will never see it or interact with it, they'll just feel it when the products they love get better.
Rachel Rapp @rapprach

x.com/i/article/2034…
