lilpuke

186 posts

@inverse_ptr

building ai agents for manufacturers and distributors @ https://t.co/X3eWcba49E | 10k users @ https://t.co/mFzvmMzFbF

Joined November 2015
153 Following · 100 Followers
Hugh Han (@_hughhan)
@gdb oai is incentivized to do this because token costs go down as open source models and harnesses catch up. the bottleneck is compute, not the model. with this thesis, why would one commit 1-3 years to oai as opposed to a compute provider?
Replies: 2 · Reposts: 0 · Likes: 7 · Views: 1.1K
Greg Brockman
we are offering discounted tokens and certainty on capacity availability in exchange for 1-3 year commits. we expect that the world will feel increasingly capacity constrained for the next while, as models continue to get much more useful.
OpenAI (@OpenAI)

Introducing OpenAI Guaranteed Capacity: a new offering that enables customers to guarantee long-term access to OpenAI compute. We’ve made long-term investments in infrastructure, partnerships, and capacity planning to help customers scale reliably. Now, Guaranteed Capacity helps customers plan ahead for critical workloads in a compute-constrained world. openai.com/guaranteed-cap…

Replies: 91 · Reposts: 43 · Likes: 1.1K · Views: 299.6K
Ben Cohen (@blc_16)
MIT just released a new RL method called Pedagogical RL. The main lesson -> correct reasoning traces can still be bad training data.

It is a similar concept to teaching someone backprop. Say you have a tiny computation graph:

z = wx + b
a = ReLU(z)
L = (a - y)²

If you already understand backprop, you can jump straight to the gradient:

dL/dw = 2(a - y) · 1[z > 0] · x

The answer is correct but it skips the reasoning process. To get there, you need to break the computation into local pieces:

dL/da = 2(a - y)
da/dz = 1[z > 0]
dz/dw = x

Then backprop is just composing those local derivatives backward through the graph:

dL/dw = dL/da · da/dz · dz/dw = 2(a - y) · 1[z > 0] · x

Showing a student the final gradient does not teach them how to find gradients on new graphs. Even telling them “just use the chain rule” may be too large of a jump if they do not understand how to decompose the computation into intermediate nodes and local derivatives.

Reasoning RL has the same failure mode. A rollout can pass the verifier while containing one step the student model basically never would have taken. The trajectory gets the answer right, but the learning signal is brittle because the path is too far from the student’s current policy.

Pedagogical RL trains a privileged teacher that knows the answer, then rewards it for producing trajectories that stay learnable for the student. The trick is to use a spike-aware reward. It penalizes single huge surprise gaps in the trajectory, even when the average likelihood of the trajectory looks fine. Then the student learns with surprisal-gated imitation, where teacher tokens that are still too surprising get downweighted. The teacher is learning how to teach at the student’s current level.

Pedagogical RL makes RL more efficient by efficiently selecting trajectories the student is most ready to learn from. Less waiting for the model to get lucky rollouts. More training signal from examples that meet the student where it is.

Full blog in comments
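For readers who want to check the backprop analogy by hand, here is a minimal Python sketch of the tiny graph from the post. It is only an illustration of the chain-rule decomposition, not code from the Pedagogical RL work; the function names and the finite-difference check are assumptions added here.

```python
def forward(w, x, b, y):
    """Forward pass through the tiny graph: z -> a -> L."""
    z = w * x + b
    a = max(z, 0.0)            # ReLU
    loss = (a - y) ** 2
    return z, a, loss


def grad_w_chain_rule(w, x, b, y):
    """dL/dw composed from the three local derivatives in the post."""
    z, a, _ = forward(w, x, b, y)
    dL_da = 2.0 * (a - y)              # dL/da = 2(a - y)
    da_dz = 1.0 if z > 0 else 0.0      # da/dz = 1[z > 0]
    dz_dw = x                          # dz/dw = x
    return dL_da * da_dz * dz_dw


def grad_w_numeric(w, x, b, y, eps=1e-6):
    """Central finite difference, used only as a sanity check (hypothetical helper)."""
    _, _, loss_plus = forward(w + eps, x, b, y)
    _, _, loss_minus = forward(w - eps, x, b, y)
    return (loss_plus - loss_minus) / (2 * eps)


if __name__ == "__main__":
    # Example values chosen arbitrarily for illustration.
    w, x, b, y = 0.5, 2.0, 0.1, 1.0
    print("chain rule:", grad_w_chain_rule(w, x, b, y))
    print("numeric:   ", grad_w_numeric(w, x, b, y))
```

The two printed values should agree closely whenever z is not right at the ReLU kink, which is the same "compose the local pieces" step the post describes.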
Ben Cohen tweet media
Replies: 12 · Reposts: 65 · Likes: 445 · Views: 28K
Hugh Han (@_hughhan)
hey dude can you review this pr real quick
Hugh Han tweet media
Replies: 1 · Reposts: 0 · Likes: 3 · Views: 71
Rhys (@RhysSullivan)
what are the best pluggable storage implementations you've seen that let people bring their own storage to your code? figuring out the public sdk shape for Executor. the simple path of static source loading is straightforward, but dynamic tool loading requires state
Replies: 11 · Reposts: 1 · Likes: 50 · Views: 9.8K
Eric Wallerstein (@ericwallerstein)
@zkriv96 awaiting the inaugural x article or substack post from mr cfa. bless us with your knowledge
Replies: 2 · Reposts: 0 · Likes: 1 · Views: 343
Zach Krivine, CFA (@zkriv96)
There’s a narrative getting louder that China is the winner from this conflict as nations are now more incentivized to pursue renewables + China sells the cheapest clean tech. Can at least see that being the case for nations that can’t afford SPRs
Gregory Brew (@gbrew24)

The Carter Doctrine is gone, Hormuz is shut, OPEC is falling apart. The global oil economy is entering a new period of instability. And it's because of US action (or inaction). The consequences will be profound. My latest for @nytopinion nytimes.com/2026/05/02/opi…

Replies: 2 · Reposts: 0 · Likes: 2 · Views: 561
Hugh Han (@_hughhan)
macos caffeine + hotspot on the commutes to work to keep the local agents running
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 58
pav (@pavitarsaini)
I made a Chrome extension that turns localhost into a visual editor for Cursor. Click any element on your dev site → describe what you want changed → it automatically sends the edit request to Cursor in the background with the element context. Here's how it works...
Replies: 290 · Reposts: 394 · Likes: 8.2K · Views: 967.6K
Askronnie (@Askronniegg)
I MADE AN ENHANCE SHAMAN PVP GUIDE. #1 SHUFF World & 3v3 Rank 1 MAX SIMPLE. GO ONE SHOT NERDS: youtu.be/sX-Xt9fIZ9Y
Replies: 10 · Reposts: 8 · Likes: 97 · Views: 9.1K
Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr)
This paper claims LLMs are better at selecting successful founders than VCs: "We introduce VCBench, the first benchmark for predicting founder success in venture capital (VC)" ... "most models surpass human benchmarks"
Tanishq Mathew Abraham, Ph.D. tweet media
Replies: 163 · Reposts: 271 · Likes: 2.3K · Views: 520.2K
lilpuke (@inverse_ptr)
Claude is such a softie.. it told me to pack tissues for the wedding I’m going to in case I cry
Replies: 0 · Reposts: 0 · Likes: 3 · Views: 91
lilpuke (@inverse_ptr)
@omkvr Literally I don’t get it
Replies: 0 · Reposts: 0 · Likes: 1 · Views: 22
Omkar (@omkvr)
who the fuck is the target audience for the iphone air lmao like what
Replies: 1 · Reposts: 0 · Likes: 1 · Views: 147
lilpuke (@inverse_ptr)
Switching out my usual tech-doomerism with tech-optimism from now on. A premonition of the future: The next CEO of Tesla will be a dolphin who will communicate through a human-to-dolphin fine-tuned LLM translator and build underwater autonomous Uber submarines to corner the oceanic travel market.
Replies: 1 · Reposts: 0 · Likes: 2 · Views: 100
lilpuke (@inverse_ptr)
lilpuke tweet media
Replies: 0 · Reposts: 0 · Likes: 4 · Views: 253
lilpuke (@inverse_ptr)
AI can solve everything except for task management software. The heat death of the universe will come before the final evolution of Linear.
Replies: 0 · Reposts: 0 · Likes: 2 · Views: 134
lilpuke (@inverse_ptr)
More thoughts on why big tech has run out of work:
- All new work is iterative and small.
- All frameworks are mature, e.g. React isn’t being built again today. Databases are solved.
- Everything that needed to be built over the past decade was built, e.g. the hiring website is done, and they aren’t rebuilding anymore like they did back in the 2010s for on-prem -> cloud.
- AI code is small. Not a lot of surface area.
- Base models > fine-tuned models.
- Every ML team that had built custom “old” ML pipelines is gone, replaced by AI.
Replies: 1 · Reposts: 0 · Likes: 3 · Views: 205
lilpuke (@inverse_ptr)
There is not enough work to go around at big tech in 2025. This is not sustainable.
Replies: 1 · Reposts: 0 · Likes: 6 · Views: 276