Lucas
@quantbagel
ml research, ex-quant 2x


My fellow researchers — an open question for everyone: what short-term research project grants do we know of that support scientific agendas aimed at progress on long-horizon problems, especially open-source work, and are accessible to researchers outside academia?









I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.
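The thread with SSD's details isn't shown here, but the base technique it builds on, speculative decoding, can be sketched in a few lines. A minimal greedy variant, with `draft_next` and `target_next` as hypothetical stand-ins for a cheap draft model and an expensive target model:

```python
# Toy greedy speculative decoding. SSD's actual algorithm is not in this
# post; this only illustrates the standard technique it builds on.
# `draft_next` / `target_next` are hypothetical toy models over tokens 0-9.

def draft_next(ctx):
    # Cheap draft model: predicts the next token as (last + 1) mod 10.
    return (ctx[-1] + 1) % 10

def target_next(ctx):
    # Expensive target model: agrees with the draft except after token 4,
    # where it predicts 7 instead of 5.
    return 7 if ctx[-1] == 4 else (ctx[-1] + 1) % 10

def speculative_step(ctx, k=4):
    """Draft k tokens cheaply, then accept the longest prefix the target
    agrees with, plus one token from the target itself (the standard
    guarantee: at least one target token per verification step)."""
    drafted, c = [], list(ctx)
    for _ in range(k):
        t = draft_next(c)
        drafted.append(t)
        c.append(t)
    accepted, c = [], list(ctx)
    for t in drafted:
        if target_next(c) != t:
            break
        accepted.append(t)
        c.append(t)
    # One "free" target token: either the correction or the next token.
    accepted.append(target_next(c))
    return ctx + accepted

print(speculative_step([0, 1, 2]))  # draft proposes 3,4,5,6; target rejects 5
```

The speedup comes from verifying all k drafted tokens in one target-model pass instead of k sequential passes; the toy above only shows the accept/reject bookkeeping.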


@quantbagel We modified the visual encoder somewhat similarly to add visual memory. The clever bit in our approach: it takes a variable frame count but produces a fixed output token count, and we preserve outputs identical to SigLIP when there's only one frame.
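The reply doesn't spell out the mechanism, but one construction with both stated properties (fixed token count for any frame count, exact single-frame identity) is per-position mean pooling over frames. A hypothetical sketch, with `encode_frame` standing in for a SigLIP-style per-frame encoder:

```python
import numpy as np

# Hypothetical sketch, not the authors' actual method: mean pooling across
# frames at each token position gives a fixed (N, D) output for any number
# of frames, and reduces to the plain per-frame encoder when T == 1.

def encode_frame(frame):
    # Stand-in for a SigLIP-style encoder: (H, W, C) image -> (N, D) tokens.
    return frame.reshape(-1, frame.shape[-1]).astype(np.float32)

def encode_clip(frames):
    """frames: list of (H, W, C) arrays, any length >= 1.
    Returns an (N, D) token grid whose shape is independent of len(frames)."""
    per_frame = np.stack([encode_frame(f) for f in frames])  # (T, N, D)
    return per_frame.mean(axis=0)                            # (N, D)

frame = np.arange(24, dtype=np.float32).reshape(2, 4, 3)
single = encode_clip([frame])              # identical to encode_frame(frame)
multi = encode_clip([frame, frame * 2.0])  # same shape, two frames
```

The single-frame identity is what makes the change drop-in: any checkpoint or evaluation built around the original image encoder behaves exactly as before.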








