Jonas Eschmann
@jonas_eschmann
PhD student @UCBerkeley. Working on reinforcement learning for continuous control @rl_tools

Dynamic allocation / heap allocation is enemy number one. If your program is well designed, you know how many resources it will take before you run it. If you don't, it isn't a good program. Allocate everything on the stack.
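A minimal C++ sketch of the idea (the type and sizes are mine, not from the post): a fixed-capacity ring buffer whose entire footprint is a compile-time constant, so it can live on the stack and never touches the heap.

```cpp
#include <array>
#include <cstddef>

// Fixed-capacity ring buffer: its storage is part of the object itself,
// so there is no malloc/new anywhere and memory use cannot grow at runtime.
template <typename T, std::size_t Capacity>
struct RingBuffer {
    std::array<T, Capacity> data{};  // storage lives inside the object
    std::size_t head = 0, count = 0;

    bool push(const T& value) {
        if (count == Capacity) return false;  // full: caller decides, no realloc
        data[(head + count) % Capacity] = value;
        ++count;
        return true;
    }
    bool pop(T& out) {
        if (count == 0) return false;
        out = data[head];
        head = (head + 1) % Capacity;
        --count;
        return true;
    }
};

int main() {
    RingBuffer<float, 256> buf;  // sizeof(buf) is fixed: 256 floats + two indices
    buf.push(1.0f);
    float x;
    buf.pop(x);
}
```

Since sizeof(RingBuffer<float, 256>) is fully determined at compile time, the program's memory use is known before it runs, which is exactly the property the post is asking for.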

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains a litellm_init.pth with base64-encoded instructions to send every credential it can find to a remote server and to self-replicate. (.pth files in site-packages are processed automatically at interpreter startup, so the payload runs on any Python launch, not just when litellm is imported.) Link below.

Haha, geohot is tagging PRs with the line "ai slop" XD

Today I'm sharing a new research paper that explores an idea in Mixture-of-Experts architectures called "DynaMoE". DynaMoE is a Mixture-of-Experts framework where:
- the number of active experts per token is dynamic, and
- the total number of experts can be scheduled differently across layers.
In my experiments the best model uses a descending expert schedule, where the first layers have the most experts and the final layer has the fewest (1 expert). This removes the rigid top-k routing used in most MoE models and improves parameter efficiency and training stability. A sketch of both ideas is below. Paper: arxiv.org/abs/2603.01697
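The post doesn't spell out how the dynamic expert count is chosen per token, so here is a hedged C++ sketch of both ideas: a linearly descending per-layer expert schedule, plus one plausible dynamic routing rule (activate experts in order of router probability until a cumulative-probability threshold is crossed, a "top-p" analogue of top-k). All function names and the threshold rule are my assumptions, not taken from the paper.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// (1) Descending expert schedule: max_experts at layer 0, down to 1 expert
//     at the final layer, interpolated linearly.
std::size_t experts_at_layer(std::size_t layer, std::size_t num_layers,
                             std::size_t max_experts) {
    if (num_layers <= 1) return 1;
    double t = static_cast<double>(layer) / static_cast<double>(num_layers - 1);
    return static_cast<std::size_t>(std::lround(max_experts - t * (max_experts - 1)));
}

// (2) Dynamic active-expert count: softmax the router logits, then activate
//     the highest-probability experts until their cumulative probability
//     exceeds the threshold, so easy tokens use few experts and hard ones more.
std::vector<std::size_t> route_token(const std::vector<double>& router_logits,
                                     double cum_prob_threshold) {
    std::size_t n = router_logits.size();
    double max_logit = *std::max_element(router_logits.begin(), router_logits.end());
    std::vector<double> probs(n);
    double z = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        probs[i] = std::exp(router_logits[i] - max_logit);
        z += probs[i];
    }
    for (auto& p : probs) p /= z;
    std::vector<std::size_t> order(n);               // experts sorted by probability
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&](std::size_t a, std::size_t b) { return probs[a] > probs[b]; });
    std::vector<std::size_t> active;
    double cum = 0.0;
    for (std::size_t idx : order) {
        active.push_back(idx);
        cum += probs[idx];
        if (cum >= cum_prob_threshold) break;
    }
    return active;
}

int main() {
    // 12 layers, up to 8 experts: layer 0 -> 8 experts, layer 11 -> 1 expert.
    std::size_t n0  = experts_at_layer(0, 12, 8);
    std::size_t n11 = experts_at_layer(11, 12, 8);
    // A confident token activates one expert, an uncertain one activates several.
    auto few  = route_token({4.0, 0.1, 0.0, -0.2}, 0.9);
    auto many = route_token({0.3, 0.2, 0.1, 0.0}, 0.9);
    (void)n0; (void)n11; (void)few; (void)many;
}
```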

What if C++ were used for machine learning instead of Python, given that C++ has evolved to allow it? I wonder how much faster our AI models would be these days.
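For a taste of what that can look like, here is a toy sketch (entirely mine, not from the post) of a dense layer whose shapes are template parameters: every loop bound and buffer size is known to the compiler, which can then unroll and vectorize freely. This is roughly the approach header-only C++ libraries like RLtools take.

```cpp
#include <array>
#include <cstddef>

// Dense layer with compile-time shapes and fused ReLU; all storage is
// statically sized, so there is no heap allocation anywhere.
template <std::size_t In, std::size_t Out>
struct Dense {
    std::array<std::array<float, In>, Out> W{};
    std::array<float, Out> b{};

    std::array<float, Out> forward(const std::array<float, In>& x) const {
        std::array<float, Out> y{};
        for (std::size_t o = 0; o < Out; ++o) {   // bounds known at compile time
            float acc = b[o];
            for (std::size_t i = 0; i < In; ++i) {
                acc += W[o][i] * x[i];
            }
            y[o] = acc > 0.0f ? acc : 0.0f;       // fused ReLU
        }
        return y;
    }
};

int main() {
    Dense<4, 8> layer;  // the whole layer lives on the stack, sized statically
    std::array<float, 4> obs{0.1f, 0.2f, 0.3f, 0.4f};
    auto h = layer.forward(obs);
    (void)h;
}
```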

wow Qwen3.5-27B score on Humanity's Last Exam 🚀