Small Model Lab

4.4K posts

@SmallModelLab

Building small but mighty special purpose LLMs. Bigger isn’t always better.

Joined January 2009
339 Following · 4K Followers
Bindu Reddy @bindureddy
Gemma 4 is a very good small model that punches above its weight class. Gemma is a 31B model that is as good as other very large MoE models. It's the best in the world for its size 👏👏
Garry Tan @garrytan
Now launching GStack Browse. Before, /browse actions were headless by default, but now you can just ask for headed mode and you get a real steerable browser. The sidebar is an interactive Claude Code session that lets you navigate and run operations, and is an open source customizable version of what Comet or Atlas Browsers give you. It's connected to both the sidebar AND your origin Claude Code instance, so it's useful for things like page debugging and CSS interaction. Try it now: run /gstack-upgrade and then /open-gstack-browser
0xSero @0xSero
This didn't receive the attention it deserved. They pre-trained this model completely peer-to-peer, no data centers. Everything was done over a permissionless network. I have tried the model; it's honestly not a good LLM, but that's beside the point. We NEED this, we NEED an alternative.
- Download OpenCode
- Download Pi
- Pay for OpenSource
- Share your AI sessions
- Learn to do RL
We can't be at the mercy of ANY lab. arxiv.org/abs/2603.08163
NIK @ns123abc
🚨BREAKING: OpenAI has closed a $122 billion funding round at an $852 billion valuation. THE LARGEST PRIVATE FUNDING ROUND IN HISTORY
0xSero @0xSero
The first company to make AI boxes, with specialised AI models trained to fit on that hardware, will be the next Apple. Would you buy? Should I start a company doing this?
Small Model Lab @SmallModelLab
Super interesting data here, and amazing how long it takes to REAP these, but clearly worth it!
0xSero @0xSero

Qwen3.5, MiniMax-M2.7 are incredible acts of kindness that I don't think will be with us for much longer. Here's my update for you.

> I have 20 GPUs at full utilisation right now.

All these getting cooooompressed, no synthetic data. All runs will be done in 9 days, if I don't get a catastrophic failure.
- REAP for:
- GLM-5
- Qwen3-next-coder
- Qwen3.5-122B
- Qwen3.5-plus-397b
- Browser-use
- CUDA
- Terminal-use
- Coding
- Math
- Agentic trajectories
- 30% my personal chat session history

I am also removing refusals, inspired by Prism. So no more "I can't do this, I can't do that" blah blah.

Inference for local AI
- Qwen3.5-262B-REAP - I've been using it exclusively in Parchi, perfect 100 tokens/s & 0 errors, very good at browser use
-----------------
Secret
- Qwen3.5-27b - you will see when I'm done

Targeting the following hardware levels, with full 200-256k context in vllm, sglang, llama.cpp, exllamav3, and, if people help, MLX:
16-32 GB - Qwen3.5-27b
32-48 GB - Qwen3-coder-next
48-128 GB - Qwen3.5-122B
128-256 GB - Qwen3.5-Plus-397B
196-512 GB - GLM-5.*

I am training them on 22,000 samples at 16k context, 352M of custom selected calibration datasets. My hope is to make the highest quality multimodal LLM compressions for this year.

20 GPUs running in parallel for the next 10 days
- 8x H100s - Qwen
- 4x B200s - GLM-5.*
- 8x 3090s - Testing

Once MiniMax-M2.7 is online 4 more GPUs will get to work.
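
The hardware tiers and the calibration budget quoted in the post can be sketched in Python. This is only an illustration: the function names are hypothetical, the tier bounds are treated as inclusive (an assumption; the post does not say), and the budget check assumes "16k" means 16,000 tokens and that "352M" counts tokens.

```python
# Memory tiers copied from the post: (low GB, high GB, model name).
HARDWARE_TIERS_GB = [
    (16, 32, "Qwen3.5-27b"),
    (32, 48, "Qwen3-coder-next"),
    (48, 128, "Qwen3.5-122B"),
    (128, 256, "Qwen3.5-Plus-397B"),
    (196, 512, "GLM-5.*"),
]


def candidate_models(memory_gb):
    """Return every listed model whose range contains memory_gb.

    Bounds are inclusive by assumption, and ranges in the post overlap,
    so more than one model can match.
    """
    return [name for low, high, name in HARDWARE_TIERS_GB if low <= memory_gb <= high]


def calibration_tokens(samples, context_tokens):
    """Upper bound on calibration tokens if every sample fills the context."""
    return samples * context_tokens


if __name__ == "__main__":
    print(candidate_models(24))
    print(candidate_models(200))
    print(calibration_tokens(22_000, 16_000))
```

Under those assumptions, 22,000 samples at 16,000 tokens each come to 352,000,000, which is consistent with the 352M figure in the post.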

OpenAI @OpenAI
Today, we closed our latest funding round with $122 billion in committed capital at an $852B post-money valuation. The fastest way to expand AI’s benefits is to put useful intelligence in people’s hands early and let access compound globally. This funding gives us resources to lead at scale. openai.com/index/accelera…
Small Model Lab @SmallModelLab
@0xSero Super interesting, thanks for sharing all the details, and I sure hope they’re with us for at least a bit longer!!
Morgan @morganlinton
I'm noodling on a few different ways to gauge code quality for LLMs without going through one of the standard benchmarks. What I want to do is compare the code quality of different local LLMs against frontier models on different tasks, starting with small tasks and growing to larger ones. My thought process here is: there are a lot of really small, easy coding tasks that local LLMs can probably do just as well as frontier models. Then, of course, there are a lot of harder tasks that local LLMs do a very mediocre job on and frontier models nail. Curious how I could find the line so I can split tasks into buckets, i.e. "fine for a local LLM" and "needs a frontier model". I also don't want to reinvent the wheel, and think someone like @0xSero or @theo has probably done something like this? Curious what people think, and whether there's already a good set of evals/benchmarks to run to test code quality specifically as complexity goes up?
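
One rough way to picture the bucketing idea is the Python sketch below. Everything in it is an assumption for illustration: the `TaskResult` fields, the hand-assigned complexity levels, the pass rates, and both thresholds are hypothetical and do not come from any existing benchmark.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    task_id: str
    complexity: int            # hypothetical hand-assigned level, 1 = easiest
    local_pass_rate: float     # fraction of runs where the local model passed
    frontier_pass_rate: float  # same measurement for the frontier model


def bucket(result, min_local=0.9, max_gap=0.05):
    """Label a task 'local' if the local model is good enough and close to frontier."""
    gap = result.frontier_pass_rate - result.local_pass_rate
    if result.local_pass_rate >= min_local and gap <= max_gap:
        return "local"
    return "frontier"


def crossover_complexity(results):
    """Lowest complexity level with any task in the frontier bucket, else None."""
    levels = sorted({r.complexity for r in results if bucket(r) == "frontier"})
    return levels[0] if levels else None


if __name__ == "__main__":
    # Made-up numbers, only to show the shape of the data.
    results = [
        TaskResult("fizzbuzz", 1, 1.0, 1.0),
        TaskResult("csv-parser", 2, 0.95, 0.97),
        TaskResult("multi-file-refactor", 3, 0.4, 0.9),
    ]
    for r in results:
        print(r.task_id, bucket(r))
    print(crossover_complexity(results))
```

The thresholds are the interesting design choice: `min_local` sets how reliable the local model must be on its own, while `max_gap` sets how far behind the frontier model it is allowed to fall before the task moves to the other bucket.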
Quant Science @quantscience_
Python is mind-boggling for finance. Case in point: There's a Finance database of 300,000 tickers. Available 100% for free:
Morgan @morganlinton
I spent 10 minutes working on Thesium today. Codex spent over five hours, and moved a nice chunk of the backend from Python to Rust. Getting ready for bed and decided to run the benchmarks, and, as expected, Rust is fast 🔥
Morgan @morganlinton
Boom - successful overnight Codex run.
Small Model Lab retweeted
PyQuant News 🐍 @pyquantnews
Factor investing is what made me stand out at JPMorgan. But it took me years to master the information coefficient. In 1 minute, I'll teach you the 10 things you need to know (that took me 1 year to learn). Let's go:
Kanika @KanikaBK
🤯 HOLY SHIT. I wasted WEEKS on deep research before discovering this. I don't get why most people don't use PERPLEXITY for DEEP RESEARCH. HERE are 10 prompts that turn it into a PhD-level research assistant (and save you weeks of work):