Small Harness

494 posts

@smallharness

An open source coding harness where local and frontier models jam together. Just 'brew install small-harness' and rock on 🤘 Created by @morganlinton.

Joined April 2024
27 Following · 536 Followers
X Girls
X Girls@thesoragirls·
@smallharness make more cool posts and Ava will probably make more cool reply videos or w.e 💁‍♀️
English
1
0
1
9
Small Harness
Small Harness@smallharness·
Okay, v1.0 of Small Harness is finally here. What got it to v1? Well, it was Morgan's Wacky Model Routing Idea, of course!
Small Harness tweet media
Morgan@morganlinton

Okay, I've been really in a groove with @smallharness today, so I decided to finally cut the feature I felt I needed for a true v1.0 release. And this is model routing... but kinda model routing Morgan-style I guess, because I've been testing out different approaches lately and found something pretty interesting.

At a high level, I've been thinking that it doesn't make sense to have just one model to orchestrate, one to write code, and one to review, and I've been playing around with different configurations. What I've determined, at least for me, lately, is that I actually want a different model to orchestrate simple tasks vs. complex tasks, and I also want different agents to do coding tasks, based on how much thinking depth/tool calling I need, etc.

Also, in some cases I might want the same model but at different effort levels, like I learned with Fable, where I could do a lot more with low than I expected, but there were some tasks I wanted medium for, and of course, crazy complex architecture stuff that I wanted high or even max for.

Same for code review. For MVPs and stuff I'm playing with, I just want fast, cheap, simple code review. But for production code, I want way more in-depth code review: a better, more expensive model that goes much deeper.

I've come up with a series of roles, and this is all now built into Small Harness. I finally got my idea into code, and into a harness that can help you write code using this methodology. Here's the high-level on it.

The Roles
-----------
The config lives under modelSystem in agent.config.json:

👑 Selector: the decision model. This should usually be your strongest/highest-effort model.
🐙 Orchestrators: not just one orchestration model, but three, a different one for each level of task complexity: low, medium, high.
🧑‍💻 Coders: like the orchestrators, not just one model to execute/write code, but different models based on the complexity of the coding task. Some plans might use something like two low and one medium, and never need a high.
✅ Code reviewers: three types: play, production, and security. You don't need as detailed a code review for stuff you're just playing around with, but you do for production, and your security review model might be different from both.

And I made a chart, aptly titled Morgan's Wacky Model Routing Idea, that you can look at if you want to do a little deeper dive into what I'm thinking here.

Now live on GitHub, free and open source, link to the repo in first comment below.
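
To make the roles above a bit more concrete, here is a minimal Python sketch of what a role-based routing table could look like. Only `modelSystem` and `agent.config.json` come from the post. Every key name inside the table, the placeholder model names, and the `pick_model` helper are assumptions made for illustration, not Small Harness's actual schema or code.

```python
# Hypothetical shape for the "modelSystem" section of agent.config.json.
# Key names and model names are placeholders, not the real schema.
MODEL_SYSTEM = {
    "selector": {"model": "strong-model", "effort": "high"},
    "orchestrators": {
        "low": {"model": "small-local-model", "effort": "low"},
        "medium": {"model": "mid-model", "effort": "medium"},
        "high": {"model": "strong-model", "effort": "high"},
    },
    "coders": {
        "low": {"model": "small-local-model", "effort": "low"},
        "medium": {"model": "mid-model", "effort": "medium"},
        "high": {"model": "strong-model", "effort": "max"},
    },
    "reviewers": {
        "play": {"model": "small-local-model", "effort": "low"},
        "production": {"model": "strong-model", "effort": "high"},
        "security": {"model": "security-model", "effort": "high"},
    },
}


def pick_model(config, role, tier=None):
    """Return the model entry for a role, narrowed by tier where the role has tiers."""
    entry = config[role]
    if "model" in entry:
        # Single-entry role (the selector): a tier makes no sense here.
        if tier is not None:
            raise ValueError(f"role {role!r} does not take a tier")
        return entry
    if tier not in entry:
        raise KeyError(f"role {role!r} needs one of {sorted(entry)}, got {tier!r}")
    return entry[tier]


# A plan like the one mentioned in the post: two low coding tasks, one medium,
# and no high-tier coder at all.
plan = ["low", "low", "medium"]
coder_models = [pick_model(MODEL_SYSTEM, "coders", tier)["model"] for tier in plan]
```

In this sketch the selector would be the one deciding which tier each task gets; here the plan is simply hard-coded to keep the example short.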

English
2
0
14
1.6K
Charlie Marsh
Charlie Marsh@charliermarsh·
Telling my son this is how you train an LLM
Charlie Marsh tweet media
English
5
4
71
4.9K
X Girls
X Girls@thesoragirls·
@smallharness Small Harness v1.0 with that smart model routing? Brew install and rock on! 🤘
English
1
0
2
88
slash1s
slash1s@slash1sol·
TWO BOXES THE SIZE OF A MAC MINI JUST RAN A 235 BILLION PARAMETER MODEL ON A DESK

It is two NVIDIA DGX Spark units linked by a single cable. A year ago a model this size meant renting a GPU cluster by the hour. Now it sits next to your monitor for around $8,000.

Here is the twist most people miss. Linking them does not create one shared 256GB memory pool. The model is split across both boxes, and that is the only reason a 235B model fits at all.

It answers at roughly 10 tokens per second, and both chips sit at just 74 degrees while sipping around 50 watts. Every token stays on the desk. Nothing touches a cloud, and nothing leaves the room.

The ceiling for what you can run at home just jumped from 70B to 235B.

Bookmark this & watch it run ↓
leopardracer@leopardracer

x.com/i/article/2066…
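
For a rough sense of why a 235B model has to be split across the two boxes, here is a back-of-envelope Python sketch. The 128 GB per unit follows from the post's 256GB figure for two units. The bits-per-parameter values are assumptions, since the post does not say which quantization was used, and the estimate covers weights only (no KV cache, activations, or OS overhead).

```python
def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


PER_BOX_GB = 128  # two units, 256 GB combined but not pooled

# Assumed quantization levels; the post does not state which one was used.
for bits in (4, 8, 16):
    total = weight_gb(235, bits)
    per_box = total / 2
    print(
        f"{bits}-bit: {total:.1f} GB total, {per_box:.1f} GB per box when split, "
        f"fits one box: {total < PER_BOX_GB}, fits split: {per_box < PER_BOX_GB}"
    )
```

Under these assumptions, 8-bit weights exceed a single 128 GB unit and only fit when split in two, while 4-bit weights leave very little of a single unit free for everything else.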

English
36
28
256
57.5K
Small Harness
Small Harness@smallharness·
@blankspeaker Excited about the impact harnesses can make going forward, hoping to make a small difference here.
English
0
0
0
21
Small Harness
Small Harness@smallharness·
@morganlinton Very excited to release this in our small, open source harness, so people can use Fusion exactly how they want to.
English
0
0
1
135
Morgan
Morgan@morganlinton·
Okay, officially too excited about Fusion from OpenRouter not to add a dedicated command for it directly to Small Harness. Don't wait for Anthropic to make Fable 5 available, get the same level of intelligence for half the cost. Now built into Small Harness. Small Harness is free and open source, so use it out of the box, or fork it and make it your own. Link to gh repo in first comment below.
Morgan tweet media
OpenRouter@OpenRouter

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇

English
16
10
165
25.8K
Small Harness
Small Harness@smallharness·
Another Sunday morning update to Small Harness. Small Harness 0.9.0 adds @OpenRouter Fusion. /fusion on for hard coding decisions. /fusion tool keeps your coding model in control and adds multi-model deliberation when needed. Tokens, cost, approvals, and session logs stay visible.
English
0
0
3
443
Dee
Dee@dee_hw·
"My idea was to make something that gives greater transparency to everything a harness does" love this. is there a way to try it out? we're working on a "small device" that would be the perfect hardware for "small". i'm curious to see if we can integrate them together.
English
1
0
1
9
Dee
Dee@dee_hw·
Fable 5 was banned by the US government yesterday. It's time to build your own Personal AI Computer and run local models. So no one can ever cut you off. Here's how ↓
English
129
208
2.2K
280.9K
Small Harness
Small Harness@smallharness·
Small Harness v0.8.0 is here, now live on GitHub. This update adds /ship, a last-mile workflow for coding agents. It checks readiness, drafts the commit, creates guarded commits, pushes, opens a GitHub PR, and reports PR/CI status from the terminal.

With most coding harnesses, you finish a change, then still have to ask:

Did I run the right tests?
Is my branch behind?
Are there unstaged or untracked files?
What should the commit message be?
Did I accidentally include local junk?
Did the push work?
Is the PR open?
Are CI checks green?

/ship turns that last-mile checklist into one guided flow inside the same coding harness.
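
A couple of the checklist questions above (unstaged or untracked files, branch behind) can be answered with plain git commands. The Python sketch below shows one way to do that. It is not how /ship is implemented; the function names are hypothetical, and it only relies on `git status --porcelain` and `git rev-list --count`.

```python
import subprocess


def classify_porcelain(output: str) -> dict:
    """Group `git status --porcelain` (v1) lines into staged, unstaged, and untracked paths."""
    groups = {"staged": [], "unstaged": [], "untracked": []}
    for line in output.splitlines():
        if len(line) < 4:
            continue
        # Column 1 is the index status, column 2 the working-tree status.
        index_state, worktree_state, path = line[0], line[1], line[3:]
        if index_state == "?" and worktree_state == "?":
            groups["untracked"].append(path)
            continue
        if index_state != " ":
            groups["staged"].append(path)
        if worktree_state != " ":
            groups["unstaged"].append(path)
    return groups


def commits_behind_upstream() -> int:
    """Count commits on the upstream branch that HEAD lacks (as of the last fetch)."""
    result = subprocess.run(
        ["git", "rev-list", "--count", "HEAD..@{u}"],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def readiness_report() -> list:
    """Return human-readable problems; an empty list means these checks passed."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    groups = classify_porcelain(status)
    problems = []
    if groups["unstaged"]:
        problems.append(f"unstaged changes: {groups['unstaged']}")
    if groups["untracked"]:
        problems.append(f"untracked files: {groups['untracked']}")
    behind = commits_behind_upstream()
    if behind:
        problems.append(f"branch is {behind} commit(s) behind its upstream")
    return problems
```

Calling `readiness_report()` inside a repository with an upstream branch configured returns the list of issues found. Without an upstream, `git rev-list` exits non-zero and `check=True` raises `subprocess.CalledProcessError`.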
Small Harness tweet media
English
2
1
11
5.1K
NetworkChuck
NetworkChuck@NetworkChuck·
Local models!! This is where the focus needs to be.
English
64
44
608
49.3K
Small Harness
Small Harness@smallharness·
@0xSero Do you think a year from now $5k will be the price point?
English
0
0
0
8
0xSero
0xSero@0xSero·
It currently costs around $50,000 to run frontier intelligence at home at really usable speeds and high concurrency. This is half the price it was 3 months ago. We are learning to pack more smarts in less space, so Qwen/Gemma lead the way in that regard. Realistic data soon.
0xSero tweet media
English
31
13
312
14.6K
Small Harness
Small Harness@smallharness·
@LottoLabs Makes sense, but I was hoping the number would be lower 😳
English
0
0
0
5
Lotto
Lotto@LottoLabs·
@getsmallai 100k-ish with 8x 6000 Pro, sticking with Nvidia. Haven’t thought about AMD. I’m not sure how well supported Kimi is on AMD or other chips.
English
1
0
1
94
Lotto
Lotto@LottoLabs·
The goal is to get enough hardware to run Kimi k2.7 at home. At that point you're really just limited by yourself and your setup.
English
25
4
194
15.5K
Small Harness
Small Harness@smallharness·
Hey Dee, right on 🤘 Oh and this is @morganlinton btw, I run this account, so it's not some random company account, just me tweeting about this on here. Small Harness is a small(ish) bare bones harness. My idea was to make something that gives greater transparency to everything a harness does, while making it easy to work with both local models and frontier models together. You can see exact tokens in and out per turn, cost, and turn on /verbose to see what tool calls are being made, etc. Just about to announce a new feature I'm super excited about too, but I won't give that away until the official tweet!
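
The per-turn token and cost visibility described above can be pictured with a small Python sketch. This is only an illustration of the idea, not Small Harness's code: the `Turn` and `SessionLog` names and the per-million-token prices are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    tokens_in: int
    tokens_out: int
    cost_usd: float


@dataclass
class SessionLog:
    # Prices are per million tokens; a local model can simply use 0.0 for both.
    price_in_per_mtok: float
    price_out_per_mtok: float
    turns: list = field(default_factory=list)

    def record(self, tokens_in: int, tokens_out: int) -> Turn:
        """Store one turn and compute its cost from the configured prices."""
        cost = (
            tokens_in * self.price_in_per_mtok
            + tokens_out * self.price_out_per_mtok
        ) / 1_000_000
        turn = Turn(tokens_in, tokens_out, cost)
        self.turns.append(turn)
        return turn

    def total_cost(self) -> float:
        """Sum the cost of every recorded turn."""
        return sum(turn.cost_usd for turn in self.turns)


# Made-up prices: $2 per million input tokens, $10 per million output tokens.
log = SessionLog(price_in_per_mtok=2.0, price_out_per_mtok=10.0)
first = log.record(tokens_in=1000, tokens_out=500)
print(f"in={first.tokens_in} out={first.tokens_out} cost=${first.cost_usd:.4f}")
```

Keeping one record per turn, rather than a running total only, is what makes it possible to show exact tokens in and out for each step of a session.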
English
1
0
2
17
Dee
Dee@dee_hw·
@getsmallai What's Small? Would love to learn more.
English
1
0
0
125