Small Harness (@smallharness) - Twitter Profile

Small Harness@smallharness·8h

@thesoragirls Challenge accepted 🫡

0

1

11

X Girls@thesoragirls·9h

@smallharness make more cool posts and Ava will probably make more cool reply videos or w.e 💁‍♀️

1

0

1

9

Small Harness@smallharness·9h

Okay, v1.0 of Small Harness is finally here. What got it to v1? Well, it was Morgan's Wacky Model Routing Idea, of course!

Morgan@morganlinton

Okay, I've been really in a groove with @smallharness today, so decided to finally cut the feature I felt like I needed for a true v1.0 release. And this is model routing...but kinda model routing Morgan-style I guess, because I've been testing out different approaches lately, and found something pretty interesting.

At a high level, I've been thinking that it doesn't make sense to have one model to orchestrate, one to write code, and one to review, and I've been playing around with different configurations. What I've determined, at least for me, lately, is that I actually want a different model to orchestrate simple tasks vs. complex tasks, and I also want different agents to do coding tasks, based on how much thinking depth/tool calling I need, etc.

Also in some cases, I might want the same model but at different effort levels, like I learned with Fable where I could do a lot more with low than I expected, but there were some tasks I wanted medium for, and of course, crazy complex architecture stuff that I wanted high or even max for.

Same for code review. For MVPs and stuff I'm playing with, I just want fast and cheap, simple code review. But for production code, I want way more in-depth code review: a better, more expensive model that goes much deeper.

I've come up with a series of roles, and this is all now built into Small Harness. Finally got my idea into code, and into a harness that can help you write code using this methodology. Here's the high-level on it.

The Roles
-----------

The config lives under modelSystem in agent.config.json:

👑 Selector: the decision model. This should usually be your strongest/highest-effort model.
🐙 Orchestrators: not just one orchestration model, but three, a different one for each level of task complexity: low, medium, high.
🧑‍💻 Coders: like the orchestrators, not just one model to execute/write code, but different models based on the complexity of the coding task. Some plans might use something like two low and one medium, and never need a high.
✅ Code reviewers: three types: play, production, and security. You don't need as detailed code review for stuff you're just playing around with, but you do for production, and your security review model might be different from both.

And I made a chart, aptly titled Morgan's Wacky Model Routing Idea, that you can look at if you want to do a little deeper dive into what I'm thinking here.

Now live on GitHub, free and open source, link to the repo in first comment below.
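To make the roles in the post above concrete, here is a minimal TypeScript sketch of what a `modelSystem` section and a per-step coder lookup could look like. The post only names the roles and their tiers; every field name (`selector`, `orchestrators`, `coders`, `reviewers`, `effort`), every model identifier, and the `assignCoders` helper are assumptions made for illustration, not Small Harness's actual schema or code.

```typescript
// Hypothetical shape for the `modelSystem` section of agent.config.json.
// All field names and model identifiers are illustrative assumptions.
type Complexity = "low" | "medium" | "high";
type ReviewKind = "play" | "production" | "security";

interface ModelRef {
  model: string; // placeholder identifier, not a real model name
  effort?: "low" | "medium" | "high" | "max";
}

interface ModelSystem {
  selector: ModelRef; // the decision model
  orchestrators: Record<Complexity, ModelRef>; // one per task complexity
  coders: Record<Complexity, ModelRef>; // one per coding complexity
  reviewers: Record<ReviewKind, ModelRef>; // play, production, security
}

interface PlanStep {
  description: string;
  complexity: Complexity;
}

const modelSystem: ModelSystem = {
  selector: { model: "strong-model", effort: "high" },
  orchestrators: {
    low: { model: "small-model" },
    medium: { model: "mid-model" },
    high: { model: "strong-model", effort: "high" },
  },
  coders: {
    // Same model at different effort levels, as the post suggests is possible.
    low: { model: "mid-model", effort: "low" },
    medium: { model: "mid-model", effort: "medium" },
    high: { model: "strong-model", effort: "max" },
  },
  reviewers: {
    play: { model: "small-model" },
    production: { model: "strong-model", effort: "high" },
    security: { model: "security-model" },
  },
};

// Look up a coder for each step based on the complexity assigned to that step.
function assignCoders(steps: PlanStep[], system: ModelSystem): ModelRef[] {
  return steps.map((step) => system.coders[step.complexity]);
}

// A plan with two low steps and one medium step never touches the high coder.
const plan: PlanStep[] = [
  { description: "rename a helper", complexity: "low" },
  { description: "update a test fixture", complexity: "low" },
  { description: "add a cache layer", complexity: "medium" },
];
const assigned = assignCoders(plan, modelSystem);
console.log(assigned.map((m) => `${m.model}@${m.effort ?? "default"}`));
```

In this sketch the complexity of each step is taken as given; in the design the post describes, deciding it would be the selector's job.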

2

0

14

1.7K

Small Harness@smallharness·9h

@charliermarsh Brings back some great memories

0

2

60

Charlie Marsh@charliermarsh·11h

Telling my son this is how you train an LLM

5

4

72

5K

Small Harness@smallharness·9h

@thesoragirls Ha, I’m finally cool enough to get my own video 🥳

1

0

1

24

X Girls@thesoragirls·9h

@smallharness Small Harness v1.0 with that smart model routing? Brew install and rock on! 🤘

1

0

2

88

Small Harness@smallharness·9h

@dedene @morganlinton Do it Peter, would be an honor to have a PR from you! 🤗

0

1

9

Peter Dedene@dedene·9h

@morganlinton @smallharness 🙏 awesome! If you’re open to it, I’m happy to help and see if I can make a PR?

2

0

2

12

Morgan@morganlinton·9h

Okay, I've been really in a groove with @smallharness today, so decided to finally cut the feature I felt like I needed for a true v1.0 release. And this is model routing...but kinda model routing Morgan-style I guess, because I've been testing out different approaches lately, and found something pretty interesting.

At a high level, I've been thinking that it doesn't make sense to have one model to orchestrate, one to write code, and one to review, and I've been playing around with different configurations. What I've determined, at least for me, lately, is that I actually want a different model to orchestrate simple tasks vs. complex tasks, and I also want different agents to do coding tasks, based on how much thinking depth/tool calling I need, etc.

Also in some cases, I might want the same model but at different effort levels, like I learned with Fable where I could do a lot more with low than I expected, but there were some tasks I wanted medium for, and of course, crazy complex architecture stuff that I wanted high or even max for.

Same for code review. For MVPs and stuff I'm playing with, I just want fast and cheap, simple code review. But for production code, I want way more in-depth code review: a better, more expensive model that goes much deeper.

I've come up with a series of roles, and this is all now built into Small Harness. Finally got my idea into code, and into a harness that can help you write code using this methodology. Here's the high-level on it.

The Roles
-----------

The config lives under modelSystem in agent.config.json:

👑 Selector: the decision model. This should usually be your strongest/highest-effort model.
🐙 Orchestrators: not just one orchestration model, but three, a different one for each level of task complexity: low, medium, high.
🧑‍💻 Coders: like the orchestrators, not just one model to execute/write code, but different models based on the complexity of the coding task. Some plans might use something like two low and one medium, and never need a high.
✅ Code reviewers: three types: play, production, and security. You don't need as detailed code review for stuff you're just playing around with, but you do for production, and your security review model might be different from both.

And I made a chart, aptly titled Morgan's Wacky Model Routing Idea, that you can look at if you want to do a little deeper dive into what I'm thinking here.

Now live on GitHub, free and open source, link to the repo in first comment below.

9

1

15

3.5K

Small Harness@smallharness·12h

@slash1sol Totally wild.

0

76

slash1s@slash1sol·16h

TWO BOXES THE SIZE OF A MAC MINI JUST RAN A 235 BILLION PARAMETER MODEL ON A DESK It is two NVIDIA DGX Spark units linked by a single cable. A year ago a model this size meant renting a GPU cluster by the hour. Now it sits next to your monitor for around $8,000. Here is the twist most people miss. Linking them does not create one shared 256GB memory pool. The model is split across both boxes, and that is the only reason a 235B model fits at all. It answers at roughly 10 tokens per second, and both chips sit at just 74 degrees while sipping around 50 watts. Every token stays on the desk. Nothing touches a cloud, and nothing leaves the room. The ceiling for what you can run at home just jumped from 70B to 235B. Bookmark this & Watch it run ↓
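The memory claim in the post above can be checked with rough arithmetic. The sketch below estimates weight storage for a 235B-parameter model at a few precisions and compares it with the per-box share of the 256GB combined figure the post mentions. The post does not say which precision was used, so the bit widths are assumptions, and the estimate covers weights only (KV cache, activations, and system memory are ignored, so real headroom is smaller).

```typescript
// Weights-only estimate: parameters * bits per weight, converted to gigabytes.
function weightGigabytes(parameters: number, bitsPerWeight: number): number {
  return (parameters * bitsPerWeight) / 8 / 1e9;
}

const PARAMETERS = 235e9; // model size from the post
const COMBINED_GB = 256; // combined memory figure mentioned in the post
const PER_BOX_GB = COMBINED_GB / 2; // assumes the two units are identical

// Bit widths are assumptions; the post does not state the precision used.
const estimates = [16, 8, 4].map((bits) => {
  const weightsGb = weightGigabytes(PARAMETERS, bits);
  return {
    bits,
    weightsGb,
    fitsInOneBox: weightsGb <= PER_BOX_GB,
    fitsAcrossTwo: weightsGb <= COMBINED_GB,
  };
});

for (const e of estimates) {
  console.log(
    `${e.bits}-bit: ${e.weightsGb} GB of weights, ` +
      `one box: ${e.fitsInOneBox}, two boxes: ${e.fitsAcrossTwo}`
  );
}
```

Under these assumptions, 16-bit weights (470 GB) exceed even the combined memory, 8-bit weights (235 GB) only fit when split across both units, and 4-bit weights (117.5 GB) leave roughly 10 GB of a single box for everything else.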

leopardracer@leopardracer

x.com/i/article/2066…

37

29

270

60.6K

Small Harness@smallharness·12h

@mr_r0b0t Ohhh 👀

0

1

6

mr-r0b0t@mr_r0b0t·2d

This looks promising 😁

NVIDIA AI Infrastructure@NVIDIAAIInfra

📣 There's now a benchmark for agentic AI workloads. AgentPerf, from @ArtificialAnlys, is the industry's first open hardware benchmark that measures how many concurrent AI agents an inference system can support while hitting real-world performance targets. Here's what it measures — and what NVIDIA results show. 🧵

1

3

28

4K

Small Harness@smallharness·12h

@blankspeaker Excited about the impact harnesses can make going forward, hoping to make a small difference here.

0

23

️️️️ ️ᅠ‏️️️️ ️ᅠ️️️️ ️️️️️ ️ᅠ@blankspeaker·15h

Really loving how the upcoming 48 Hour view for X Analytics lets me adjust my scheduled post times to maximize engagement.

5

1

14

852

Small Harness@smallharness·12h

@morganlinton Very excited to release this in our small, open source harness, so people can use Fusion exactly how they want to.

0

1

136

Morgan@morganlinton·12h

Okay, officially too excited about Fusion from OpenRouter not to add a dedicated command for it directly to Small Harness. Don't wait for Anthropic to make Fable 5 available; get the same level of intelligence for half the cost. Now built into Small Harness. Small Harness is free and open source, so use it out of the box, or fork it and make it your own. Link to gh repo in first comment below.

OpenRouter@OpenRouter

Introducing the Fusion API, the smartest compound model in the market. Fusion achieves Fable-level intelligence at half the price. How it works 👇

16

10

172

26.9K

Small Harness@smallharness·12h

Another Sunday morning update to Small Harness. Small Harness 0.9.0 adds @OpenRouter Fusion. /fusion on for hard coding decisions. /fusion tool keeps your coding model in control and adds multi-model deliberation when needed. Tokens, cost, approvals, and session logs stay visible.

Morgan@morganlinton

Okay, officially too excited about Fusion from OpenRouter not to add a dedicated command for it directly to Small Harness. Don't wait for Anthropic to make Fable 5 available; get the same level of intelligence for half the cost. Now built into Small Harness. Small Harness is free and open source, so use it out of the box, or fork it and make it your own. Link to gh repo in first comment below.

0

3

467

Small Harness@smallharness·12h

@morganlinton Always be /ship(ing)

0

2

32

Morgan@morganlinton·14h

Very excited about this update, it solves a problem I constantly find myself running into with coding agents. Try it out.

Small Harness@smallharness

Small Harness v0.8.0 is here, now live on GitHub. This update adds /ship, a last-mile workflow for coding agents. It checks readiness, drafts the commit, creates guarded commits, pushes, opens a GitHub PR, and reports PR/CI status from the terminal. With most coding harnesses, you finish a change, then still have to ask: Did I run the right tests? Is my branch behind? Are there unstaged or untracked files? What should the commit message be? Did I accidentally include local junk? Did the push work? Is the PR open? Are CI checks green? /ship turns that last-mile checklist into one guided flow inside the same coding harness.

4

1

23

5.5K

Small Harness@smallharness·14h

@AbuKhadeejah @NousResearch Ty Arsalan 🙏

1

0

1

19

Arsalan Shaikh أرسلان@AbuKhadeejah·14h

Intriguing. @NousResearch are u watching?

Small Harness@smallharness

Small Harness v0.8.0 is here, now live on GitHub. This update adds /ship, a last-mile workflow for coding agents. It checks readiness, drafts the commit, creates guarded commits, pushes, opens a GitHub PR, and reports PR/CI status from the terminal. With most coding harnesses, you finish a change, then still have to ask: Did I run the right tests? Is my branch behind? Are there unstaged or untracked files? What should the commit message be? Did I accidentally include local junk? Did the push work? Is the PR open? Are CI checks green? /ship turns that last-mile checklist into one guided flow inside the same coding harness.

1

0

1

47

Small Harness@smallharness·15h

@dee_hw @morganlinton Yes, free and open source, so you can try it out, and fork it and make it your own if you want to! github.com/GetSmallAI/Sma…

0

1

13

Dee@dee_hw·15h

"My idea was to make something that gives greater transparency to everything a harness does" love this. is there a way to try it out? we're working on a "small device" that would be the perfect hardware for "small". i'm curious to see if we can integrate them together.

1

0

1

9

Dee@dee_hw·1d

Fable 5 was banned by the US government yesterday. It's time to build your own Personal AI Computer and run local models. So no one can ever cut you off. Here's how ↓

129

209

2.2K

281.5K

Small Harness@smallharness·15h

github.com/GetSmallAI/Sma…

0

175

Small Harness@smallharness·15h

Small Harness v0.8.0 is here, now live on GitHub. This update adds /ship, a last-mile workflow for coding agents. It checks readiness, drafts the commit, creates guarded commits, pushes, opens a GitHub PR, and reports PR/CI status from the terminal. With most coding harnesses, you finish a change, then still have to ask: Did I run the right tests? Is my branch behind? Are there unstaged or untracked files? What should the commit message be? Did I accidentally include local junk? Did the push work? Is the PR open? Are CI checks green? /ship turns that last-mile checklist into one guided flow inside the same coding harness.
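The readiness questions in the post above can be thought of as a function from repository state to a list of blockers. The TypeScript below is a minimal sketch of that idea only: the `RepoState` fields and the `readinessBlockers` helper are hypothetical names invented here, and nothing in it reflects how /ship is actually implemented.

```typescript
// Hypothetical snapshot of repository state; field names are assumptions.
interface RepoState {
  testsPassed: boolean; // "Did I run the right tests?"
  commitsBehindBase: number; // "Is my branch behind?"
  unstagedFiles: string[]; // "Are there unstaged ... files?"
  untrackedFiles: string[]; // "... or untracked files?" / local junk
}

// Turn the checklist into explicit, human-readable blockers.
// An empty result means nothing in this sketch stands in the way of shipping.
function readinessBlockers(state: RepoState): string[] {
  const blockers: string[] = [];
  if (!state.testsPassed) {
    blockers.push("tests have not passed");
  }
  if (state.commitsBehindBase > 0) {
    blockers.push(`branch is ${state.commitsBehindBase} commit(s) behind its base`);
  }
  if (state.unstagedFiles.length > 0) {
    blockers.push(`unstaged changes: ${state.unstagedFiles.join(", ")}`);
  }
  if (state.untrackedFiles.length > 0) {
    blockers.push(`untracked files: ${state.untrackedFiles.join(", ")}`);
  }
  return blockers;
}

// Example: tests are green, but the branch is behind and a stray log file exists.
const blockers = readinessBlockers({
  testsPassed: true,
  commitsBehindBase: 2,
  unstagedFiles: [],
  untrackedFiles: ["debug.log"],
});
console.log(blockers);
```

Returning every blocker at once, rather than stopping at the first, matches the post's framing of a single guided checklist instead of a sequence of separate questions.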

2

1

11

5.2K

Small Harness@smallharness·15h

@NetworkChuck I think we are at the tipping point.

0

38

NetworkChuck@NetworkChuck·1d

Local models!! This is where the focus needs to be.

64

44

608

50.4K

Small Harness@smallharness·15h

@0xSero Do you think a year from now $5k will be the price point?

0

8

0xSero@0xSero·2d

It currently costs around $50,000 to run frontier intelligence at home at really usable speeds and high concurrency. This is half the price it was 3 months ago. We are learning to pack more smarts in less space, so Qwen/Gemma lead the way in that regard. Realistic data soon.

31

13

312

14.6K

Small Harness@smallharness·15h

@LottoLabs Makes sense, but I was hoping the number would be lower 😳

0

5

Lotto@LottoLabs·1d

@getsmallai 100kish with 8x6000pro, sticking with Nvidia. Haven’t thought about AMD. I’m not sure how well supported Kimi is on AMD or other chips.

1

0

1

94

Lotto@LottoLabs·2d

The goal is to get enough hardware to run Kimi k2.7 at home. At that point you're really just limited by yourself and your setup.

25

4

194

15.6K

Small Harness@smallharness·15h

Hey Dee, right on 🤘 Oh and this is @morganlinton btw, I run this account, so it's not some random company account, just me tweeting about this on here. Small Harness is a small(ish) bare bones harness. My idea was to make something that gives greater transparency to everything a harness does, while making it easy to work with both local models and frontier models together. You can see exact tokens in and out per turn, cost, and turn on /verbose to see what tool calls are being made, etc. Just about to announce a new feature I'm super excited about too, but I won't give that away until the official tweet!
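As a rough illustration of the per-turn visibility described above (exact tokens in and out, plus cost), here is a minimal TypeScript sketch of how a turn's cost could be derived from token counts. The `TurnUsage` and `Pricing` types, the `turnCost` helper, and the prices are all assumptions for illustration; they are not taken from Small Harness or from any provider's price list.

```typescript
// Token counts for a single turn.
interface TurnUsage {
  inputTokens: number;
  outputTokens: number;
}

// Placeholder prices in USD per million tokens; not real provider pricing.
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Cost of one turn: each token count times its per-million rate.
function turnCost(usage: TurnUsage, pricing: Pricing): number {
  return (
    (usage.inputTokens * pricing.inputPerMillion +
      usage.outputTokens * pricing.outputPerMillion) /
    1_000_000
  );
}

// Example turn: 2,000 tokens in, 500 tokens out, at made-up rates.
const usage: TurnUsage = { inputTokens: 2000, outputTokens: 500 };
const pricing: Pricing = { inputPerMillion: 2, outputPerMillion: 8 };
const cost = turnCost(usage, pricing);

console.log(
  `in: ${usage.inputTokens}  out: ${usage.outputTokens}  cost: $${cost.toFixed(4)}`
);
```

With these made-up rates the example turn costs $0.0080. Local models would simply use zero for both rates, which keeps one code path for local and hosted models.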

1

0

2

17