Boundary

117 posts

@boundaryML

build unbreakable agents with BAML 🧱. What TypeScript did for JS, BAML does for AI. 🧪 Try it: https://t.co/9bobLt1ObL

Joined January 2023
30 Following · 1.4K Followers
Boundary
Boundary@boundaryML·
marginally manageable is the highest form of flattery
fog bear the psycho dj@djpsychofogbear

@Jonathan_Blow 5/x: and then use @boundaryML 's BAML for chunking things out and testing / iterating quickly. fuck llms and prompts etc but if you have to work with them BAML is like the only thing that makes it feel marginally manageable.

0
0
0
224
Boundary retweeted
Aaron Villalpando
Aaron Villalpando@aaronvi·
We are launching a programming language built for agents soon called BAML that has been in the making for 1+ years. You can follow @boundaryML. We are a small team of 6 developing it with care, gathering feedback from humans and agents. Fully Open Source. If you are interested, DM me for early alpha access.
Chris Tate@ctatedev

Introducing Zero: the programming language for agents. I wanted a systems language that was faster, smaller, and easier for agents to use and repair. Explicit capabilities. JSON diagnostics. Typed safe fixes. Made for agents on day zero.

2
2
13
1.4K
Boundary retweeted
Isaac Kargar
Isaac Kargar@kargarisaac·
I benchmarked a new extraction harness on a private eval dataset for lerim-cli (new version is out now - v0.1.83) and the main lesson was very clear: if you want smaller models to work well, you should stop asking the model to do everything and start doing more engineering work.

Before, the agent was closer to a single-pass PydanticAI setup: read a large trace, understand what matters, decide what is durable memory, call tools correctly, stay inside the context window, and output clean structured records. That puts too much burden on the model, especially when you want to use smaller or cheaper models.

The new harness is BAML (@boundaryML) + LangGraph (@LangChain). The graph now does more of the deterministic work:
- read the trace in windows
- ask the model to scan one window at a time
- keep compact findings instead of the whole trace
- synthesize memory records only at the end
- validate/retry typed BAML outputs
- persist with normal code, not model improvisation

So the model is not the whole agent anymore -> It is one reasoning component inside a more engineered system.

On the private benchmark, using the same MiniMax M2.7 model, the new harness completed all cases while the old harness had multiple failures from tool retries and context window issues.
- Task completion: BAML+LangGraph completed 100.0% vs PydanticAI at 72.73%, a +27.27 point lead.
- Case failures: BAML+LangGraph had 0 failures vs PydanticAI with 6, meaning 6 fewer failures.
- Episode count rate: BAML+LangGraph reached 100.0% vs PydanticAI at 81.25%, a +18.75 point lead.
- Record budget rate: BAML+LangGraph reached 46.88% vs PydanticAI at 28.12%, a +18.76 point lead.
- Concept recall average: BAML+LangGraph scored 0.428 vs PydanticAI at 0.2598, a +0.1682 improvement.
- Quality average: BAML+LangGraph scored 0.3352 vs PydanticAI at 0.318, a +0.0172 improvement.
- Tool call errors average: BAML+LangGraph had 0.0625 vs PydanticAI at 1.9688, much better.

Quality is not solved yet. It is only slightly better overall and still needs better pruning before persistence. But robustness improved a lot.

This is the direction I think specialized agents should go: smaller models, more deterministic scaffolding, less magical thinking about one giant prompt doing the whole job. Next step is to make this work well with models people can run locally.

A new version of Lerim-cli is now released with the extract agent refactored to use Langgraph+BAML. Next agents will be refactored as well soon in the next releases. github.com/lerim-dev/leri…
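The windowed scan, compact findings, final synthesis, and validate/retry steps described in the post can be sketched in plain Python. This is a rough illustration only, not the lerim-cli implementation and not the BAML or LangGraph API: every name here (`Finding`, `windows`, `call_with_retry`, `extract`, `make_flaky_model`) is hypothetical, the "model" is a stub callable, and validation is reduced to a non-empty check standing in for typed output validation.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class Finding:
    """Compact note kept per window instead of the raw trace text (hypothetical type)."""
    window_index: int
    summary: str


def windows(trace: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive, non-overlapping windows of the trace."""
    for start in range(0, len(trace), size):
        yield trace[start:start + size]


def call_with_retry(model: Callable[[str], str], prompt: str,
                    validate: Callable[[str], bool], max_attempts: int = 3) -> str:
    """Call the model until its output passes validation or attempts run out."""
    for _ in range(max_attempts):
        output = model(prompt)
        if validate(output):
            return output
    raise ValueError(f"no valid output after {max_attempts} attempts")


def non_empty(text: str) -> bool:
    """Stand-in validator; a real harness would check a typed schema."""
    return bool(text.strip())


def extract(trace: List[str], model: Callable[[str], str],
            window_size: int = 4) -> Tuple[List[Finding], str]:
    """Scan one window at a time, keep compact findings, synthesize once at the end."""
    findings: List[Finding] = []
    for i, window in enumerate(windows(trace, window_size)):
        prompt = "SCAN\n" + "\n".join(window)
        summary = call_with_retry(model, prompt, non_empty)
        findings.append(Finding(i, summary.strip()))
    # Only the compact findings, not the full trace, reach the final call.
    synth_prompt = "SYNTHESIZE\n" + "\n".join(f.summary for f in findings)
    record = call_with_retry(model, synth_prompt, non_empty)
    return findings, record.strip()


def make_flaky_model():
    """Stub model: fails validation on its first call, then echoes a tiny summary."""
    calls = {"n": 0}

    def model(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            return ""  # simulated invalid output, forces one retry
        lines = prompt.split("\n")
        return f"{lines[0].lower()}:{len(lines) - 1}"

    return model, calls
```

Calling `extract(trace, model, window_size=4)` with the stub from `make_flaky_model()` exercises the retry path once and then proceeds window by window. The point of the design is that windowing, retry limits, and the final synthesis step live in ordinary code, so the model only ever sees a bounded prompt.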
Isaac Kargar tweet media
1
1
5
294
Boundary retweeted
Anish Palakurthi
Anish Palakurthi@anishpalakurT·
Announcing the BAML Bounty... For all power-users of BAML, we're giving away free BAML merch! (t-shirts, stickers, hoodies 🔥🧯). Share what you built with BAML with #baml → Fill out tally.so/r/PdErze → Free merch! Hurry! Supplies are limited to the first 50 posts.
Anish Palakurthi tweet media
0
5
6
859
Boundary
Boundary@boundaryML·
What's your favorite language feature and why is it match?
5
0
1
324
Boundary retweeted
Anish Palakurthi
Anish Palakurthi@anishpalakurT·
Looking for designers who have Blender experience! Will pay ~$500 for a single asset
49
2
61
3.2K
Boundary
Boundary@boundaryML·
RT @anishpalakurT: Hiring an absolutely cracked video editor for something big @boundaryML... DM me and follow so I see it! 👇
0
1
0
170
Boundary retweeted
Anish Palakurthi
Anish Palakurthi@anishpalakurT·
If you read papers like this for fun you'll fit right in @boundaryML Join us, we're growing
Anish Palakurthi tweet media
0
1
7
459
Boundary retweeted
Vaibhav Gupta
Vaibhav Gupta@vaibcode·
syntax makes a huge difference to how good coding agents are, and languages should be rethought. some great learnings from rust and go! great post by @aaronvi
Vaibhav Gupta tweet media
2
1
10
1.3K
Boundary
Boundary@boundaryML·
@cursor_ai 30% agent-written PRs is crazy. We're betting on the same future of self-driving codebases. Gardens you tend > repos you commit to. As agents write more, we think they need a more cooperative language. That's why we're making one 🔥 github.com/BoundaryML/baml
0
0
0
129
Cursor
Cursor@cursor_ai·
Cursor now shows you demos, not diffs. Agents can use the software they build and send you videos of their work.
402
597
8.4K
4.1M
Boundary
Boundary@boundaryML·
Hot take: building agents without BAML in 2026 is like training models without Jupyter notebooks in 2020. Technically possible? Sure. But why would you torture yourself? Type-safe prompts. Instant playground testing. Multi-language support. promptfiddle.com
Boundary tweet media
0
0
7
479
Boundary
Boundary@boundaryML·
@mattpocockuk AI in production is hard! #2, #4, and #6 resonate. It changed our brains too, so we built a programming language ;)
0
0
1
48
Matt Pocock
Matt Pocock@mattpocockuk·
I have been at 100% AI-contributed code for a few months now. Here are 9 ways it's changed my brain:
1. WAY more time thinking about integration testing
2. Friction via pre-commit hooks/CI/strong types is now super desirable
3. AI has no taste for UI, prototype extremely aggressively before committing to a PRD
4. AI has no taste for software architecture, be extremely explicit about the modules you want and think about their interfaces
5. Deep, grey-box modules with simple interfaces are the KING - let AI control what's in the box to decrease your cognitive load
6. Huge reliance on Effect.ts for dependency injection and strongly typed errors
7. Much more meta-programming, turning my skills into repeatable SOP's
8. First I thought MOAR DOCS = BETTER but 'doc rot' is real. Better to just let the AI generate JIT docs during plan mode.
9. Much higher cognitive load to keep up with the changes the LLM makes to the codebase
72
63
1.1K
69.3K
Boundary
Boundary@boundaryML·
@StriftCodes @vercel thanks for this! if you're ok with this, we may point our docs to this as well!
3
0
1
43