Sol Irvine

8.2K posts

@solirvine

mostly harmless • building https://t.co/5W0eJiJlRS and https://t.co/43ZMkKkMRa

kyoto, japan · Joined May 2010
1.4K Following · 922 Followers
Pinned Tweet
Sol Irvine
Sol Irvine@solirvine·
東本願寺 (Higashi Hongan-ji)
Sol Irvine tweet media
Japanese
0
0
3
1.4K
Sol Irvine
Sol Irvine@solirvine·
Anthropic and OpenAI already built all these features for coders. Their only remaining obstacle is porting those features to the (horrible, awful, broken) file formats, docx and pdf.
English
0
0
0
25
Sol Irvine
Sol Irvine@solirvine·
What's still missing:
- Plan, refine, implement loops
- Delegation to task-oriented agents
- Inline human-in-the-loop approvals & clarifications
- Branching/forking, versioning
- Structured memory at the project/user/client/practice/firm levels
- QA reviews, red-teaming, linting
English
1
0
0
33
Sol Irvine
Sol Irvine@solirvine·
For me, the revelation from Mike is how *thin* the underlying product is.
1. Chat = chat with your docs.
2. Project = folder-scoped chat.
3. Tabular Review = projects x prompts.
4. Workflow = prompts.
Credit to @willchen500 for packaging and framing it brilliantly.
English
1
0
0
82
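To make "Tabular Review = projects x prompts" concrete, here is a minimal sketch that reads it as a grid: one row per document in a project, one column per prompt. The names `tabular_review`, `ask`, and `fake_ask` are hypothetical, and `fake_ask` only stands in for a real model call; nothing here is taken from the product being discussed.

```python
from typing import Callable, Dict, List


def tabular_review(
    documents: Dict[str, str],
    prompts: Dict[str, str],
    ask: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Build one row per document and one column per prompt."""
    rows = []
    for doc_name, doc_text in documents.items():
        row = {"document": doc_name}
        for column, prompt in prompts.items():
            # Each cell is one call scoped to a single document and a single prompt.
            row[column] = ask(doc_text, prompt)
        rows.append(row)
    return rows


def fake_ask(doc_text: str, prompt: str) -> str:
    # Assumption: placeholder for a model call; it only reports what it received.
    return f"{prompt} -> {len(doc_text)} chars"
```

Swapping `fake_ask` for a real client call is the only change needed to fill the grid with model answers, which is roughly the "thin" point the tweet makes.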
Sol Irvine
Sol Irvine@solirvine·
@T1000_V2 Works very well for my purposes, but it depends on the models you use, the prompts you give them, and the type of contract. Using GPT 5.4 mini on typical commercial contracts, I'm impressed.
English
0
0
0
14
Sol Irvine
Sol Irvine@solirvine·
My new app wargame.esq pits two agents against each other in a contract negotiation. Each agent reviews the contract. They assemble a shared issues list. Then they negotiate each point, showing their internal reasoning and back-and-forth in real time.
Sol Irvine tweet media
English
30
40
558
184.1K
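The flow described in the tweet above (each agent reviews the contract, the two lists are merged into a shared issues list, then each point is negotiated in turn) might be sketched roughly as follows. This is not the app's code: `Agent`, `shared_issues`, `negotiate`, and the `rounds` parameter are hypothetical names, and the agents are plain callables standing in for model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an agent takes (issue, transcript so far) and returns its next message.
Agent = Callable[[str, List[str]], str]


@dataclass
class Negotiation:
    issues: List[str]
    transcript: Dict[str, List[str]] = field(default_factory=dict)


def shared_issues(list_a: List[str], list_b: List[str]) -> List[str]:
    """Merge both sides' issue lists, dropping duplicates but keeping order."""
    seen, merged = set(), []
    for issue in list_a + list_b:
        key = issue.strip().lower()
        if key not in seen:
            seen.add(key)
            merged.append(issue.strip())
    return merged


def negotiate(issues: List[str], agent_a: Agent, agent_b: Agent, rounds: int = 2) -> Negotiation:
    """Negotiate each issue in order, alternating A and B for a fixed number of rounds."""
    result = Negotiation(issues=list(issues))
    for issue in issues:
        log: List[str] = []
        for _ in range(rounds):
            log.append("A: " + agent_a(issue, log))
            log.append("B: " + agent_b(issue, log))
        result.transcript[issue] = log
    return result
```

Keeping a separate log per issue makes the back-and-forth easy to show point by point, which matches the "internal reasoning and back-and-forth in real time" framing, though the real app may structure it differently.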
Sol Irvine
Sol Irvine@solirvine·
@afterlanie The agents already adopt the side they represent and pursue its interests. Before the negotiation starts, you can steer them in terms of their approach, e.g., "be conciliatory, but concede anything that will be onerous or unreasonably limiting for us".
English
0
0
0
17
Lāniē
Lāniē@afterlanie·
@solirvine have you experimented with each agent taking separate tactics to compare the outcome of the contract?
English
1
0
1
150
Sol Irvine
Sol Irvine@solirvine·
@aref_vc Most routine commercial contracts are < 100 pages. The app only negotiates one contract at a time, and uses a markdown extraction, so context windows aren't really an issue. Newer frontier models do a good job of caching, which keeps costs reasonable, too.
English
0
0
1
16
❈Aref❈
❈Aref❈@aref_vc·
Love it. I'm wondering about performance and speed of execution, especially context coherence when taking in dozens, if not hundreds, of pages at once, including parsing capabilities and staying faithful to the guardrails meant to remove any possible hallucination. I've seen a lot of variability between frontier models, both commercial and open-source. At least when running this locally, you can see the trend and the requirements for handling a lot of edge cases, especially in fintech, legal, real estate legal, etc. Those are areas where a mix of expert-type models would bring a bit more depth; if you choose a more generalist model, things could be trickier.
English
1
0
0
21
Sol Irvine
Sol Irvine@solirvine·
@aref_vc I have another product augustus.esq that can evaluate the negotiated draft from a neutral perspective. I did consider integrating a neutral pass at the end, but never got around to it. I like the idea of establishing some persistence/reputation for the agent over time.
English
1
0
0
114
❈Aref❈
❈Aref❈@aref_vc·
This is awesome, with the council framework to push back, etc. What's the final line of judgment? Who confirms: the end user, or is there an LLM as a judge on top of it? I ran a few experiments on this recently. Instead of a council, we use a bidding approach where the LLMs bid based on their confidence in solving or winning the negotiation. A reputation token and a budget are loaded into the conversation. If you win the negotiation, you get more points and more reputation, which helps you win more deals. An LLM as a judge comes in later to cover these. It could operate at the clause level or at the full agreement level. Each angle has pros and cons in terms of where it fits best and where it has the most impact.
English
1
0
1
151
Sol Irvine
Sol Irvine@solirvine·
@futureproof_amy To me, this tool is a more robust expression of the analysis that I do when confronted with a contract. Anticipate the issues and arguments, get a sense of a reasonable middle ground, etc.
English
1
0
0
79
Sol Irvine
Sol Irvine@solirvine·
@futureproof_amy "Law is not to be gamified." To me, the AI-generated output is only useful as a benchmark for:
- Which issues are raised.
- The arguments in both directions.
- Which compromises are reached.
- The rationale for concessions, etc.
English
2
0
3
462
闵魁偲
闵魁偲@cmiller11101·
@solirvine It seems this doesn't work yet? All I see is your landing page showing a brief text intro.
闵魁偲 tweet media
English
1
0
1
424
Sol Irvine
Sol Irvine@solirvine·
@p_dove Yes, we have a (very rudimentary) version of this.
Sol Irvine tweet media
English
1
0
3
349
Paloma A.
Paloma A.@p_dove·
This is so interesting, would be especially helpful for a negotiation against a new/unknown counterparty. For known counterparties (or at least, people your colleagues have told you about) it'd be cool to be able to input their well-known bugaboos to see how it hashes out with some tuning.
English
1
0
3
745
Sol Irvine
Sol Irvine@solirvine·
@BaricJohnpaul I built it for myself initially, but given the response I'll release something in the next week or so.
English
0
0
1
109
Sol Irvine
Sol Irvine@solirvine·
@horadrimsage I know you're kidding, but we did have to implement an (adjustable) cap on turns to protect against cycles/digressions. Depending on which models you use, it might cost you some fraction of a junior associate's hourly rate. ;)
English
0
0
1
216
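An adjustable cap on turns, with a basic guard against cycles, could look something like the sketch below. The function name `run_with_turn_cap`, the `step` callback, and the exact-repeat test for a cycle are all assumptions made for illustration; the tweet does not say how the app detects cycles or digressions.

```python
from typing import Callable, List, Optional, Tuple


def run_with_turn_cap(
    step: Callable[[int], Optional[str]],
    max_turns: int = 6,
) -> Tuple[List[str], str]:
    """Call step(turn) until it settles, repeats itself, or hits the cap."""
    seen = set()
    messages: List[str] = []
    for turn in range(max_turns):
        msg = step(turn)
        if msg is None:
            # Assumption: None signals that the parties reached agreement.
            return messages, "settled"
        if msg in seen:
            # An exact repeat is treated as a cycle; a real check would be fuzzier.
            return messages, "cycle"
        seen.add(msg)
        messages.append(msg)
    return messages, "cap_reached"
```

Because every turn is a paid model call, `max_turns` also acts as a rough cost ceiling per issue.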
Drew Jenkins
Drew Jenkins@horadrimsage·
@solirvine Do you have a NY biglaw mode where the models just keep redlining with the comment "this is standard" and billing +2k/hr for weeks straight?
English
1
0
10
286
Sol Irvine
Sol Irvine@solirvine·
@flseeh There’s a short interview at the start. Which party(ies) do you represent? Whose draft is it? Open-ended inputs for: (1) your goals, constraints, etc. and (2) context about the counterparty—e.g., big company, inflexible, needs revenue, etc.
English
0
0
1
1.4K
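One way to picture the interview described above is as a small briefing record that gets rendered into each agent's instructions. The `Briefing` class, its field names, and the prompt wording are hypothetical; only the four questions come from the tweet.

```python
from dataclasses import dataclass


@dataclass
class Briefing:
    represented_party: str           # Which party(ies) do you represent?
    draft_owner: str                 # Whose draft is it?
    goals_and_constraints: str = ""  # Open-ended: goals, constraints, red lines
    counterparty_context: str = ""   # Open-ended: e.g. big company, inflexible

    def to_prompt(self) -> str:
        """Render the briefing as plain text for an agent's instructions."""
        lines = [
            f"You represent: {self.represented_party}",
            f"The draft was prepared by: {self.draft_owner}",
        ]
        # Optional free-text answers are included only when the user filled them in.
        if self.goals_and_constraints:
            lines.append(f"Goals and constraints: {self.goals_and_constraints}")
        if self.counterparty_context:
            lines.append(f"Counterparty context: {self.counterparty_context}")
        return "\n".join(lines)
```

A red line would simply go into `goals_and_constraints` as free text, for example "cap liability at 12 months of fees".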
Florian Seeh
Florian Seeh@flseeh·
@solirvine Super interesting! How does the briefing of agents work? How would they know what’s my red line? Thanks for sharing your thoughts :)
English
1
0
1
1.7K
Sol Irvine
Sol Irvine@solirvine·
@kourouklides
1. The transcript of the negotiation is a very good roadmap for what to expect in the real world. Not only which issues are likely to emerge, but also the arguments in both directions.
2. The redline is helpful for baselining a compromise.
3. Memo is a useful executive summary.
English
1
0
4
973
Sol Irvine
Sol Irvine@solirvine·
@gavinyerxa I haven’t seen that issue in my tests. I find the edits are targeted well with anything above Haiku/GPT5.4-mini. Where does this happen for you?
English
0
0
0
2.1K
gavin yerxa
gavin yerxa@gavinyerxa·
@solirvine Very cool! Have you been able to get the agents to avoid the over-editing problem (where instead of making a one-word change they replace an entire paragraph)? Does having two agents negotiate against each other help there?
English
1
0
2
2.8K
Sol Irvine
Sol Irvine@solirvine·
@originalmagneto Using OpenAI or Anthropic via API. I view the API as just another SaaS with access to sensitive files. You?
English
1
0
3
3K
Majo
Majo@originalmagneto·
@solirvine What's the model? Is it using an API? What's your stance on the CLOUD Act, GDPR, data residency, and ZDR?
English
1
0
1
3.4K
Sol Irvine
Sol Irvine@solirvine·
@andrewarruda Thanks, Andrew! Still early going for this app, but it's already much more useful than the standard "chat with your docs" template.
English
1
0
2
64