Jack Brown

51 posts

@jackbr513

automating testing @getlark (YC S25). prev @stripe. climb with me https://t.co/fN5BghnSes

San Francisco · Joined August 2025
33 Following · 68 Followers
Jack Brown reposted
Vijit Dhingra @VijitDhingra1
We just shipped QA Reports. Deploy AI agents to test your product, uncover issues, and get back a structured report with reproducible test cases. We ran it on the Vercel Sandboxes API. It mapped test areas, executed flows, and surfaced real issues in minutes.
Try it: getlark.ai
Example report (Vercel Sandboxes API): tests.getlark.ai/a/kiwi/qa-repo…
2 replies · 2 reposts · 9 likes · 154 views
Jack Brown @jackbr513
@bcherny We're launching bug fixes as features now?
0 replies · 0 reposts · 0 likes · 375 views
Boris Cherny @bcherny
Today we're excited to announce NO_FLICKER mode for Claude Code in the terminal. It uses an experimental new renderer that we're excited about. The renderer is early and has tradeoffs, but already we've found that most internal users prefer it over the old renderer. It also supports mouse events (yes, in a terminal).
Try it: CLAUDE_CODE_NO_FLICKER=1 claude
Curt Tigges @CurtTigges

@bcherny @UltraLinx please at least fix the uncontrollable scrolling/flickering before the next 3000 features

637 replies · 680 reposts · 10.1K likes · 2.6M views
Tenobrus @tenobrus
@Jason shut the fuck up jason you are culpable in all of this. stand by your staggering fucking retardation in pretending this administration was ever going to be capable of anything but setting fire to the country
Tenobrus tweet media
4 replies · 0 reposts · 366 likes · 3.7K views
Jack Brown @jackbr513
The leaked claude code src has a real comment about optimizing launch timing for "sustained Twitter buzz" while being "gentler on soul-gen load"
Jack Brown tweet media
0 replies · 0 reposts · 1 like · 50 views
Rivet @rivet_dev
Say hello to agentOS (beta)
A portable open-source OS built just for agents. Powered by WASM & V8 isolates.
🔗 Embedded in your backend
⚡ ~6ms coldstarts, 32x cheaper than sbxs
📁 Mount anything as a file system (S3, SQLite, …)
🥧 Use Pi, Claude Code/Codex/Amp/OpenCode soon
58 replies · 72 reposts · 1.1K likes · 251.4K views
Jake @JustJake
@Gabe_Kauffman First of all, deepest apologies. We've been scaling insanely through demand; I had half a mind to literally just *close signups*, growth is so high. Our whole next quarter is "Making the experience better": faster deploys, more reliability, etc.
6 replies · 0 reposts · 41 likes · 6.8K views
Jake @JustJake
Today we had an issue affecting ~3000 users, where their authenticated content may have been served to unauthenticated users. Below is our writeup on impact, resolution, and prevention. We're deeply sorry. This is unacceptable and we will do better. blog.railway.com/p/incident-rep…
64 replies · 28 reposts · 561 likes · 100.8K views
Jack Brown @jackbr513
We just launched Repairs at Lark. Ship a UI change. End-to-end tests break. Lark's QA agents automatically fix them. No more test maintenance bottlenecks. Just turn on auto-repairs and keep shipping. Try it → getlark.ai
0 replies · 3 reposts · 3 likes · 180 views
Jack Brown @jackbr513
@zeeg Thanks for speaking up. SF holds an immense amount of political and financial power. We can fight back against this if we stand together, and that includes the tech industry.
0 replies · 0 reposts · 2 likes · 171 views
Jack Brown @jackbr513
@DanielLurie "We have no reason to believe there is broader federal immigration enforcement at SFO" So are you removing ICE and protecting your constituents or are you going to keep your head in the sand?
8 replies · 0 reposts · 11 likes · 1.3K views
Daniel Lurie 丹尼爾·羅偉 @DanielLurie
Like many San Franciscans, I found the incident at SFO last night upsetting. I have spoken to leaders at SFO and SFPD, and we believe this is an isolated incident. We have no reason to believe there is broader federal immigration enforcement at SFO. SFPD officers remained at the scene to maintain public safety and were not involved in the incident. Under our city's longstanding policies, local law enforcement does not participate in federal civil immigration enforcement. Those policies keep us safe and will not change as long as I'm mayor.
San Francisco Police @SFPD

SFPD Statement on Sunday’s Incident at SFO

283 replies · 29 reposts · 252 likes · 204K views
erin griffith @eringriffith
A detailed and brutal look at the tactics of buzzy AI compliance startup Delve "Delve built a machine designed to make clients complicit without their knowledge, to manufacture plausible deniability while producing exactly the opposite." substack.com/home/post/p-19…
202 replies · 408 reposts · 4.7K likes · 4.5M views
Aakash Gupta @aakashgupta
Cursor is raising at a $50 billion valuation on the claim that its “in-house models generate more code than almost any other LLMs in the world.” Less than 24 hours after launching Composer 2, a developer found the model ID in the API response: kimi-k2p5-rl-0317-s515-fast. That’s Moonshot AI’s Kimi K2.5 with reinforcement learning appended.

A developer named Fynn was testing Cursor’s OpenAI-compatible base URL when the identifier leaked through the response headers. Moonshot’s head of pretraining, Yulun Du, confirmed on X that the tokenizer is identical to Kimi’s and questioned Cursor’s license compliance. Two other Moonshot employees posted confirmations. All three posts have since been deleted.

This is the second time. When Cursor launched Composer 1 in October 2025, users across multiple countries reported the model spontaneously switching its inner monologue to Chinese mid-session. Kenneth Auchenberg, a partner at Alley Corp, posted a screenshot calling it a smoking gun. KR-Asia and 36Kr confirmed both Cursor and Windsurf were running fine-tuned Chinese open-weight models underneath. Cursor never disclosed what Composer 1 was built on. They shipped Composer 1.5 in February and moved on.

The pattern: take a Chinese open-weight model, run RL on coding tasks, ship it as a proprietary breakthrough, publish a cost-performance chart comparing yourself against Opus 4.6 and GPT-5.4 without disclosing that your base model was free, then raise another round.

That chart from the Composer 2 announcement deserves its own paragraph. Cursor plotted Composer 2 against frontier models on a price-vs-quality axis to argue they’d hit a superior tradeoff. What the chart doesn’t show is that Anthropic and OpenAI trained their models from scratch. Cursor took an open-weight model that Moonshot spent hundreds of millions developing, ran RL on top, and presented the output as evidence of in-house research. That’s margin arbitrage on someone else’s R&D dressed up as a benchmark slide.

The license makes this more than an attribution oversight. Kimi K2.5 ships under a Modified MIT License with one clause designed for exactly this scenario: if your product exceeds $20 million in monthly revenue, you must prominently display “Kimi K2.5” on the user interface. Cursor’s ARR crossed $2 billion in February. That’s roughly $167 million per month, 8x the threshold. The clause covers derivative works explicitly.

Cursor is valued at $29.3 billion and raising at $50 billion. Moonshot’s last reported valuation was $4.3 billion. The company worth 12x more took the smaller company’s model and shipped it as proprietary technology to justify a valuation built on the frontier lab narrative.

Three Composer releases in five months. Composer 1 caught speaking Chinese. Composer 2 caught with a Kimi model ID in the API. A P0 incident this year. And a benchmark chart that compares an RL fine-tune against models requiring billions in training compute without disclosing the base was free.

The question for investors in the $50 billion round: what exactly are you buying? A VS Code fork with strong distribution, or a frontier research lab? The model ID in the API answers that.

If Moonshot doesn’t enforce this license against a company generating $2 billion annually from a derivative of their model, the attribution clause becomes decoration for every future open-weight release. Every AI lab watching this is running the same math: why open-source your model if companies with better distribution can strip attribution, call it proprietary, and raise at 12x your valuation?

kimi-k2p5-rl-0317-s515-fast is the most expensive model ID leak in the history of AI licensing.
Harveen Singh Chadha @HarveenChadha

things are about to get interesting from here on

247 replies · 549 reposts · 4.4K likes · 1.4M views
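The license-threshold arithmetic in the post above is easy to check. A minimal sketch using only the figures quoted in the thread ($2B reported ARR, $20M/month attribution trigger in the Modified MIT License); the numbers come from the post, not independent verification:

```python
# License-threshold math from the thread above. Figures are the ones
# quoted in the post, not independently verified.
ANNUAL_ARR = 2_000_000_000    # Cursor's reported ARR
MONTHLY_TRIGGER = 20_000_000  # Modified MIT attribution clause threshold

monthly_revenue = ANNUAL_ARR / 12
print(f"monthly revenue: ${monthly_revenue / 1e6:.0f}M")                   # $167M
print(f"multiple of threshold: {monthly_revenue / MONTHLY_TRIGGER:.1f}x")  # 8.3x
print("attribution required:", monthly_revenue > MONTHLY_TRIGGER)          # True
```

Which matches the thread's "roughly $167 million per month, 8x the threshold" claim.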
Celsius 233 @Celsius233Books
@eringriffith “Their “US-based auditors” are Indian certification mills” - this is all the US based service companies (startup or not)
2 replies · 0 reposts · 31 likes · 7.2K views
Ryan @ohryansbelt
Delve, a YC-backed compliance startup that raised $32 million, has been accused of systematically faking SOC 2, ISO 27001, HIPAA, and GDPR compliance reports for hundreds of clients. According to a detailed Substack investigation by DeepDelver, a leaked Google spreadsheet containing links to hundreds of confidential draft audit reports revealed that Delve generates auditor conclusions before any auditor reviews evidence, uses the same template across 99.8% of reports, and relies on Indian certification mills operating through empty US shells instead of the "US-based CPA firms" they advertise.

Here's the breakdown:

> 493 out of 494 leaked SOC 2 reports allegedly contain identical boilerplate text, including the same grammatical errors and nonsensical sentences, with only a company name, logo, org chart, and signature swapped in
> Auditor conclusions and test procedures are reportedly pre-written in draft reports before clients even provide their company description, which would violate AICPA independence rules requiring auditors to independently design tests and form conclusions
> All 259 Type II reports claim zero security incidents, zero personnel changes, zero customer terminations, and zero cyber incidents during the observation period, with identical "unable to test" conclusions across every client
> Delve's "US-based auditors" are actually Accorp and Gradient, described as Indian certification mills operating through US shell entities. 99%+ of clients reportedly went through one of these two firms over the past 6 months
> The platform allegedly publishes fully populated trust pages claiming vulnerability scanning, pentesting, and data recovery simulations before any compliance work has been done
> Delve pre-fabricates board meeting minutes, risk assessments, security incident simulations, and employee evidence that clients can adopt with a single click, according to the author
> Most "integrations" are just containers for manual screenshots with no actual API connections. The author describes the platform as a "SOC 2 template pack with a thin SaaS wrapper"
> When the leak was exposed, CEO Karun Kaushik emailed clients calling the allegations "falsified claims" from an "AI-generated email" and stated no sensitive data was accessed, while the reports themselves contained private signatures and confidential architecture diagrams
> Companies relying on these reports could face criminal liability under HIPAA and fines up to 4% of global revenue under GDPR for compliance violations they believed were resolved
> When clients threaten to leave, Delve reportedly pairs them with an external vCISO for manual off-platform work, which the author argues proves their own platform can't deliver real compliance
> Delve's sales price dropped from $15,000 to $6,000 with ISO 27001 and a penetration test thrown in when a client mentioned considering a competitor
Ryan tweet media
erin griffith @eringriffith

A detailed and brutal look at the tactics of buzzy AI compliance startup Delve "Delve built a machine designed to make clients complicit without their knowledge, to manufacture plausible deniability while producing exactly the opposite." substack.com/home/post/p-19…

401 replies · 723 reposts · 8.2K likes · 5.7M views
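The "493 of 494 identical reports" claim in the breakdown above describes template reuse that is straightforward to detect mechanically. A hypothetical sketch of that kind of check; the function, inputs, and normalization are illustrative, not the investigator's actual method:

```python
# Hypothetical template-reuse check: hash each report with the
# client-specific text stripped, so reports built from the same
# boilerplate collapse to the same fingerprint.
import hashlib
import re

def boilerplate_fingerprint(report_text, client_name):
    """Return a SHA-256 fingerprint of the report's boilerplate."""
    normalized = report_text.replace(client_name, "<CLIENT>")
    normalized = re.sub(r"\s+", " ", normalized).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Two toy "reports" differing only in the company name swapped in.
a = boilerplate_fingerprint("Acme Corp satisfied all SOC 2 criteria.", "Acme Corp")
b = boilerplate_fingerprint("Beta LLC satisfied all SOC 2 criteria.", "Beta LLC")
print(a == b)  # True: identical boilerplate, different client
```

Grouping hundreds of reports by fingerprint would immediately surface a 493-of-494 collision like the one the thread describes.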
Jack Brown @jackbr513
Running agents in @e2b sandboxes just got easier. Quickstart: runtimeuse.com/quickstart
Vijit Dhingra @VijitDhingra1

Run agents like Claude Code in any sandbox. We just open sourced runtimeuse.com - a framework for running agents in sandboxes without building all the infra yourself.
Agents + sandboxes are powerful. The hard part is everything around them:
- streaming messages
- file exchanges
- cancelling jobs mid-run, etc
We hit this while building @getlark, so we open sourced our runtime.
Try it: `npx -y runtimeuse --agent=claude`
Feedback welcome 👇

1 reply · 2 reposts · 13 likes · 3.3K views
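The "cancelling jobs mid-run" pain point mentioned above is the classic subprocess-lifecycle problem. A generic sketch (this is not runtimeuse's API, which the tweet doesn't show; the command in the comment is the one from the quickstart) of launching an agent CLI and cancelling it on timeout:

```python
# Generic sketch of cancelling an agent process mid-run; not the
# runtimeuse API, just the standard subprocess pattern underneath.
import subprocess

def run_with_timeout(cmd, timeout_s):
    """Run a command, cancelling it if it exceeds timeout_s.
    Returns the exit code, or None if the job was cancelled."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.terminate()  # ask the process to exit cleanly
        try:
            proc.wait(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()   # force-kill if it ignores SIGTERM
        return None

# e.g. run_with_timeout(["npx", "-y", "runtimeuse", "--agent=claude"], 600)
```

A real runtime adds the rest of the list (message streaming, file exchange) on top of this lifecycle core.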