Ben Huang

77 posts

@b3nhuang

Working on new things. Currently @ThematicAI, @HiBaseStation, @side_realestate, @necto_inc, @groove_co. @ycombinator alum.

New York City · Joined March 2021

613 Following · 190 Followers

Ben Huang@b3nhuang·3d

@karinadoteth Same. Weird at first, eventually you start asking "what do we think of this" to your swarm

Karina Q@karinadoteth·3d

every morning i read quotes from my sim (we do an overnight run most nights) w my morning tea its basically like reading synthesis / discussion of 200+ research analysts specialized in the like 130~ish now capex companies we r tracking across all slices of the capex supply chain sometimes i trade off their discussions, yes, lol sometimes i just like to see wut they r talking abt haha

128

Ben Huang@b3nhuang·3d

@karinadoteth Amazing!

Karina Q@karinadoteth·3d

@b3nhuang QAM

Karina Q@karinadoteth·18 Apr

♥️ qwen

102

Ben Huang@b3nhuang·3d

Announcing @ThematicAI! Thematica is a research project whose goal is to build a fully autonomous long-term investor named Simon. 🟢 Today I'm kicking off the launch of Simon in a research-preview dry run to make sure everything is working (follow along at agent.thematica.ai or @ThematicaSimon). Simon is a custom agent designed to be a long-term investor, not a day trader. He reasons in multiple layers with multiple agent teammates. Simon's goal isn't to trade more. It's to find the best things to own medium to long term. Simon will run continuous research cycles: reading the world, updating theses, managing a watchlist, and deciding what to hold. Along the way I've built a ton of tools (agentic financial research tools, agentic research teams, etc.) to let Simon actually navigate and research markets end-to-end. This week is the dry run. Next week I'll kick off the real research run. Stay tuned.

931

Ben Huang@b3nhuang·4d

opus 4.7 loves the word "idempotent"

Ben Huang@b3nhuang·24 Apr

@ReserveList Gonna make a precious metals bench first, then I'll come back around to it. In the meantime, feel free to add.

🍊Brüçe d'Orange@ReserveList·15 Apr

@b3nhuang Okay, two weird things: [1] I love how even LLMs prove that overtrading is problematic. [2] Why didn't you include the Anthropic Haiku model?

143

Ben Huang@b3nhuang·15 Apr

Just for fun I made a benchmark of the models trading oil. I ran 9 frontier LLMs trading from 1/1 => 3/14. Oil was up 72%, so none of the models beat buy-and-hold. Best: Gemini 3-flash ($15,880). Worst: Minimax ($14,619). Most consistent predictor: Claude Opus, but it ranked 8th on P&L. Accuracy and consistency ≠ trading performance. Here's the result → benhuang21828.github.io/oil-bench 🔊 out to @OpenRouter for credits and @alexatallah for feedback
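
A toy illustration of the last point, that prediction accuracy does not guarantee trading performance. This is a minimal sketch and not the benchmark's code: the return series, the two models' signals, and the $10,000 starting balance are all invented for the example.

```python
def directional_accuracy(signals, returns):
    """Fraction of periods where the signal's sign matches the realized return's sign."""
    hits = sum(1 for s, r in zip(signals, returns) if (s > 0) == (r > 0))
    return hits / len(returns)


def final_equity(signals, returns, start=10_000.0):
    """Go fully long (+1) or short (-1) each period per the signal and compound the result."""
    equity = start
    for s, r in zip(signals, returns):
        position = 1 if s > 0 else -1
        equity *= 1 + position * r
    return equity


# Hypothetical per-period returns: four small moves and one large one.
returns = [0.01, 0.01, 0.01, -0.01, 0.20]

# Hypothetical signals: +1 means "predict up", -1 means "predict down".
model_a = [1, 1, 1, -1, -1]   # right on every small move, wrong on the large one
model_b = [-1, -1, 1, 1, 1]   # wrong on most small moves, right on the large one

for name, signals in (("model_a", model_a), ("model_b", model_b)):
    acc = directional_accuracy(signals, returns)
    equity = final_equity(signals, returns)
    print(f"{name}: accuracy={acc:.0%} final_equity=${equity:,.2f}")
```

In this made-up series the more accurate model finishes with less money, because being wrong on the single large move outweighs being right on the small ones.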

8.9K

Ben Huang@b3nhuang·24 Apr

@dankalski It's news-heavy. None of the models had enough skepticism (aka awareness of the 🌮).

Daniel Kalski@dankalski·16 Apr

An ok experiment in showing each model's reasoning. The March 9 tick interests me since every model clustered $93-96 against an $83.45 close. $10 misses driven by headlines, pulling everyone bullish while the actual move was mean reversion. Was your data-input design intentional to be news-heavy? Or is there time-series data feeding that isn't shown per tick/day?

Ben Huang@b3nhuang·24 Apr

@agupta Made a benchmark recently with a less meme-worthy, but possibly more useful, insight x.com/b3nhuang/statu…

994

Ankit Gupta@agupta·24 Apr

soooooo how are we feeling about those quant jobs everyone?

Kamryn Ohly@KamrynOhly

Our team is stunned. We gave Claude Opus 4.6 by @AnthropicAI $10k to trade on @Polymarket. It now has an account value of $70,614.59. This is a new era of model performance in trading and predicting outcomes in the face of uncertainty. @predictionbench

1.3K

694.9K

Ben Huang@b3nhuang·22 Apr

Claude Design is made for coders who never learned how to drag on Figma. Brilliant!

Ben Huang@b3nhuang·15 Apr

So we repeated each run 10 times per model. Interesting points: - The smartest models aren't the best at trading. - Anthropic is leagues above everyone else when it comes to prediction consistency. - All the Chinese models suffered from inconsistency and poor prediction quality.
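
The thread does not say how prediction consistency was measured. One plausible reading, sketched here under that assumption, is the spread of a model's price predictions across its 10 repeated runs. The model names and numbers below are made up for illustration.

```python
from statistics import mean, pstdev


def run_spread(predictions):
    """Population standard deviation of one model's predictions across repeated runs."""
    return pstdev(predictions)


# Hypothetical price predictions for the same day from 10 repeated runs per model.
runs = {
    "steady_model": [94.0, 94.5, 93.8, 94.2, 94.1, 94.3, 93.9, 94.0, 94.4, 94.2],
    "erratic_model": [88.0, 99.5, 91.0, 97.2, 85.4, 101.3, 93.9, 90.0, 96.4, 89.2],
}

# A smaller spread means the model gives similar answers when asked repeatedly.
for name, preds in runs.items():
    print(f"{name}: mean={mean(preds):.2f} spread={run_spread(preds):.2f}")
```

A low spread only says the model repeats itself. It says nothing about whether the repeated answer is right, which fits the observation that consistency and trading results can diverge.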

285

Ben Huang@b3nhuang·10 Nov

@sughanthans1 @shashankgoyal95 @_philschmid @SUghlu We do exactly this!

Sughu@sughanthans1·9 Nov

@shashankgoyal95 @_philschmid @SUghlu I would like to deploy a Claude agent that can fill PDF files for me. So: send a PDF file to an endpoint + info on what needs to be filled -> get back the filled-in PDF

Philipp Schmid@_philschmid·8 Nov

TIL: Claude Code local sandbox environment is open-source. > native OS sandboxing primitives (sandbox-exec on macOS, bubblewrap on Linux) and proxy-based network filtering. It can be used to sandbox the behaviour of agents, local MCP servers, bash commands and arbitrary processes.

720

62.2K

Ben Huang reposted

Alex Atallah@alexatallah·25 Jun

Excited to announce a $40M raise for @openrouter (seed + A), led by a16z & Menlo! LLM inference will be the biggest software market in the world. We've become the #1 control plane. Here's what's next:

200

125

2.3K

456.7K

Ben Huang reposted

basestation@HiBaseStation·16 Apr

👋 Just to share a new product we've been building here at BaseStation, something born directly from conversations we've had with many of you. We initially started exploring what the next generation of document processing, "e-sign 2.0," could look like. We quickly learned something crucial: most companies place trust in their e-signature tools above usability, and the real headache isn't the signature execution step. It's everything before that. It's a long and frustrating process of setting up form templates and getting the right data into them before they even go out for signature.
We noticed three key challenges:
1️⃣ Bulk Form Filling: Teams need the ability to fill out the same form thousands of times, often for each of their users.
2️⃣ Ability to Update Entire Document Packet: When users need to update one piece of their information, they need to update all the fields using that data point across the entire document packet.
3️⃣ Maintaining Custom Apps and No-Code Tools: To enable bulk filling of forms and bulk updating of filled-out forms, large teams maintain a ton of custom apps and no-code tools.
💡 So, how does BaseStation solve these challenges? We focused on making this process incredibly simple:
🥇 Smart Setup: Upload your document. Our AI automatically identifies and places input fields. We will even generate draft instructions in plain English for how to fill them out. Need more specific instructions for a complex field? Just click on it and edit the instruction text. Explain how you would fill it out, just as you would explain it to another person.
🥈 Flexible Data Connection: Build your "data canvas", the information AI will use to quickly and easily fill out your forms. You can: Upload a previously filled-out version of the document, and we'll automatically extract the relevant data points. Connect directly to existing data sources like Google Sheets or Airtable. Simply type key:value pairs (like Name: John Doe, Address: 123 Main St) directly into our data canvas text area.
🥉 One-Click Autofill: Once your template is set up and your data is connected, just click "Auto Fill." Watch as our AI uses your data canvas to intelligently populate even long, complex document packets in seconds.
🔊 If your team is still spending thousands of hours and significant resources setting up templates and building custom connectors just to auto-fill your forms, we believe there's a better way. We would love to show you what we're building here at BaseStation. 🔊
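
For readers wondering what typed key:value input like the example above amounts to as data, here is a minimal sketch of parsing it into a dictionary. It is an illustration only and not BaseStation's implementation: the function name `parse_key_values` is hypothetical, and the comma-separated, first-colon-splits rule is an assumption taken from the one example in the post.

```python
def parse_key_values(text):
    """Parse 'Key: value, Key: value' text into a dict.

    Assumes pairs are separated by commas and each key ends at the first colon.
    Chunks without a colon are skipped rather than guessed at.
    """
    pairs = {}
    for chunk in text.split(","):
        if ":" not in chunk:
            continue
        key, value = chunk.split(":", 1)
        pairs[key.strip()] = value.strip()
    return pairs


fields = parse_key_values("Name: John Doe, Address: 123 Main St")
for key, value in fields.items():
    print(f"{key} -> {value}")
```

This simple rule breaks on values that themselves contain commas (for example a full address with city and state), which is one reason a real form-filling tool would likely need something more robust than string splitting.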

187

Ben Huang@b3nhuang·10 Mar

@VictorPontis seriously, wtf.

Victor Pontis@VictorPontis·6 Mar

I'm still so disappointed that Google sold Google Domains to Squarespace. I've been managing domains expiring and incorrect billing details every week for the past year... It sucks.

658

Ben Huang@b3nhuang·10 Mar

The thing about configuring AI agents is that sometimes they will ignore your instructions and you won’t know why. You’ll just have to suffer watching them keep ignoring them. It's like hiring the fastest and smartest coder, who also happens to be extremely stubborn.

112

Ben Huang reposted

Garry Tan@garrytan·24 Jan

Don’t just lie flat on the ground because AGI is here and ASI is coming. Your hands are multiplied. Your ideas must be brought into the world. Your agency will drive the machines of loving grace. Your taste will guide the future. To the stars.

220

533

4.8K

488K

Ben Huang@b3nhuang·4 Dec

@diegozaks @loom Not sure if you have this use case, but I built a Loom that's tailor-made for walking through documents. Would love your thoughts on it @HiBaseStation hellobasestation.com

Diego Zaks@diegozaks·4 Dec

@loom Clarification: They are killing the "viewer" account, so if you want to restrict viewers behind SSO you'll have to get your entire company a full seat.

3.4K

Ben Huang reposted

basestation@HiBaseStation·9 Oct

Here's what we've shipped in September based on feedback from you, our users. If you would like to see a feature on our roadmap, definitely reach out!
Team Management
▶️ Create Teams: Create teams and add your teammates to reflect your organization's structure.
▶️ Team Member Roles: Assign "member" and "admin" roles to your teams. Admins can manage team membership.
▶️ Share Templates: Clone your personal templates to team workspaces to share with your teammates.
Mobile Optimization
▶️ PDF Viewer: Supports mobile gestures: pinch-to-zoom, drag-to-move, and spread-to-zoom.
▶️ Video Playback: Videos now autoplay and play inline for a seamless mobile experience.
▶️ UI Refinements: Making our mobile experience look awesome!
Contact Cards
▶️ Create actionable contact cards: Contact cards with quick actions to encourage engagement from your customers.
▶️ Add Information: Include phone numbers, email addresses, and scheduling links for easy communication.
Self-Service Plans
▶️ Basic: Create unlimited guides with 5 hours of video storage and monthly streaming.
▶️ Pro: 10 hours of video storage and streaming, plus team creation capabilities.
Expiring Links
▶️ Protect Sensitive Information: Set expiration dates to protect time-sensitive information.
New Landing Page
💻 We've launched a brand-new landing page! Would love to know what you think.

217

Ben Huang reposted

Shantanu Joshi@joshishantanu4·4 Sep

Introducing Savvy Teams: Share hard-earned insights securely within your organization. Search and run any insight shared with the team without leaving your terminal. Learn from your teammates without waking anyone up. Link to our docs in the next tweet!

852

Ben Huang@b3nhuang·27 Jun

@jbfja blown away by the speed of @SupermavenAI. How is it so fast?
