Jeffrey Wang
@jeffzwang

2.1K posts

cofounder @exaailabs

San Francisco · Joined January 2020
1.3K Following · 17.4K Followers
Pinned Tweet
Jeffrey Wang @jeffzwang
We raised $85M to continue building the best search engine for AIs.

Why a search engine for AIs? LLMs are trained on the internet but they can't memorize it - they need search. But traditional search engines were built for humans. LLMs are now ingrained in thousands of popular apps with thousands of different use cases. These LLMs need a search engine that was built with this new world in mind.

Today, Exa powers search for companies like Cursor and Notion. We're also the default search provider in OpenAI's open-source gpt-oss models, and for inference providers like OpenRouter and Groq. We have the fastest search API (search type "fast"), the most comprehensive search (Websets product), and we even build custom web search solutions for individual customer needs (just email us).

We also do more than just "search" - we support crawling, web-grounded AI answers, deep research, and even finding huge verified lists of any web data you want (Websets).

And if it's not the highest-quality, most developer-friendly, easiest-to-use search API out there, give me a holler. LESGO
Exa @ExaAILabs

We raised $85M in Series B funding at a $700M valuation, led by Benchmark. Exa is a research lab building the search engine for AI.

Jeffrey Wang @jeffzwang
exa is at gtc! hit us up
Jeffrey Wang @jeffzwang
@WilliamBryk i think it’s more that the hardware pricing molded to the economic feasibility
Will Bryk @WilliamBryk
strange that the compute cost for training AGI ended up being exactly the order of magnitude of money a private company can feasibly spend (~100 billion dollars). If it were 10x off in either direction, we'd be on a very different timeline
Jeffrey Wang @jeffzwang
does anyone have any tips on how to prompt/plan when trying to oneshot large projects, like 50K+ LOC?
Jeffrey Wang @jeffzwang
@eve_builds I think there needs to be both - anything that prevents context pollution is a good thing! I also have a slash command that dynamically generates a full set of MCPs, tools, and skills in a new Claude Code session, so basically fully custom, concise plugins
Eve Park @eve_builds
the tool count thing is so real. i've seen MCPs that dump like 40+ tools into context and the agent just gets confused about which one to use. your MCP proxy idea is interesting though. do you think that filtering should live in the harness or is it better to have the MCP itself be smarter about what it exposes based on what the agent is actually doing?
Jeffrey Wang @jeffzwang
In my opinion, MCP is not redundant, and its bad performance owes mainly to skill issue:

Redundancy

The most important leverage an agent can have is amazing context engineering, and CLIs/APIs simply aren't designed to expose precise, concise interfaces for specific use cases. To achieve that, you need a client-server relationship where the client (agent) tells a server about itself and the server exposes an intelligent selection of capabilities (tools), or where the client (agent) post-filters the set of tools (in MCP, what is returned by `initialize`). Well, MCP is such a technology! Could you hack CLIs/APIs to achieve the same thing? Sure, but they're not designed for selective exposure that minimizes an agent's context pollution. You need some new protocol layer that allows for that, even if it just wraps CLIs/APIs, because of this agent-specific problem. MCP!

Performance

Most of the performance issues I see in MCPs owe to bad implementation. That they promote good context engineering is a moot point if they're designed such that context gets mega polluted, and unfortunately many MCPs mega pollute context. One way this happens is exposing way too many tools - it's pretty common right now for API-wrapping MCPs to just expose every single API endpoint and request field, with some MCPs running to tens of thousands of tokens because of this. This wouldn't be an issue if agent harnesses allowed for selective tool filtering per MCP, but most agent harnesses (e.g., Claude Code) don't support this. Developers are also not mindful enough of tool call response token payloads, which likewise need to be minimized to avoid context pollution - it is all too common for responses to consist of the full API response payload. This has been partially mitigated by the rise of subagents, but it is still a major issue, one that API/CLI-based tool implementations often don't have because those implementations have existing patterns for filtering payloads.
In my personal Claude Code setup, I address these problems by 1) implementing an "MCP proxy" layer that lets me manually remove tools from MCPs, and 2) implementing post-tool-call hooks that filter response payloads. And in the Exa MCP, we are careful to 1) return only the fields an agent needs, and 2) filter parsed HTML down to only the relevant tokens using models that we train. But these features need to be built into MCPs and agent harnesses as first-class citizens.

Lastly, I will note that it is obviously the case that agents know how to use the APIs/CLIs in their training data very well, almost always better than MCPs. But that issue should get fixed over time as LLMs get trained on more MCP data.

Overall I tend to be relatively pro when it comes to new shit that needs to be built specifically for AIs (after all, our company is in the business of building a search engine from scratch for AIs lol). But this is my experience as someone who has thought about this way too much. Curious for people's thoughts! Maybe I am missing something.
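The proxy-and-hook setup described above can be sketched roughly like this. Everything here is a hypothetical illustration: the allowlist, the tool names, and the kept fields are invented, and the code operates on plain JSON-RPC-style dicts (MCP's `tools/list` result shape) rather than any real MCP SDK or Claude Code hook API.

```python
# Sketch of an "MCP proxy" that filters tools, plus a post-tool-call hook
# that trims response payloads. All names are hypothetical examples.

# Tools this particular session is allowed to see (hypothetical allowlist).
ALLOWED_TOOLS = {"web_search", "get_contents"}

# Response fields worth keeping; everything else is context pollution.
KEEP_FIELDS = {"title", "url", "snippet"}


def filter_tools_list(response: dict) -> dict:
    """Proxy step: intercept a tools/list response and drop disallowed tools."""
    result = response.setdefault("result", {})
    result["tools"] = [
        t for t in result.get("tools", []) if t.get("name") in ALLOWED_TOOLS
    ]
    return response


def trim_tool_result(payload: dict) -> dict:
    """Hook step: keep only the fields the agent actually needs."""
    return {k: v for k, v in payload.items() if k in KEEP_FIELDS}


if __name__ == "__main__":
    # An over-eager MCP server exposing an endpoint-per-tool.
    resp = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {
            "tools": [
                {"name": "web_search"},
                {"name": "get_contents"},
                {"name": "admin_delete_index"},  # never needed by this agent
            ]
        },
    }
    names = [t["name"] for t in filter_tools_list(resp)["result"]["tools"]]
    print(names)  # → ['web_search', 'get_contents']

    # A full API response payload that would otherwise flood the context.
    raw = {"title": "Example", "url": "https://example.com",
           "snippet": "...", "raw_html": "<html>...</html>", "headers": {}}
    print(trim_tool_result(raw))
```

The same two interception points generalize: the filter runs once per session on the tool list, while the trim runs on every tool result, which is where most of the token savings come from.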
Jeffrey Wang @jeffzwang
@kishan_dahya imo the question is whether there needs to be a new standard, and i think the answer is pretty obviously yes because of the specific needs of agents
kishan @kishan_dahya
@jeffzwang but if you have to build bespoke functionality to make a "standard" work then it's probably not a great standard. llms are getting better at tool calling and more harnesses are shipping with tool-search now, which alleviates the problem a bit, but overall i'm not a fan
Jeffrey Wang @jeffzwang
@ivanleomk claude code eval'ed claude in chrome to be better, although obv it is biased and also maybe rl'ed
Ivan Leo @ivanleomk
Damn agent browser kinda good ngl
zek @zekramu
@jeffzwang lotta ppl say this but like does it actually work?
zek @zekramu
how 2 fix bad knees? unc (me) is very athletic, can jump out the gym, and is very explosive, but next day unc is like a crippled war vet. should I start injecting peptides in my knees or some shit???
Jeffrey Wang @jeffzwang
Do you read technical docs anymore?
Jeffrey Wang @jeffzwang
New obsession is one-shotting flash games from my childhood and then playing them. It's also a great coding agent eval!