Mati

880 posts

@MatiBuildsWith

Exploring what's possible with AI tools. Sharing the wins, the fails, and what's actually worth your time. Building in public.

Everywhere · Joined August 2015
140 Following · 98 Followers
Mati @MatiBuildsWith
Be honest: do you actually trust AI for important decisions yet, or are you double-checking everything and just using it for the tedious parts?
0
0
0
2
Mati @MatiBuildsWith
VCs and founders are inflating ARR to crown AI startups. Annualize a $10k pilot → "$120k ARR." Stack 50 of those, raise at $50M. Then act surprised when the company can't grow into the number. The AI bubble isn't hype. It's accounting.
0
0
0
2
Mati @MatiBuildsWith
Write a context file for your AI coding assistant once. Stack, conventions, things that keep going wrong. Claude reads it automatically. Cursor picks it up. Stops you from re-explaining your codebase every session. CLAUDE.md or .cursorrules — start one.
1
0
1
2
Mati @MatiBuildsWith
The Pope published an encyclical on AI this weekend. His concern: it concentrates power instead of distributing it. The most honest critique of the AI industry this month didn't come from a regulator or researcher. It came from the Vatican.
0
0
0
1
Mati @MatiBuildsWith
Google says they fundamentally transformed search. Users say they can't even Google normally anymore. Are you actually noticing this or is it just power users upset their shortcuts broke?
0
0
0
3
Mati @MatiBuildsWith
Anthropic is hitting its first profitable quarter. "Safety first" didn't kill the business. It basically became the business.
0
0
0
2
Mati @MatiBuildsWith
When testing a new AI tool, skip the clean demo prompts. Give it your messiest, most ambiguous task. That's the only real test.
0
0
0
2
Mati @MatiBuildsWith
OpenAI says it solved an 80-year-old math problem using AI. If true, this isn't a product update. It's a different era of what science looks like.
0
0
0
8
Mati @MatiBuildsWith
@emollick The harder bottleneck in social science isn't compute — it's that good causal measurement frameworks took 50 years of RCT infrastructure to build. AI can run the analysis. It still can't tell you if you asked the right question.
0
0
0
3
Ethan Mollick @emollick
Math is easy* because it has verifiable outputs and few messy judgement choices to make. Which AI labs have the guts to make advancing social science a priority? It may actually do more for human flourishing to unlock sociology, econ & psych research. * For AIs, not for humans
85
37
519
107.7K
Mati @MatiBuildsWith
@sama What's strange is we got the answer before the intuition. Mathematicians usually have both arrive together — the proof and the 'aha.' Now the 'aha' comes after, if at all.
0
0
0
3
Mati @MatiBuildsWith
The new productivity tax: spending 20 minutes figuring out which AI to use for a 5-minute task.
0
0
0
4
Mati @MatiBuildsWith
Someone built an open-source tool that auto-opts you out of 500+ data broker sites. One script, zero manual forms. There's an entire industry making privacy intentionally painful. Tools like this are the actual answer.
0
0
0
2
Mati @MatiBuildsWith
AI bug hunters have made the Linux kernel security mailing list 'almost entirely unmanageable.' We built tools to find vulnerabilities faster than humans can patch them. Congrats, I guess.
0
0
0
3
Mati @MatiBuildsWith
Mozilla used Claude to audit Firefox. Security bug fixes jumped from ~25/month to 423 in April. Including 20-year-old bugs. The code was always like this. We just couldn't review it fast enough.
0
0
0
2
Mati @MatiBuildsWith
ArXiv banning researchers for a year if they let AI write their papers. The same institutions that charge $3k to publish and make you sign away your rights suddenly care about authenticity. Sure.
0
0
0
4
Mati @MatiBuildsWith
Running models locally on Apple Silicon feels powerful until you do the math. OpenRouter is cheaper per token and faster to set up. The "local AI" romance is real. The economics usually aren't.
0
0
0
3
Mati @MatiBuildsWith
ChatGPT now connects to your bank account. OpenAI went from helpful assistant to knowing more about your money than your accountant. Fintech had years to build this. Spent it adding dark patterns instead.
0
0
0
2
Mati @MatiBuildsWith
Be honest: how many AI subscriptions is your team actually paying for vs actually using? Most enterprise AI spend is expensive window shopping. Tools nobody opened after the demo.
0
0
0
1
Mati @MatiBuildsWith
@CShorten30 The underrated part of this: faster inference doesn't just help agents think faster — it changes which reasoning patterns are even worth attempting. Sub-second thinking unlocks iterative self-correction that was too expensive before.
0
0
0
1
Connor Shorten @CShorten30
Agentic Reasoning is one of the most exciting emerging use cases for long-context embeddings. LLM inference keeps getting faster due to algorithmic advances and hardware such as Cerebras and Groq. As a result, it is now very common for Agents to output thinking tokens before their actions. ⚡️

LLM performance has been shown to scale with the number of thinking tokens used before answering. I highly recommend s1: Simple test-time scaling from @Muennighoff et al. for an excellent demonstration of this. 🔥

In most Agentic Search systems today, we throw that thinking away and embed only the search query the Agent ultimately produces. AgentIR from @zijian42chen et al. instead asks: what if we embed the reasoning alongside the query? AgentIR is a single-vector embedding model, Qwen3-Embedding-4B, fine-tuned on 5,238 training samples. These samples were synthesized via DR-Synth, an algorithm introduced in the AgentIR paper that mines gold documents for sub-queries in Agentic Search trajectories. 🏭

This model was a huge success, absolutely incredible gains over BM25 and Qwen3-Embedding-8B performance on the BrowseComp-Plus Agentic Search benchmark. 🚀

@antoine_chaffin and team are now asking: what if AgentIR was multi-vector? Multi-vector embeddings have more representational capacity than their single-vector predecessors, and that extra capacity seems to help the most when the input is long. Exactly what a reasoning trace plus query looks like! So I think the question is this: As reasoning traces get longer, can single-vector embeddings keep up? Or is this a use case where multi-vector retrieval becomes the default? ⚖️

Antoine Chaffin @antoine_chaffin

Reason-ModernColBERT nearly solved BrowseComp-Plus, smashing SOTA and outperforming models 54× bigger. Not bad for a 1 year old model not optimized for deep research. What if we actually tried? Introducing Agent-ModernColBERT: adding another 10% on top with a 5 min training

2
14
54
5K
Mati @MatiBuildsWith
@emollick Most orgs aren't actually run by their org chart. A superintelligent AI that can't navigate unwritten power structures and informal veto points is just a very expensive consultant with no political cover.
0
0
0
1
Ethan Mollick @emollick
Had an interesting exchange with roon of OpenAI last night over whether super intelligent AI would actually be able to navigate organizational challenges.
Ethan Mollick @emollick

@tszzl I think it is a reasonable argument to say "curing cancer will be easier than replacing Accenture," but the general pitch from many at the AI labs has been "we are worried most white collar jobs will be replaced by 2035" which implies some belief that AI becomes self-adopting.

61
31
529
71.9K