Mati

880 posts

@MatiBuildsWith

Exploring what's possible with AI tools. Sharing the wins, the fails, and what's actually worth your time. Building in public.

Everywhere · Joined August 2015

140 Following · 98 Followers

Mati@MatiBuildsWith·5d

Be honest: do you actually trust AI for important decisions yet, or are you double-checking everything and just using it for the tedious parts?

Mati@MatiBuildsWith·5d

VCs and founders are inflating ARR to crown AI startups. Annualize a $10k pilot → "$120k ARR." Stack 50 of those, raise at $50M. Then act surprised when the company can't grow into the number. The AI bubble isn't hype. It's accounting.

Mati@MatiBuildsWith·5d

Write a context file for your AI coding assistant once. Stack, conventions, things that keep going wrong. Claude reads it automatically. Cursor picks it up. Stops you from re-explaining your codebase every session. CLAUDE.md or .cursorrules — start one.

Mati@MatiBuildsWith·5d

The Pope published an encyclical on AI this weekend. His concern: it concentrates power instead of distributing it. The most honest critique of the AI industry this month didn't come from a regulator or researcher. It came from the Vatican.

Mati@MatiBuildsWith·23 May

Google says they fundamentally transformed search. Users say they can't even Google normally anymore. Are you actually noticing this or is it just power users upset their shortcuts broke?

Mati@MatiBuildsWith·23 May

Anthropic is hitting its first profitable quarter. "Safety first" didn't kill the business. It basically became the business.

Mati@MatiBuildsWith·23 May

When testing a new AI tool, skip the clean demo prompts. Give it your messiest, most ambiguous task. That's the only real test.

Mati@MatiBuildsWith·23 May

OpenAI says it solved an 80-year-old math problem using AI. If true, this isn't a product update. It's a different era of what science looks like.

Mati@MatiBuildsWith·21 May

@emollick The harder bottleneck in social science isn't compute — it's that good causal measurement frameworks took 50 years of RCT infrastructure to build. AI can run the analysis. It still can't tell you if you asked the right question.

Ethan Mollick@emollick·21 May

Math is easy* because it has verifiable outputs and few messy judgement choices to make. Which AI labs have the guts to make advancing social science a priority? It may actually do more for human flourishing to unlock sociology, econ & psych research. * For AIs, not for humans

Mati@MatiBuildsWith·21 May

@sama What's strange is we got the answer before the intuition. Mathematicians usually have both arrive together — the proof and the 'aha.' Now the 'aha' comes after, if at all.

Sam Altman@sama·20 May

a general-purpose model solved a major open problem in mathematics. we'll be saying this a lot over the coming years, but this is a kinda big milestone. i'm very excited for AI to greatly extend our understanding of the world, but still, i have complicated feelings today.

Timothy Gowers @wtgowers

If you are a mathematician, then you may want to make sure you are sitting down before reading further.

Mati@MatiBuildsWith·18 May

The new productivity tax: spending 20 minutes figuring out which AI to use for a 5-minute task.

Mati@MatiBuildsWith·18 May

Someone built an open-source tool that auto-opts you out of 500+ data broker sites. One script, zero manual forms. There's an entire industry making privacy intentionally painful. Tools like this are the actual answer.

Mati@MatiBuildsWith·18 May

AI bug hunters have made the Linux kernel security mailing list 'almost entirely unmanageable.' We built tools to find vulnerabilities faster than humans can patch them. Congrats, I guess.

Mati@MatiBuildsWith·18 May

Mozilla used Claude to audit Firefox. Security bug fixes jumped from ~25/month to 423 in April. Including 20-year-old bugs. The code was always like this. We just couldn't review it fast enough.

Mati@MatiBuildsWith·17 May

ArXiv banning researchers for a year if they let AI write their papers. The same institutions that charge $3k to publish and make you sign away your rights suddenly care about authenticity. Sure.

Mati@MatiBuildsWith·17 May

Running models locally on Apple Silicon feels powerful until you do the math. OpenRouter is cheaper per token and faster to set up. The "local AI" romance is real. The economics usually aren't.

Mati@MatiBuildsWith·17 May

ChatGPT now connects to your bank account. OpenAI went from helpful assistant to knowing more about your money than your accountant. Fintech had years to build this. Spent it adding dark patterns instead.

Mati@MatiBuildsWith·17 May

Be honest: how many AI subscriptions is your team actually paying for vs actually using? Most enterprise AI spend is expensive window shopping. Tools nobody opened after the demo.

Mati@MatiBuildsWith·13 May

@CShorten30 The underrated part of this: faster inference doesn't just help agents think faster — it changes which reasoning patterns are even worth attempting. Sub-second thinking unlocks iterative self-correction that was too expensive before.

Connor Shorten@CShorten30·13 May

Agentic Reasoning is one of the most exciting emerging use cases for long-context embeddings. LLM inference keeps getting faster due to algorithmic advances and hardware such as Cerebras and Groq. As a result, it is now very common for Agents to output thinking tokens before their actions. ⚡️ LLM performance has been shown to scale with the number of thinking tokens used before answering. I highly recommend s1: Simple test-time scaling from @Muennighoff et al. for an excellent demonstration of this. 🔥 In most Agentic Search systems today, we throw that thinking away and embed only the search query the Agent ultimately produces. AgentIR from @zijian42chen et al. instead asks: what if we embed the reasoning alongside the query? AgentIR is a single-vector embedding model, Qwen3-Embedding-4B, fine-tuned on 5,238 training samples. These samples were synthesized via DR-Synth, an algorithm introduced in the AgentIR paper that mines gold documents for sub-queries in Agentic Search trajectories. 🏭 This model was a huge success, absolutely incredible gains over BM25 and Qwen3-Embedding-8B performance on the BrowseComp-Plus Agentic Search benchmark. 🚀 @antoine_chaffin and team are now asking: what if AgentIR was multi-vector? Multi-vector embeddings have more representational capacity than their single-vector predecessors, and that extra capacity seems to help the most when the input is long. Exactly what a reasoning trace plus query looks like! So I think the question is this: As reasoning traces get longer, can single-vector embeddings keep up? Or is this a use case where multi-vector retrieval becomes the default? ⚖️

Antoine Chaffin@antoine_chaffin

Reason-ModernColBERT nearly solved BrowseComp-Plus, smashing SOTA and outperforming models 54× bigger. Not bad for a 1 year old model not optimized for deep research. What if we actually tried? Introducing Agent-ModernColBERT: adding another 10% on top with a 5 min training

Mati@MatiBuildsWith·13 May

@emollick Most orgs aren't actually run by their org chart. A superintelligent AI that can't navigate unwritten power structures and informal veto points is just a very expensive consultant with no political cover.

Ethan Mollick@emollick·12 May

Had an interesting exchange with roon of OpenAI last night over whether super intelligent AI would actually be able to navigate organizational challenges.

Ethan Mollick@emollick

@tszzl I think it is a reasonable argument to say "curing cancer will be easier than replacing Accenture," but the general pitch from many at the AI labs has been "we are worried most white collar jobs will be replaced by 2035" which implies some belief that AI becomes self-adopting.
