Dan Cleary (@DanJCleary) - Twitter Profile

Pinned Tweet

Dan Cleary@DanJCleary·4d

"When we (@intercom) started this pursuit, I honestly didn't think it was going to be possible to be better than frontier models. I thought maybe we can get parity on average. But can we actually beat them?" They did. @PedroTabacof, principal ML scientist at Intercom, on how Apex replaced Claude as the core model for Fin, across 2 million customer service conversations a week. Full episode out now. Link below. Amazing work going on over at Intercom @eoghan @fergal_reid. Which AI app company is next to launch their own model? @harvey? @Replit or @Lovable?

Dan Cleary@DanJCleary·1d

Opus 4.7 vs GPT 5.5 in their respective harnesses. Opus with the W.

Dan Cleary@DanJCleary·1d

more here: danjcleary.substack.com/p/are-there-an…

Dan Cleary@DanJCleary·1d

Moats never existed, and ChatGPT's memory getting pwnd with a single prompt is just another example of this (and a killer move from Anthropic, matched by Google a few weeks after). More thoughts on moats (or the lack of) in the age of AI below.

Dan Cleary@DanJCleary·2d

Convalytics(.dev) is up and running. Easy web + product analytics for @convex apps. Your agent can install it with a single prompt and set up custom events for you.

Dan Cleary@DanJCleary·4d

youtu.be/nOQY10X26Vc

Dan Cleary@DanJCleary·5d

Data: slop-bench.vercel.app Code: github.com/Dan-Cleary/slo…

Dan Cleary@DanJCleary·5d

SlopBench: Opus 4.7 did worse (more slop) than Opus 4.6. Code + data on GitHub and below.

Dan Cleary@DanJCleary·5d

Full vid: youtu.be/sajfGkwc4D4

Dan Cleary@DanJCleary·5d

Vibe coding breakdown: Opus 4.7 vs GPT 5.4 vs Gemini 3.1. TL;DR: GPT 5.4, Opus 4.7, Sonnet 4.6, Gemini 3.1

Dan Cleary@DanJCleary·Apr 17

Opus 4.7's fav word is malware. Idk what is in there, but every single edit or file read seems to have to be checked for malware first.

Dan Cleary@DanJCleary·Apr 16

Convalytics Convex component is live 🚀

Dan Cleary@DanJCleary·Apr 16

Opus 4.7 benchmarks sans Mythos

Dan Cleary@DanJCleary·Apr 16

Been a consumer of @convex components for a while, stoked to be publishing my first for convalytics.dev

Dan Cleary@DanJCleary·Apr 16

YouTube: youtube.com/watch?v=nOQY10…

Dan Cleary@DanJCleary·Apr 16

A year ago I was skeptical that companies could train their own models to beat frontier models. @eoghan and the team at Intercom proved me wrong. And this conversation explained exactly how they did it.
@PedroTabacof is a Principal ML Scientist at Intercom working on their custom model Apex.
- It beats GPT-5.4.
- It beats Opus 4.5.
- It hallucinates 65% less than Sonnet.
And 100% of Fin's traffic now runs on it.
We got into:
- How they actually post-train a model from scratch
- Why evals are 90% of the work
- The state of open source models
If you care about AI products, vertical models, or where this is all going, this is a must-watch. Link below.

Dan Cleary@DanJCleary·Apr 10

Did a whole deep dive here: danjcleary.substack.com/p/is-vertical-…

Dan Cleary@DanJCleary·Apr 10

This didn't get enough attention: Intercom built a model that outperforms Opus + GPT 5.4. "As of last week, ~100% of all (English language, chat and email) customer conversations are now running on Apex." Vertical AI is just getting started.

Eoghan McCabe@eoghan

x.com/i/article/2036…

Dan Cleary@DanJCleary·Apr 10

@stopachka Love this direction for backends. Sounds like Convex, so I did a quick test for the gang: x.com/DanJCleary/sta…

Dan Cleary@DanJCleary

for those looking for a quick and dirty @instant_db vs @convex comparison
