eli(as)

19 posts

@earlierism

evals @notionhq

Joined August 2024
53 Following · 186 Followers
eli(as)@earlierism·
@mrnasrinasir Totally! For this very reason, all of our evals are within Notion, and we make a point of testing model reasoning in messy context settings.
0 replies · 0 reposts · 1 like · 52 views
Nasri Nasir | The SG Notion Guy@mrnasrinasir·
@earlierism curious how these stack up inside tools like Notion where the context window gets messy. been testing different models on client databases and.. structured output quality varies way more than benchmarks suggest
1 reply · 0 reposts · 0 likes · 74 views
eli(as)@earlierism·
We’ve often been asked about how we eval each new model at Notion, since we’re one of the few apps deploying both frontier lab models and leading open source models for general knowledge work. Read on for what we share with the model providers after we evaluate their models.
eli(as) tweet media
Notion@NotionHQ

GPT-5.5 is now in Notion 🫡

8 replies · 11 reposts · 135 likes · 102.6K views
eli(as)@earlierism·
@aNotioneer Good catch! That's indeed part of our larger list of metrics. Glad you appreciate the report. I'll try to share one tailored to CAs soon :D
0 replies · 0 reposts · 1 like · 53 views
aNotioneer@aNotioneer·
@earlierism Would % of tool errors also be quite an important metric there or am I missing something? Either way, please keep sharing these results, they’re a great reference point when choosing models for custom agents!
1 reply · 0 reposts · 1 like · 874 views
eli(as)@earlierism·
@STUDI0BRITTANY Good question! It depends on where we'll ship the model. E.g.
- on "Auto," price and latency come to the fore,
- on Custom Agent, accuracy parity,
- on Personal Agent, the ceiling of the model's intelligence, to understand what it unlocks for our user
0 replies · 0 reposts · 0 likes · 39 views
BRITTANY@studiobrittanyx·
@earlierism i’m curious as to which dimension you weight hardest internally before ship: accuracy parity, error floor, or unit economics. feels like the answer changes which provider wins the slot?
1 reply · 0 reposts · 0 likes · 246 views
RaceJohnson@RaceJohnson·
@earlierism This is the level of detail I love - pls keep em coming
1 reply · 0 reposts · 0 likes · 100 views
Sarah Sachs@sarahmsachs·
On Notion's knowledge work benchmark, GPT 5.5 is 33% faster, uses half the tokens (so half the price), and scores slightly higher than Opus 4.7. @OpenAI has declared themselves the winners, this week, in the frontier knowledge work arena.
eli(as)@earlierism

We’ve often been asked about how we eval each new model at Notion, since we’re one of the few apps deploying both frontier lab models and leading open source models for general knowledge work. Read on for what we share with the model providers after we evaluate their models.

24 replies · 33 reposts · 545 likes · 251.1K views
eli(as)@earlierism·
Waiting for my agents to finish...
eli(as) tweet media (2 images)
0 replies · 0 reposts · 3 likes · 171 views
P@pdev001·
Blender addon + Three.js loader = full level pipeline. Levels reference a shared asset library instead of embedding meshes. No duplication. Result: ~500KB level files. Automatic instancing. Zero manual physics setup. #threejs #gamedev #indiegame #webgpu
1 reply · 12 reposts · 149 likes · 6.2K views
eli(as) retweeted
Notion Mail@NotionMail·
Help us caption this comic! Reply with your best line 📝 ✨ ✍️: @_as_eli
Notion Mail tweet media
4 replies · 6 reposts · 34 likes · 4.3K views
eli(as) retweeted
Notion Mail@NotionMail·
From Michelangelo’s “Prometheus Devoured” to Botticelli’s “Birth of an Inbox” witness the renaissance of email. ✍️: @_as_eli
Notion Mail tweet media
2 replies · 5 reposts · 51 likes · 13K views
eli(as) retweeted
Notion Mail@NotionMail·
It’s not you, it’s your tools. ✍️: @_as_eli
Notion Mail tweet media
7 replies · 11 reposts · 113 likes · 14.4K views
eli(as) retweeted
Notion Mail@NotionMail·
An inbox Marie Kondo would approve of. ✍️: @_as_eli
Notion Mail tweet media (2 images)
3 replies · 4 reposts · 46 likes · 4.8K views
eli(as) retweeted
Notion@NotionHQ·
Notion AI and Clippy walk into a bar…
Notion tweet media
16 replies · 20 reposts · 297 likes · 28.6K views
eli(as) retweeted
Notion@NotionHQ·
Welcome to the future of work, where AI does the drudgery, your cortisol levels plummet, and you find time for that mythical creature called a “lunch break.”⏳
Notion tweet media
5 replies · 10 reposts · 61 likes · 16K views
eli(as) retweeted
Shoshana Berger@shoshanaberger·
New fave way to announce a product release: COMICS. As of today, you can turn anything you track in @NotionHQ into a chart. Elias Rimer, a recent grad who landed on our AI team, turned the news into a strip💥
Shoshana Berger tweet media
1 reply · 10 reposts · 37 likes · 7.7K views