에그

4.8K posts

@eggie5

Joined June 2010
606 Following · 428 Followers

Pinned Tweet
에그@eggie5·
Announcing 🕺DABstep! an Agentic Benchmark collab w/ @Adyen x @huggingface. There are many blind spots in LLM evals, especially wrt agents, namely:
* saturation
* real-world applicability
* objective evaluation
* complexity
We make concrete contributions in these directions...
에그@eggie5·
I did not wake up a loser this morning
에그 retweeted
Andreu ⛩️@dru_blackberry·
On "Why would I pay for SaaS if I can vibe-code it?". Here's my 🌶️ take:
1) Headcount
You don't need as many engineers as you had. You can do with less. But the reality is that you could already do with less even before AI. That's management honesty. 1/4
에그@eggie5·
@yaroslavvb thanks for sharing. Your comments around 39m made me think of Napoleon, who wouldn't open letters until many weeks had passed or a follow-up was sent! Apply to emails, Slack...
Yaroslav Bulatov@yaroslavvb·
The project could be understaffed because it's not that useful to the company bottom line. Spending a lot of effort on something that won't get appreciated puts you in danger of burn-out. Related, a talk on avoiding burn out -- youtube.com/watch?v=bh906h…
에그@eggie5·
Fraud model idea: separate base-rate priors from transaction evidence, then add them in logit space. Test corr(prior, evidence) = 0.038, suggesting the evidence tower learned a distinct corrective signal rather than just amplifying priors.
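A minimal numpy sketch of the idea in this tweet, under illustrative assumptions (the base rates, the evidence logits, and all names here are synthetic, not the actual model): a per-segment fraud base rate (prior) and a per-transaction evidence score are combined additively in logit space, and the correlation between the two components is checked on held-out data.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical test set: segment base rates and evidence-tower logits.
prior_rate = rng.uniform(0.001, 0.05, size=1000)   # per-segment fraud base rate
evidence_logit = rng.normal(0.0, 1.0, size=1000)   # learned transaction evidence

# Additive combination in logit space.
combined_prob = sigmoid(logit(prior_rate) + evidence_logit)

# If the evidence tower only re-learned the prior, the two components would
# correlate strongly; near-zero correlation suggests a distinct corrective signal.
r = np.corrcoef(logit(prior_rate), evidence_logit)[0, 1]
print(round(r, 3))
```

Because the two terms add in logit space, the evidence tower acts as a multiplicative odds adjustment on top of the base rate.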
에그@eggie5·
Hot take: Dog Man is just the Star Wars story but with no universe and poop jokes
에그@eggie5·
We're doing a variant of this at Adyen for shopper linking: dense retrieval for identity
에그@eggie5·
1) do BM25 hard-neg mining (DPR paper)
2) verify pos pairs w/ LLM (if possible)
3) full softmax for each pos pair (devil in the details)
At runtime you get amortized LLM inference (could we get most of the way w/ steps 1-2, sampled softmax, and just larger batches??)
dr. jack morris@jxmnop

x.com/i/article/2031…
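Step 3 of the recipe above can be sketched in numpy as an in-batch softmax (InfoNCE-style) contrastive loss, where each query scores its own positive against the other in-batch positives plus its BM25-mined hard negatives. All shapes and names are illustrative assumptions, not any particular codebase:

```python
import numpy as np

def info_nce_loss(q, p, hard_negs, temperature=0.05):
    """Softmax contrastive loss over a batch of positive pairs.

    q: (B, D) query embeddings
    p: (B, D) positive doc embeddings
    hard_negs: (B, K, D) mined hard negatives per query
    """
    B, _ = q.shape
    # Normalize so dot products are cosine similarities.
    qn = q / np.linalg.norm(q, axis=1, keepdims=True)
    pn = p / np.linalg.norm(p, axis=1, keepdims=True)
    hn = hard_negs / np.linalg.norm(hard_negs, axis=2, keepdims=True)

    in_batch = qn @ pn.T                        # (B, B): diagonal = positives
    hard = np.einsum("bd,bkd->bk", qn, hn)      # (B, K): mined hard negatives
    logits = np.concatenate([in_batch, hard], axis=1) / temperature

    # Log-softmax; the target index for row i is its own positive, column i.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(B), np.arange(B)].mean()

rng = np.random.default_rng(0)
loss = info_nce_loss(rng.normal(size=(8, 16)),
                     rng.normal(size=(8, 16)),
                     rng.normal(size=(8, 4, 16)))
print(float(loss))
```

Sampled softmax with larger batches amounts to growing the `in_batch` term while shrinking (or dropping) the mined `hard` term, which is the trade-off the tweet asks about.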

에그@eggie5·
You don't see this detail much in practice, as my first tweet alludes to, but I feel there was a comeback w/ RAG, namely around the hierarchical chunking techniques... arxiv.org/abs/1905.06566
에그@eggie5·
I'm thinking about this a lot lately as I'm working on tabular pretraining, where there's a strong natural hierarchy: fields > payments > entities (e.g. shopper/merchant). HIBERT addresses this same problem in LLMs as they attend over _sentences_ in a doc: tokens > sentences > docs...
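A toy numpy sketch of that HIBERT-style hierarchy (tokens > sentences > docs): self-attention pools tokens into sentence vectors, then a second attention level pools sentence vectors into a doc representation. Everything here is an illustrative assumption, not HIBERT's actual architecture (which uses learned transformer layers at both levels):

```python
import numpy as np

def self_attention(x):
    """Single-head, unparameterized self-attention over x: (N, D)."""
    d = x.shape[1]
    scores = x @ x.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

def encode_doc(doc):
    """doc: (S, T, D) = S sentences of T token embeddings each."""
    # Level 1: attend over tokens within each sentence, pool to one vector.
    sent_vecs = np.stack([self_attention(sent).mean(axis=0) for sent in doc])
    # Level 2: attend over sentence vectors to get a doc representation.
    return self_attention(sent_vecs).mean(axis=0)  # (D,)

rng = np.random.default_rng(0)
doc = rng.normal(size=(5, 12, 32))  # 5 sentences, 12 tokens each, dim 32
doc_vec = encode_doc(doc)
print(doc_vec.shape)
```

For the tabular case the levels would swap in fields, payments, and entities for tokens, sentences, and docs.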
에그@eggie5·
Interesting that LLM pretraining corpora originally came from documents like web pages, articles, or books, but the only structure learning sees is sentence boundaries. This gravely assumes this natural context is irrelevant _or_ that in practice it's recovered at scale (emergent)...
에그@eggie5·
Didn't get the ICLR accept, but pretty proud of the reviews, given the rush job -- I wrote most of it during the sessions at NeurIPS in December :)
에그@eggie5·
the proverbial last (skewed) task!
에그@eggie5·
I love the window into Headstone's mania and how Wrayburn goads him into insanity
에그@eggie5·
The Eugene Wrayburn character is basically Skimpole from Bleak House, but with a redemption arc...
에그@eggie5·
Beyond the romance and satire, Our Mutual Friend is a story about the inheritance of a 19th-century recycling startup. Dickens literally started the circular economy...
에그@eggie5·
In theory nice, but in practice, the well-proven mode-collapse phenomenon makes this (increasingly) irrelevant, right?
Jawwwn@jawwwn_

Palantir CTO @ssankar on K-LLMs: Never use 1 LLM when you can use K-LLMs

에그@eggie5·
@deadalnix When you use that voice mode, isn't it super neutered as far as capabilities go? It is in my experience...
deadalnix@deadalnix·
PhD grade intelligence.