Prithivi Da

2.4K posts

@prithivida

Founding CTO @ MeltPlan, 50M+ downloads in 🤗, Cited in NeurIPS, IEEE/CVF, ACL

Bangkok, Thailand · Joined September 2011
475 Following · 1.1K Followers
Arjun Jain | Fast Code AI
No, PM Is Not a GAN. Stop! @SchmidhuberAI's Predictability Minimization (1992) and Ian Goodfellow's GANs (2014) both use adversarial objectives. So does every zero-sum game since von Neumann. That's where the similarity ends. Goodfellow's generator never sees real data. It maps noise to samples and learns the data distribution purely through the discriminator's gradients. That's the whole trick. That's the invention. Schmidhuber's PM does the opposite - both players sit on top of the same real data, competing to learn independent features. It's representation learning. Nothing is generated. No noise is mapped anywhere. No distribution is learned. Calling PM a GAN because both use minimax is like calling chess a war because both have strategy. PM was a smart idea about feature independence. GANs were a breakthrough in implicit generative modeling. These are not the same insight, and retroactively collapsing the distance between them doesn't honor prior work - it misrepresents both.
English
4
8
119
19.8K
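The contrast drawn in the post above can be written out as two objectives. This is only a sketch: the GAN objective uses the usual notation ($G$, $D$, $p_{\text{data}}$, $p_z$), while the predictability-minimization objective is shown in a simplified squared-error form. The symbols $E_\theta$ (encoder), $y_i$ (code unit), $P_i^{\phi}$ (predictor of unit $i$), and $V$ are notation assumed here for illustration and do not appear in the post.

```latex
% GAN: the generator G only ever receives noise z. Real data x reaches
% it solely through the gradients of the discriminator D.
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Predictability minimization (simplified): both players operate on
% codes y = E_\theta(x) computed from real data x.
V(\theta, \phi) =
  \mathbb{E}_{x \sim p_{\text{data}}}
  \Bigl[ \sum_i \bigl( P_i^{\phi}\bigl(\{ y_k \}_{k \neq i}\bigr) - y_i \bigr)^2 \Bigr],
  \qquad y = E_\theta(x)

% Predictors try to predict each code unit from the others ...
\phi^\ast = \arg\min_{\phi} V(\theta, \phi)
% ... while the encoder tries to make its units unpredictable.
\theta^\ast = \arg\max_{\theta} V(\theta, \phi)
```

In the first objective a noise variable $z$ is mapped to samples. In the second there is no noise input and nothing is sampled; the adversarial pressure shapes the features $y_i$ instead.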
Prithivi Da
Prithivi Da@prithivida·
How good is the Surfer 2 agent with H Company models? Anyone?
English
0
0
0
83
Prithivi Da retweeted
Bo Wang
Bo Wang@BoWang87·
Postdoc vs first-year PhD in the lab
English
1
4
83
23.3K
Prithivi Da
Prithivi Da@prithivida·
Prithivi Da@prithivida

Agents: a plausible projection, and 5 vectors to consider before adopting.

A claim making the rounds is that Claude agents will end up in the graveyard, alongside the OpenAI Agents SDK and Google’s earlier efforts. That dismisses all the nuance.

1. The first distinction to keep straight is this: agent stack ≠ agent runtime ≠ agents themselves. Many people collapse these into one category, and that creates confusion.

2. No single agent stack will fit every need. Consumer-grade agents need to behave like appliances, with minimal config fuss. Enterprise-grade agents need governance, controls, and integration depth. Hacker-grade systems such as OpenClaw optimize for flexibility and experimentation. Agent stacks are analogous to programming languages or model families: broad substrates, not universal answers.

3. For consumer-grade agents to pick up adoption, most of the following must be true (in other words, this is why hacker-grade agents won’t magically become consumer-grade). Agent builders need to be separated from agent users. Over time, the market likely needs something closer to an agent store: a place where free and paid agents can be discovered, distributed, and monetized. Like an app store, that ecosystem would work best when paired with both an SDK and a runtime.

4. Agents, in practice, are bounded systems. Most high-value agents cannot sustain unbounded accumulation of skills. That’s where directions like holaboss, a workspace-centric stack, make sense.

5. Seen from that angle, Claude agents are an opinionated, MCP-heavy stack. OpenClaw is closer to a runtime philosophy that treats agents as extensible collections of skills. The distinction matters. MCP is powerful, but also polarizing. It can be verbose, token-expensive, and unpopular with some developers. At the same time, teams that have already invested in MCP endpoints may find the Claude ecosystem attractive, especially if the surrounding tool layer makes orchestration easier.

So Claude agents may not end up in the graveyard as most predict; if anything, they have revived MCP with managed agents. If you want an agent system with bespoke guardrails, recovery logic, safety controls, persona management, and trace-driven learning as first-class features, you will probably have to build it yourself.

0
0
0
224
Jo Kristian Bergum
Jo Kristian Bergum@jobergum·
It’s either just fancy scaffolding around the model, or everyone should build their own harness. Which way, AI engineer?
English
11
0
15
4.1K
Prithivi Da retweeted
Carlos E. Perez
Carlos E. Perez@IntuitMachine·
On Claude Mythos
0:00 - We haven't trained it specifically to be good at cyber, we trained it to be good at code. But as a side effect of being good at code, it's also good at cyber.
0:20 - The model that we're experimenting with is by and large as good as a professional human at identifying bugs.
0:34 - this model is able to create exploits out of three, four, sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome.
0:51 - Obviously, capabilities in a model like this could do harm if in the wrong hands, and so we won't be releasing this model widely.
2:02 - I've found more bugs in the last couple of weeks than I've found in the rest of my life combined.
3:07 - We've spoken to officials across the US government and we've offered to work with them and collaborate to assess the risks of these models and to help defend against the risks of these models.
3:37 - It is essential that we come together and work together across industry to help build better defensive capabilities. No single organization sees the whole picture and can tackle this on their own.
English
43
151
1.3K
215.1K
Bo
Bo@bo_wangbo·
I'm honestly confused: is gte-modern-colbert even multilingual? Isn't jina-colbert-v2 or colbert-xm a better target to compare with? liquid.ai/blog/lfm2-colb…
English
1
0
9
854
Antoine Chaffin
Antoine Chaffin@antoine_chaffin·
@bo_wangbo It isn’t, indeed. That being said, LFM-ColBERT is trained on English-only data; the multilingual performance comes from generalisation thanks to the multilingual backbone. Also, FWIW, we should release a multilingual ColBERT soon-ish.
English
3
0
10
491
Chroma
Chroma@trychroma·
Chroma supports multiple lexical search strategies for keyword-style retrieval. FTS. BM25. SPLADE. Walks through how they differ, and when each one wins.
English
4
6
61
5.9K
Niels Rogge
Niels Rogge@NielsRogge·
Damn, really cool talk by @badlogicgames appeared on my YouTube feed! Lots of alpha regarding building agent harnesses, and why Anthropic cut off access to @opencode and the like
English
3
9
96
5.1K
Antoine Chaffin
Antoine Chaffin@antoine_chaffin·
It actually did a bit, less than it should have imho (because this is probably one of my favourite pieces of work), but I actually saw a lot of usage and people very happy about it! Unfortunately, the release back then went less viral than usual on Twitter, which did not help, but it seems to be starting to be widely used, especially the edge models!
English
1
0
1
59
Antoine Chaffin
Antoine Chaffin@antoine_chaffin·
I want to deeply thank everyone who attended. It is absolutely insane how many people there were, and also how cracked and friendly everyone was. On a more personal note, I was so happy to see all of the work that was enabled by all of our efforts (ModernBERT, Ettin, PyLate). It is the whole reason we are doing this, so I got emotional. Thank you so much.
Antoine Chaffin@antoine_chaffin

There are actually so many insane people attending the workshop that I would not even dare start the list. It's going to be super cool!!

English
5
5
48
4K
Ben (no treats)
Ben (no treats)@andersonbcdefg·
the weirdest part of getting old is that one day you wake up and all the people you know from college are suddenly famous
English
3
0
37
2K
Prithivi Da
Prithivi Da@prithivida·
@lateinteraction @TheSeriousProg Correct, they are, but they are part of the ecosystem and they have the users, so at some level we depend on them, unless we band together and create a new optimised DB for multi-vec support, no?
English
1
0
0
65
Omar Khattab
Omar Khattab@lateinteraction·
@TheSeriousProg Ah many of these companies offer bad services because they don’t want to use the custom stack that multi-vector models need. They try to reuse their single-vector stack. Horrible decision and indeed it leads to extreme costs. But it’s just self-inflicted.
English
1
0
0
73
Omar Khattab
Omar Khattab@lateinteraction·
overwhelming evidence for late interaction / multi-vector models yet again :-) > even after finetuning, single-vector models lag far behind multi-vector embeddings, which achieve significant performance gains and exhibit greater robustness to catastrophic forgetting.
Sumit@_reachsumit

On Strengths and Limitations of Single-Vector Embeddings
Microsoft shows that dimensionality alone cannot explain poor retrieval performance of single-vector embeddings, identifying domain shift and the "drowning in documents" paradox as key factors.
📝 arxiv.org/abs/2603.29519

English
4
7
89
8.4K
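For readers new to the single-vector versus multi-vector distinction in the posts above, here is a minimal toy sketch in plain Python. It assumes ColBERT-style MaxSim scoring for late interaction and mean pooling for the single-vector case. The function names and the tiny 2-dimensional vectors are made up for illustration and are not taken from PyLate, ColBERT, or the cited paper.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_pool(vectors: List[Vector]) -> Vector:
    """Collapse per-token vectors into one vector by averaging each dimension."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def single_vector_score(query_vec: Vector, doc_vec: Vector) -> float:
    """Single-vector retrieval: one dot product between pooled vectors."""
    return dot(query_vec, doc_vec)


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late interaction (MaxSim): for each query token, keep its best-matching
    document token, then sum those maxima over the query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Hypothetical toy data: two query tokens, two documents of two tokens each.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0]]   # matches each query token exactly
doc_b = [[0.5, 0.5], [0.5, 0.5]]   # only a blurry average of both

# After pooling, both documents look identical to the query (0.5 each) ...
pooled_a = single_vector_score(mean_pool(query), mean_pool(doc_a))
pooled_b = single_vector_score(mean_pool(query), mean_pool(doc_b))

# ... while MaxSim still separates them (2.0 versus 1.0).
late_a = maxsim_score(query, doc_a)
late_b = maxsim_score(query, doc_b)
```

The toy case is chosen so that pooling erases the token-level difference between the two documents. It says nothing about real-world benchmark gaps, and the cost of keeping one vector per token is the storage and serving overhead discussed in the neighbouring replies.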
Antoine Chaffin
Antoine Chaffin@antoine_chaffin·
It’s funny because it’s highlighting a few key points of our discussions lately. It’s not *only* about the dimension, and yes, it’s always very much possible to fine-tune your model to become better at the task… But having to do it for every task is a pain, especially if your model forgets the others every time.
English
2
0
3
525
Prithivi Da
Prithivi Da@prithivida·
I shit you not, someone said to me “this is 100% organic code and 5 hours of coding”.
English
0
0
0
127