Antaripa Saha
@doesdatmaksense

5.3K posts

building @specstoryai | consulting companies in applied ai | doing maths in my free time

Joined November 2018
431 Following · 15.3K Followers
Antoine Chaffin @antoine_chaffin
We've almost saturated BrowseComp-Plus with a 150M model... ...but this was an old model and I had a lot of ideas to improve the results 🙁 So maybe it's time to kick off a new challenge and see what the cheapest setup is that we can solve BrowseComp-Plus with?
Antoine Chaffin tweet media
Antoine Chaffin @antoine_chaffin

BrowseComp-Plus, perhaps the hardest popular deep research task, is now solved at nearly 90%... ...and all it took was a 150M model ✨ Thrilled to announce that Reason-ModernColBERT did it again and outperforms all models (including models 54× bigger) on all metrics

5 replies · 5 reposts · 28 likes · 6.5K views
Antaripa Saha reposted
SpecStory @specstoryai
Cursor, Codex and Claude Code are all single-player. Your whole team builds alone and no one knows what anyone else decided. But building product is a team sport. AI should be too. The conversations, decisions, specs and builds. All of it, together, with your whole team. Launching soon → somehow.sh
3 replies · 5 reposts · 30 likes · 13.4K views
Antaripa Saha @doesdatmaksense
@jobergum keeping the wording aside, the problem statement mentioned in the post seemed real to me, although i am yet to test whether what they claim hydradb solves is actually true. but if you have a different opinion on that, would love to hear it
2 replies · 0 reposts · 5 likes · 949 views
Jo Kristian Bergum @jobergum
Why do people have to write dumb stuff like this when announcing a raise on X?
Jo Kristian Bergum tweet media
34 replies · 3 reposts · 106 likes · 16K views
Mixedbread @mixedbreadai
Introducing Mixedbread Wholembed v3, our new SOTA retrieval model across all modalities and 100+ languages. Wholembed v3 brings best-in-class search to text, audio, images, PDFs, videos... You can now get the best retrieval performance on your data, no matter its format.
Mixedbread tweet media
35 replies · 121 reposts · 945 likes · 184.3K views
Antaripa Saha @doesdatmaksense
@contextkingceo the problem statement is very real and i have faced such issues during multiple client projects, would love to test out hydradb. congratulations, Nishkarsh!
1 reply · 0 reposts · 7 likes · 1.5K views
Nishkarsh @contextkingceo
We've raised $6.5M to kill vector databases.

Every system today retrieves context the same way: vector search that stores everything as flat embeddings and returns whatever "feels" closest. Similar, sure. Relevant? Almost never. Embeddings can't tell a Q3 renewal clause from a Q1 termination notice if the language is close enough.

A friend of mine asked his AI about a contract last week, and it returned a detailed, perfectly crafted answer pulled from a completely different client's file. Once you're dealing with 10M+ documents, these mix-ups happen all the time. VectorDB accuracy goes to shit.

We built @hydra_db for exactly this. HydraDB builds an ontology-first context graph over your data, maps relationships between entities, understands the 'why' behind documents, and tracks how information evolves over time. So when you ask about 'Apple,' it knows you mean the company you're serving as a customer. Not the fruit. Even when a vector DB's similarity score says 0.94.

More below ⬇️
613 replies · 661 reposts · 5.9K likes · 3.8M views
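[Editor's note] The pitch above contrasts flat embedding similarity with entity-aware retrieval. A minimal toy sketch of that contrast (all names, vectors, and the filtering scheme are invented for illustration; this is not HydraDB's actual, non-public implementation):

```python
# Toy sketch: flat nearest-neighbour retrieval vs. the same search
# restricted first by a small "context graph" of document -> entity edges.
# All data here is fabricated to make the disambiguation failure visible.

import numpy as np

docs = {
    "q3_renewal":     {"entity": "ClientA", "vec": np.array([0.9, 0.1])},
    "q1_termination": {"entity": "ClientB", "vec": np.array([0.88, 0.15])},
}

def flat_search(query_vec, docs):
    # plain cosine similarity over every document, regardless of entity
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(docs, key=lambda d: cos(query_vec, docs[d]["vec"]))

def graph_filtered_search(query_vec, docs, entity):
    # restrict candidates to documents linked to the right entity, then rank
    subset = {d: v for d, v in docs.items() if v["entity"] == entity}
    return flat_search(query_vec, subset)

# query about ClientA's renewal, but phrased like a termination notice
q = np.array([0.87, 0.16])
print(flat_search(q, docs))                       # similar wording wins: q1_termination
print(graph_filtered_search(q, docs, "ClientA"))  # entity filter wins: q3_renewal
```

The point of the sketch: both documents score above 0.99 cosine similarity against the query, so wording alone picks the wrong client's file; a structural filter applied before ranking fixes it.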
Siva Reddy @sivareddyg
LLM2Vec-Gen represents a major paradigm shift for embeddings/retrieval. Why encode the query when the LLM already knows what to look for and can directly produce an embedding for it? Best part: it's self-supervised, and it does all of this while the LLM remains completely frozen.

Think about it: "solve x² + 3x − 4 = 0" has zero reasoning in it. But the LLM's response does. By encoding the response, the embedding captures the reasoning, and the better the LLM reasons, the better the embedding. This is why our results scale with model size. As LLMs get smarter, our embeddings automatically get better.

LLM2Vec-Gen is also the first demonstration of the promise of @ylecun's JEPA for text embeddings. The alignment loss is JEPA: predict in representation space, not token space. The reconstruction loss goes beyond that: it keeps embeddings decodable.

This paradigm shift opens new frontiers:
🔬 Can we build a full JEPA for language where the teacher and student are the same LLM?
⚡ Can LLMs reason in compressed space without ever generating text?
🤖 Can agents reason in compression tokens and carry that directly into retrieval?
💬 Can agents talk to each other in compression tokens instead of text: dense, fast, and still human-readable?

LLM2Vec-Gen is a first step toward all four.
Siva Reddy tweet media
Vaibhav Adlakha @vaibhav_adlakha

Your LLM already knows the answer. Why is your embedding model still encoding the question? 🚨Introducing LLM2Vec-Gen: your frozen LLM generates the answer's embedding in a single forward pass — without ever generating the answer. Not only that, the frozen LLM can decode the embedding back into text. 🏆 SOTA self-supervised embeddings 🛡️ Free transfer of instruction-following, safety, and reasoning

7 replies · 27 reposts · 172 likes · 21.4K views
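[Editor's note] The thread describes two losses: a JEPA-style alignment loss computed in representation space, and a reconstruction loss that keeps embeddings decodable. A toy numeric sketch of that shape (stand-in tensors and loss forms invented for illustration; the actual LLM2Vec-Gen objective and architecture are defined in the paper):

```python
# Toy sketch of the two-loss shape the thread describes.
# Alignment: error between a predicted and a target embedding (representation
# space, not token space). Reconstruction: cross-entropy of decoding the
# embedding back toward the response tokens. All tensors are fabricated.

import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (arbitrary for the sketch)

def alignment_loss(predicted_emb, target_emb):
    # JEPA-style: mean squared error between embeddings
    return float(np.mean((predicted_emb - target_emb) ** 2))

def reconstruction_loss(decoded_logits, response_token_ids):
    # cross-entropy of the decoder's logits against the response tokens
    probs = np.exp(decoded_logits) / np.exp(decoded_logits).sum(axis=-1, keepdims=True)
    picked = probs[np.arange(len(response_token_ids)), response_token_ids]
    return float(-np.mean(np.log(picked)))

# hypothetical model outputs
predicted = rng.normal(size=d)      # embedding predicted from the *query*
target = predicted + 0.01           # embedding of the actual *response*
logits = rng.normal(size=(4, 16))   # decoder logits over a 16-token vocab
tokens = np.array([3, 7, 1, 9])     # response token ids

total = alignment_loss(predicted, target) + reconstruction_loss(logits, tokens)
```

The design point the sketch mirrors: the alignment term never touches tokens at all, while the reconstruction term is what forces the embedding to remain decodable back into text.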
himanshu @himanshustwts
i guess it's time to be @droid-pilled
3 replies · 0 reposts · 21 likes · 13.3K views
himanshu @himanshustwts
If you are Claude Code/Opus 4.6-pilled, this might sound crazy to you, but CC is the worst harness for Opus 4.6, with an accuracy of 58%. Thank you for your attention to this matter.
himanshu tweet media
57 replies · 22 reposts · 719 likes · 175.3K views
Antaripa Saha @doesdatmaksense
i am kind of tired of all the new updates in ai, harnesses, coding agents, etc., and also of my feed being full of different articles (both good and bad ones). time to start solving math problems again to get some peace of mind from the ai rush
0 replies · 0 reposts · 15 likes · 609 views
Mritunjay Sharma @mritunjay394
@bcherny Till now I can manually create a markdown file and ask Claude to update that markdown eventually, but I think a more intelligent and easier way could be useful?
3 replies · 0 reposts · 6 likes · 4.5K views
Boris Cherny @bcherny
Released today: /loop

/loop is a powerful new way to schedule recurring tasks, for up to 3 days at a time.

e.g. "/loop babysit all my PRs. Auto-fix build issues and when comments come in, use a worktree agent to fix them"
e.g. "/loop every morning use the Slack MCP to give me a summary of top posts I was tagged in"

Let us know what you think!
572 replies · 845 reposts · 12.9K likes · 2.1M views
Ishan Dutta @ishandutta0098
hi bangalore,
> i am planning to host - bangalore ai agent summit in april
> no this isn't just a startup showcase
> i want individual builders to come and show how to build an agent which isn't just another vibe coded project
> i need builders, volunteers and sponsors to help me execute this - reply under this tweet and dm me
> retweet for karma!
——
$whoami
> mle II at @Adobe | ex nvidia
> running ai agent cohorts at outskill (backed by @sequoia) with @VaibhavSisinty - 10M+ learners - 160+ countries
> built the most reliable open source python package to detect human ai deepfakes - mukh - 11k+ downloads on pip
> former deep learning researcher at @Rephrase_AI (backed by @LightspeedIndia and acquired by @Adobe) - i built @waitin4agi_'s first ai avatar clone at rephrase
> competitive machine learning hackathons with top ranks on @kaggle, @AnalyticsVidhya, @JoinMachinehack
> national rank 12 amongst 10k+ competitors in @amazon ml challenge
and more….
——
this isn't a meetup, it's a statement of skill.
57 replies · 32 reposts · 264 likes · 18K views
Antaripa Saha @doesdatmaksense
in blr after almost a year. will be here for 4-5 days ;)
Antaripa Saha tweet media
2 replies · 0 reposts · 20 likes · 1.1K views
Pratim🥑 @BhosalePratim
Life update: Said yes to the most ambitious and promising role of my life. It’s time to grind.
45 replies · 2 reposts · 408 likes · 25.7K views
Antaripa Saha @doesdatmaksense
i have checked the windsurf code inside extract_windsurf.py, did you test it on windsurf conversations? we have tried to extract conversations in a standard format for windsurf at @specstoryai, but windsurf uses a proprietary binary format and has its own schema which is not public; the data is also primarily cloud-synced via Codeium servers. i see your current implementation is based on the assumption that windsurf will be similar to vscode given it was initially a fork, so the implementation probably also assumed that, like vscode (and cursor), it stores some editor state in SQLite databases. more details here: github.com/Exafunction/co… nonetheless very interested to see if you were able to extract the windsurf conversations successfully.
0 replies · 0 reposts · 2 likes · 585 views
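[Editor's note] The VS Code-style assumption discussed above is that editor state lives in a SQLite file (`state.vscdb`) with a simple `ItemTable(key, value)` schema. A minimal sketch of that access pattern (the key name `chat.sessions` and the sample payload are invented; as the tweet notes, Windsurf's real storage is proprietary and largely cloud-synced, so this pattern may not apply to it at all):

```python
# Sketch of reading a VS Code-fork state.vscdb: a SQLite database whose
# ItemTable maps string keys to (often JSON) values. We build a stand-in
# database first so the access pattern is runnable end to end.

import json
import sqlite3
import tempfile

def read_item_table(db_path):
    # return every key/value row from the ItemTable of a state.vscdb file
    with sqlite3.connect(db_path) as conn:
        return dict(conn.execute("SELECT key, value FROM ItemTable"))

# create a stand-in state.vscdb with one hypothetical conversation key
with tempfile.NamedTemporaryFile(suffix=".vscdb", delete=False) as f:
    db_path = f.name
with sqlite3.connect(db_path) as conn:
    conn.execute("CREATE TABLE ItemTable (key TEXT PRIMARY KEY, value BLOB)")
    conn.execute(
        "INSERT INTO ItemTable VALUES (?, ?)",
        ("chat.sessions", json.dumps([{"role": "user", "text": "hello"}])),
    )

state = read_item_table(db_path)
print(json.loads(state["chat.sessions"]))  # [{'role': 'user', 'text': 'hello'}]
```

If Windsurf keeps its conversations server-side or in a binary format, a reader like this would find the SQLite file but not the chat data, which is exactly the failure mode the tweet raises.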
0xSero @0xSero
Lame, incredibly lame. Let me make your life better:
1. Go install this
2. Run it
3. Get all your Claude conversations output as jsonl (training data)
Get a model to strip the ENVs or private info. Share the dataset. We can build it ourselves github.com/0xSero/ai-data…
Anthropic @AnthropicAI

We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.

40 replies · 89 reposts · 1.4K likes · 162.1K views
Antaripa Saha @doesdatmaksense
@A_K_Nain you should definitely start posting on linkedin. your annotated papers are considered a goldmine, people should know that!
1 reply · 0 reposts · 3 likes · 401 views
Aakash Kumar Nain @A_K_Nain
Talking to a group of people last week, I realized how ill-informed the newcomers in ML are. Also, I figured that if I were as active on LinkedIn as I am on X and Github, people would have known who they were talking to.
2 replies · 0 reposts · 24 likes · 2.3K views
Antaripa Saha @doesdatmaksense
@himanshustwts all the best himanshu, lots of good opportunities on the way!
1 reply · 0 reposts · 4 likes · 1.4K views
himanshu @himanshustwts
Some update: I have decided to step down from my roles at Upsurge Labs. It was quite a ride building products and creating mindshare. Nothing but love to my former team and the incredible people I worked with!
97 replies · 4 reposts · 600 likes · 48.8K views