Himanshu Tyagi

2.3K posts

@hstyagi

Building open-source AGI @SentientAGI

Joined January 2015
585 Following · 291.7K Followers
Himanshu Tyagi retweeted
Oleg Golev @oleg_golev
This is precisely why I'm excited about sentient.xyz/arena. The goal is to crowdsource as many different solutions as possible for the hardest AI reasoning challenges. The solution space is now so vast that we need large volume and evolutionary algorithms to help us explore it in parallel.
Andrej Karpathy @karpathy

The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms. Git(Hub) is *almost* but not really suited for this. It has a softly built in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later. I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run: github.com/karpathy/autor… Alternatively, a PR has the benefit of exact commits: github.com/karpathy/autor… but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits. But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back. I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.

3
5
46
4.7K
Himanshu Tyagi retweeted
Sentient @SentientAGI
Applications are now live! Cohort 0 starts March 13th in Presidio with OpenHands, OpenRouter, alphaXiv, Fireworks, Dedalus Labs, Franklin Templeton, Founders Fund and Pantera.
→ $25K+ in prizes
→ 3 weeks building state-of-the-art AI agents
→ Many more surprises
Apply below 👇
126
66
512
124.9K
Neil Tripathi @tripathi_neil
Just submitted my first paper to arXiv, and I found something that fits the growing conversation around newer models hedging their bets more and more. VB: Visibility Benchmark - checks if vision-language models can apply common-sense reasoning to determine what's actually visible in a photo. Joint work with Ernest Davis at NYU. 9 models tested: GPT-5, GPT-4o, Gemini 3.1 Pro, Gemini 2.5 Pro, Claude Opus 4.5, Claude 3.7 Sonnet, Gemma 3 12B, InternVL3-8B, and Qwen3-VL-8B. 100 image families, 300 evaluation cells each.
15
23
113
11.6K
Himanshu Tyagi retweeted
Sentient @SentientAGI
Today we are launching the next phase of AI reasoning development with Founders Fund, Franklin Templeton, Pantera Capital, Fireworks AI, OpenRouter, OpenHands, Dedalus Labs, alphaXiv, and more. AI is advancing at a relentless pace, but there are many reasoning capabilities we have yet to discover. Announcing Arena—an evaluation-driven platform for ideation, prototyping, and high-quality data generation—with top AI developers advancing SOTA performance on real-world enterprise reasoning tasks.
109
82
446
263.1K
Neil Tripathi @tripathi_neil
Had a great conversation with Professor Charles Elkan the other day about AI agents. One thing he said that stuck with me: the argument that “models will just get better and absorb everything” is actually a paradox. If we take that to its logical conclusion, there’s no point building anything on top of models, whether that’s orchestration, agents, or tooling. But obviously that’s not true. There’s real value in the systems we build around models, not just the models themselves.
2
3
8
855
Himanshu Tyagi retweeted
Sentient @SentientAGI
A quick, nostalgic look back at our work in 2025. See you all in 2026: the year of open-source reasoning.
232
91
676
85.9K
Himanshu Tyagi retweeted
Oleg Golev @oleg_golev
Building a general-purpose AI agent with only open-source models is hard. Making it consistent, reliable, and fast enough for production usage is even harder. We at @SentientAGI have been optimizing both 👇
Today we’re revealing SERA (Semantic Embeddings & Reasoning Agent): the AI architecture behind SERA-Crypto, our state-of-the-art agent for token research, DeFi analysis, and on-chain reasoning, combining 50+ APIs into market insights.
👉 #1 open-source agent on DMind, ahead of Perplexity Finance & Gemini, within ~2% of GPT-5 Medium on Web3 reasoning
👉 #1 on our live crypto benchmark (198 real user queries across 11 categories), beating GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance
More in 🧵
Sentient @SentientAGI

Announcing SERA-Crypto (Semantic Embedding & Reasoning Agent): our new reasoning architecture built for SOTA crypto research.
#1 open-source agent on DMind
#1 on our live crypto benchmark
Outperforms GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance…all under 45 seconds.

79
13
165
8.3K
Himanshu Tyagi @hstyagi
When you want fast reasoning, good old semantic similarity is not bad. Use it to set up your prompts dynamically, all the way to the right tool call. This is what we use for our live crypto knowledge agent, which integrates search and about 10 different structured data APIs.
Sentient @SentientAGI

Announcing SERA-Crypto (Semantic Embedding & Reasoning Agent): our new reasoning architecture built for SOTA crypto research.
#1 open-source agent on DMind
#1 on our live crypto benchmark
Outperforms GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance…all under 45 seconds.

39
2
109
4.1K
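
The idea in the post above, using semantic similarity to pick the prompt and the tool call, can be sketched roughly as follows. This is a minimal illustration, not the SERA implementation: the tool names and descriptions are hypothetical, and a bag-of-words vector stands in for a real embedding model so that the example runs with the standard library only.

```python
import math
import re
from collections import Counter

# Hypothetical tool registry: names and descriptions are illustrative only.
TOOLS = {
    "web_search": "search the web for recent news and general information",
    "token_price": "current price market cap and trading volume of a token",
    "defi_tvl": "total value locked and yield for a defi protocol",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def route(query: str) -> str:
    """Return the name of the tool whose description is most similar to the query."""
    q = embed(query)
    return max(TOOLS, key=lambda name: cosine(q, embed(TOOLS[name])))


def build_prompt(query: str) -> str:
    """Assemble the prompt dynamically around the selected tool."""
    tool = route(query)
    return f"Use the `{tool}` tool ({TOOLS[tool]}) to answer: {query}"


if __name__ == "__main__":
    print(build_prompt("what is the current price of the token"))
```

In a real system, `embed` would call an embedding model and the tool descriptions would be embedded once and cached, so routing costs a single embedding call plus a nearest-neighbor lookup rather than an extra LLM round trip.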
Himanshu Tyagi @hstyagi
@bdguan It is such a beautiful subject. Only friends and parents have the patience to indulge in it. Schools are busy teaching.
0
0
2
506
brian @bdguan
all my life i've been told that i'm naturally gifted at math because i'm chinese. straight A student. math major at ucla. but here's what people don't know: when i was in 8th grade, i got a B+ in geometry. my dad said "that's unacceptable", bought a geometry textbook, and proceeded to assign me daily problems for 6 months. then i got good at geometry. i wasn't born gifted at math (ok maybe a little), i just grew up in an environment where being good at math was a requirement. this book is filled with extra math problems my dad assigned me. and i hated him for it. it took me 10+ years to realize how thankful i am that he pushed me like that. how that was simply his love language.
127
397
6.8K
1.5M
Himanshu Tyagi @hstyagi
@deedydas Yann was pretty famous in 2010 :D And yes Soumith is a legend!
0
0
4
6.4K
Deedy @deedydas
If you feel like giving up, you must read this never-before-shared story of the creator of PyTorch and ex-VP at Meta, Soumith Chintala.
> from hyderabad public school, but bad at math
> goes to a "tier 2" college in India, VIT in Vellore
> rejected from all 12 universities for US masters despite 1420 on the GRE
> fuckit.jpg
> goes to the US anyway on a J-1 visa to CMU with no plan
> applies for masters (again) to 15 universities
> rejected from all except USC and with late admissions, NYU in 2010
> finds this guy called Yann LeCun (before he was famous)
> starts getting into open source
> rejected from all jobs including DeepMind
> only job is Amazon as test engineer
> his PhD mentor helps him get a job at a small startup (MuseAmi)
> rejected from DeepMind
> couldn't get H-1B because of J-1 home return issue; gets waiver through months of approval with USCIS and US State Dept
> very low on confidence
> In 2011/12 builds one of the fastest AI inference engines on phones
> rejected from DeepMind
> emailed Yann again and joins FAIR because of Torch7 open-source work
> scrapes through bootcamp at Facebook, struggling on an HBase task
> L8/L9 engineers at Facebook struggle to get ImageNet working
> figures out numerics / hyperparam issue as an L4
> first big win!
> FAIR goes well, runs 3 person torch7 team and co-creates PyTorch
> because of politics, management wants to shut down PyTorch
> cries-at-bar.jpg, literally
> eventually some people save PyTorch and it launches in 2017
> gets an EB-1 green card!
> the rest is history...
Think about that. He went to a tier 2 college. Was rejected from all Masters programs 2x. Rejected from every single job except Amazon test engineering. Rejected from DeepMind 3x. Nearly had his baby project shut down. Struggled with visa issues. After 12 years of failures (2005-17), he eventually rose to become a VP at Meta and one of the most influential people in AI!
Soumith's story is one of resilience and he's living proof that no matter how down in the dumps you are, there's always hope.
285
1.3K
11.2K
2M
Himanshu Tyagi @hstyagi
If diffusion models drive all creative arts, we will learn that humans are not more creative than a kettle dissipating heat to boil water. A bit sad...
240
11
225
8.2K
Himanshu Tyagi @hstyagi
@abeirami It is a blessing and a burden! You keep on wishing that heuristics derived from beautiful, beautiful geometric insights give the best algorithms :)
0
0
1
373
Ahmad Beirami @abeirami
Once you see a math concept geometrically, it becomes much easier to think about, and it’s hard to go back to any other way of seeing it.
24
26
359
17.1K
Himanshu Tyagi @hstyagi
ROMA is a very simple and versatile architecture that recursively breaks complex queries into simpler ones. This method of coordinating multiple agents/tools/models is apt for deep research, long-horizon tasks, and boosting the power of models. It is emerging as an important primitive for multi-agent reasoning systems across industries. The new version of the repo is more builder-friendly and comes with the prompt-optimizer capabilities of DSPy. You can build a lot of stuff on it!
Salah Alzu'bi @salahalzubi401

[1/8] 🧵 🚀 ROMA (Recursive Open Meta Agents) v0.2.0 is here! Many exciting features have been added to streamline research/production threads, for better reliability and a builder-friendly ecosystem for high-performance recursive multi-agent systems. Stay tuned for the upcoming paper with some exciting results! We've completely rebuilt our framework using @DSPyOSS. In this thread: the motivation and technical details behind ROMA, exciting research directions we're exploring, and our vision for recursive agents going forward. github.com/sentient-agi/R…

256
22
396
42.5K
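
The recursive pattern described in the post above (break a complex query into simpler ones, solve the leaves, combine the results) can be sketched generically as below. This is not ROMA's API: the `solve` function, its parameters, and the toy `is_atomic` / `plan` / `execute` / `aggregate` helpers are all hypothetical names chosen for illustration, and in practice each helper would typically be backed by a model or a tool call.

```python
from typing import Callable, List


def solve(
    task: str,
    is_atomic: Callable[[str], bool],
    plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    aggregate: Callable[[str, List[str]], str],
    depth: int = 0,
    max_depth: int = 3,
) -> str:
    """Recursively decompose a task, solve the leaves, and merge the results."""
    # Stop recursing when the task is simple enough or the depth budget is spent.
    if depth >= max_depth or is_atomic(task):
        return execute(task)
    # Otherwise split the task and solve each subtask with the same procedure.
    results = [
        solve(sub, is_atomic, plan, execute, aggregate, depth + 1, max_depth)
        for sub in plan(task)
    ]
    return aggregate(task, results)


# Toy helpers (assumptions for the demo): a task is "complex" if it contains " and ".
def is_atomic(task: str) -> bool:
    return " and " not in task


def plan(task: str) -> List[str]:
    # Split off only the first conjunct so deeper splits happen recursively.
    return task.split(" and ", 1)


def execute(task: str) -> str:
    return task.upper()


def aggregate(task: str, results: List[str]) -> str:
    return " + ".join(results)


if __name__ == "__main__":
    print(solve("a and b and c", is_atomic, plan, execute, aggregate))
```

The `max_depth` cap is a design choice worth keeping in any real version: it bounds cost and prevents a planner that never declares a task atomic from recursing indefinitely.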
Himanshu Tyagi retweeted
Sentient @SentientAGI
We’re excited to announce that @NeurIPSConf, the biggest AI conference in the world, has accepted 4 of our papers across various categories. Some might even call it “full-stack excellence” 😁
Here’s a sneak peek at our work that’s been recognized for its breakthroughs:
➡️ OML 1.0 (Main Track): scalable LLM fingerprinting, a hundredfold improvement on legacy fingerprinting attempts for open models, injecting 24,576 persistent fingerprints while the previous max was ~100 fingerprints…without any drop in model performance.
➡️ LiveCodeBenchPro (Data & Benchmark Track): our customized benchmark focusing on programming ability, illustrating the true capabilities of models’ coding performance. On this benchmark, we were able to create models 10x smaller, using 20% of the data, to achieve comparable results to competing models.
➡️ MindGames Arena (Competition Track): selected by NeurIPS to run an AI competition for agents to improve themselves through social games. The next paradigm of AI improvement comes through self-optimization, and we’re extremely excited to be hosting this first-of-its-kind competition to create self-improving AI.
➡️ OML (Workshops & Tutorials, Lock-LLMs): our work established the challenge and solution around model security: a primitive that lets builders develop open models with verifiable, cryptographically enforced control under white-box access.
Stay tuned for deep-dive threads throughout the week!
949
309
1.8K
615.7K
Sentient @SentientAGI
Meet @sandeepnailwal, Sentient's Co-Founder and professional hot dog eater (allegedly). Drop a 🌭 if you want to see him prove he can eat 5 hot dogs in under a minute.
909
254
1.6K
138.7K
Andy @andyyy
Who is the best marketer in all of crypto??? Looking for some inspiration on this fine Monday evening...
69
3
115
20.2K
Himanshu Tyagi @hstyagi
@natolambert We use it for building ReAct agents. It is good for planning and a well-balanced model.
0
0
0
198
Nathan Lambert @natolambert
Who's using GPT-OSS and for what? Was it cheaper, better, faster than other open models? Or just not from China? Download numbers are actually very strong on HuggingFace for first model releases.
124
64
633
125.3K