Yajan

101 posts

@yajan0

19 | AI + physics

Joined October 2020
72 Following · 13 Followers
Yajan
Yajan@yajan0·
@paulg people who actually know what the best is tend to ask quietly and very specifically
0
0
0
943
Paul Graham
Paul Graham@paulg·
It's a rough combination when people are simultaneously overreaching and uninformed. They want the best of everything, but they don't know what the best is, so their demands are simultaneously strident and random, like a set of vectors with large magnitudes and random directions.
144
103
1.8K
106.6K
Yajan
Yajan@yajan0·
AI agents are wild
gave one 3 hours of research work
it finished in 20 minutes
i spent the next 40 minutes verifying it didn't hallucinate
so basically i have a very fast intern who confidently lies sometimes and i can't fire them
0
0
0
33
Yash
Yash@yashhq_22·
Hot take: claude is still better than codex at coding if you just excuse the token limits.
150
3
284
23.5K
Yajan
Yajan@yajan0·
entropy explains AI hallucinations better than "it's a training problem" ever will.
the physics: entropy = systems drift toward maximum disorder unless something actively fights it
every token a model generates pushes the probability distribution one step further from the signal
🟢 first 50 tokens: model is anchored to reality
🟢 first 200 tokens: mostly coherent
🔴 beyond that: you are watching entropy compound in real time
hallucinations are not a data problem
they are not an alignment problem
they are the second law of thermodynamics
researchers keep throwing RLHF and bigger datasets at it
you cannot fine-tune your way out of a law of the universe
0
0
0
34
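The compounding claim in the entropy post above can be made concrete with a toy calculation. This is only a sketch under a strong assumption that is not in the post: each token independently goes wrong with the same small probability. The function name `prob_still_on_track` and the error rate `eps` are hypothetical, chosen for illustration.

```python
def prob_still_on_track(per_token_error: float, n_tokens: int) -> float:
    """Chance that n_tokens in a row are all fine, assuming each token
    independently goes wrong with probability per_token_error.
    The independence assumption is a simplification for illustration."""
    if not 0.0 <= per_token_error <= 1.0:
        raise ValueError("per_token_error must be in [0, 1]")
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    # Independent events multiply, so the survival chance decays geometrically.
    return (1.0 - per_token_error) ** n_tokens


if __name__ == "__main__":
    eps = 0.005  # hypothetical per-token error rate, not a measured value
    for n in (50, 200, 1000):
        print(n, round(prob_still_on_track(eps, n), 3))
```

Under this toy model the chance of staying on track shrinks with every extra token, which is the shape of the argument in the post; how real models behave depends on how far the independence assumption holds.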
Yajan
Yajan@yajan0·
@tibo_maker so the actual strategy is just... be real and be good at what you do? wild how that keeps being the answer?
0
0
0
58
Tibo
Tibo@tibo_maker·
the researchers building LLMs don't fully understand how their own models work
but sure, your 4-step GEO framework has it all figured out 🤣
everyone on the internet has a theory. do FAQ sections. get more citations. add reviews. make your content comprehensive - the list goes on
but nobody can show you results actually attributed to any of it
because attribution doesn't exist yet. and it won't until the people building these models can explain why they surface what they surface 🤷🏻
so what do you actually do with that?
stop chasing GEO hacks ❌
your best bet is to do such a good job from day one that the internet builds a data bank about you
real content pieces, real presence, real mentions - stuff LLMs will naturally pull from
and make sure your SEO foundation is solid
and that's why we and 2,500+ websites are so bullish on Outrank 🚀
10
0
35
3.9K
Arena.ai
Arena.ai@arena·
GPT-5.5 by @OpenAI is now live in the Arena, landing across multiple leaderboards.
Here’s how it ranks by modality:
- Code Arena (agentic web dev): #9, a strong +50pt jump over GPT-5.4
- Document Arena (analysis & long-content reasoning): #6, on par with Sonnet 4.6
- Text Arena: #7, Math #3, Instruction Following: #8
- Expert Arena: #5
- Search Arena: #2
- Vision Arena: #5
Strong, well-rounded performance, especially in Code (+50 pts vs GPT-5.4).
Congrats to @OpenAI on the release.
Full category breakdowns by modality in the thread.
OpenAI@OpenAI

Introducing GPT-5.5
A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done.
Now available in ChatGPT and Codex.

349
131
1.9K
1.4M
Hridoy Rehman
Hridoy Rehman@hridoyreh·
GoDaddy can take your domain:
Hridoy Rehman tweet media
66
5
186
56.6K
Kalshi
Kalshi@Kalshi·
BREAKING: S&P 500 hits new all-time high
97
158
2.1K
154.6K
Silicon Mania
Silicon Mania@siliconmania·
last week in tech was based.
152
458
5.2K
1.2M
Yajan
Yajan@yajan0·
@deepseek_ai openai and anthropic charging 100x more for roughly the same benchmarks. at what point do we stop pretending closed source is worth the premium?
0
0
1
2.9K
DeepSeek
DeepSeek@deepseek_ai·
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: huggingface.co/deepseek-ai/De…
🤗 Open Weights: huggingface.co/collections/de…
1/n
1.6K
7.7K
45.3K
9.7M
Yajan
Yajan@yajan0·
@sama Is it agi?
0
0
0
10
Sam Altman
Sam Altman@sama·
GPT-5.5 is here! We hope it's useful to you. I personally like it.
1.6K
970
19.7K
1.7M
Yajan
Yajan@yajan0·
@thsottiaux the fact that twitter outrage is now the fastest way to get a feature back says everything about how product decisions get made in 2026
0
0
0
1.9K
Tibo
Tibo@thsottiaux·
I don't know what they are doing over there, but Codex will continue to be available both in the FREE and PLUS ($20) plans. We have the compute and efficient models to support it. For important changes, we will engage with the community well ahead of making them. Transparency and trust are two principles we will not break, even if it means momentarily earning less. A reminder that you vote with your subscription for the values you want to see in this world.
Amol Avasare@TheAmolAvasare

For clarity, we're running a small test on ~2% of new prosumer signups. Existing Pro and Max subscribers aren't affected.

581
679
11.8K
1.7M
Yajan
Yajan@yajan0·
anthropic yanked claude code from the pro plan and users lost it. so they brought it back. this is the real product feedback loop now. ship, get yelled at on twitter, revert. agile development in 2026.
1
0
0
70
Kalshi
Kalshi@Kalshi·
BREAKING: Iran officially reopens Strait of Hormuz
714
3.4K
22.2K
3.8M
Claude
Claude@claudeai·
Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.
4.8K
10.2K
81.1K
13.9M
unusual_whales
unusual_whales@unusual_whales·
NEWS: Anthropic is preparing for the release of its new Opus 4.7 model, per the Information
221
164
5.2K
400.2K