Yash

1.4K posts

@yash1_

24. I read research papers and obsess over implementation. @linuxfoundation '23, @Code4GovTech '23. DM for opportunities.

Remote · Joined May 2021
198 Following · 95 Followers
Avi Chawla
Avi Chawla@_avichawla·
As an AI Engineer shipping agents to production, please learn:

- Not every intent needs an agent
- Early stopping over indefinite retries
- Fallback parsers for structured output
- Evals for agent behavior not just output
- Delivery infra that's framework-agnostic
- Provider diversity as a reliability decision
- Model portfolios over single-model stacks
- One agent with good tools over multi-agent
- Cost attribution per feature, not per invoice
- Full-chain tracing, not just endpoint logging
- Deterministic signals before LLM-as-a-judge
- Production traffic repeats. Cache accordingly
- Guardrails as middleware, not per-agent code
- Human-in-the-loop is a design pattern, not a fallback

Most of what blocks agents from going to production isn't the core logic. It's the plumbing around it. That's why most points above focus on agent ops, not just agent dev.

Plano is a 100% open-source infrastructure layer that handles routing, orchestration, guardrails, and observability for agentic apps. I have shared the GitHub repo in the replies.

👉 Over to you: What else would you add here?
19
8
63
9.1K
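Two items from the list above, "early stopping over indefinite retries" and "fallback parsers for structured output", can be sketched together. This is only an illustration under assumptions, not anything from the tweet or from Plano: `call_model`, `parse_strict`, `parse_fallback`, and `call_with_early_stop` are hypothetical names, and the "model" is a stub that returns canned strings.

```python
import json
import re
from typing import Any, Callable, Optional


def parse_strict(text: str) -> dict[str, Any]:
    """Primary parser: the whole reply must be a single JSON object."""
    value = json.loads(text)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value


def parse_fallback(text: str) -> dict[str, Any]:
    """Fallback parser: take everything from the first '{' to the last '}'."""
    match = re.search(r"\{.*\}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found")
    return parse_strict(match.group(0))


def call_with_early_stop(
    call_model: Callable[[int], str],  # hypothetical: attempt index -> raw reply
    max_attempts: int = 3,
) -> Optional[dict[str, Any]]:
    """Try a bounded number of times, then give up instead of retrying forever."""
    for attempt in range(max_attempts):
        reply = call_model(attempt)
        for parser in (parse_strict, parse_fallback):
            try:
                return parser(reply)
            except ValueError:
                continue  # this parser failed; try the next one
    return None  # caller decides what to do (escalate, default, human review)


# Stubbed model: first reply is unusable, second wraps JSON in chatty prose.
replies = ["not json at all", 'Sure! Here you go: {"intent": "refund"} Hope that helps.']
result = call_with_early_stop(lambda attempt: replies[attempt], max_attempts=2)
print(result)
```

Returning `None` after the attempt budget is spent keeps the failure explicit, so the caller can route it to a default path or a human instead of burning tokens on open-ended retries.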
🃏
🃏@sbincx·
If you’re so smart why aren’t you working on something that makes you feel like a moron on a daily basis?
25
11
95
3.7K
Yash
Yash@yash1_·
@zhyncs42 Nice, added to the reading list
0
0
0
109
zhyncs
zhyncs@zhyncs42·
Correctness is critical for LLM inference engines. Recently, I found TRT-LLM’s work on Hypothesis Testing Methodology to be extremely professional. github.com/NVIDIA/TensorR…
2
4
65
2.2K
Jino Rohit
Jino Rohit@jino_rohit·
reading the awq quant paper today. also realized there's tons of research work you can still do in quantization once you have a decent grasp of the general direction of the space
5
4
72
1.6K
Yash
Yash@yash1_·
@brookeleblanc What were their intentions: to test, or to humiliate the other person if they didn't know something?
0
0
0
5
Brooke LeBlanc
Brooke LeBlanc@brookeleblanc·
When I was switching roles late last year I interviewed for a company that wanted me to put everyone I knew in a spreadsheet. And I immediately stopped the process w/ them. My career/life, and all the people in it, is so much bigger than a spreadsheet. For a non-partner, non-cofounder role, never ever do this. Unless you have a life-changing amount of skin in the game and your Rolodex will be completely confidential. Even then. Probably don’t do this. Who you know is your IP.
20
4
196
14.5K
Aritra 🤗
Aritra 🤗@ariG23498·
That is what I want to cover in the first part. I am about to complete my write up, and submit it to my colleagues for a review. Hope they like it! 🤞
3
0
14
479
Yash
Yash@yash1_·
@elonmusk What percentage of Grok users use it for difficult coding tasks?
0
0
0
295
Elon Musk
Elon Musk@elonmusk·
Grok foundation model V9-Medium (1.5T) has finished training. Evals look good. A lot of Cursor data was added in supplementary training and there is more to come. Fine-tuning is underway and reinforcement learning begins in a few days. 2 to 3 weeks to public release. This will be a major improvement over the 0.5T v8-small that currently serves all Grok production traffic, especially for difficult coding tasks.
2.4K
1.9K
17.4K
2.3M
Yash
Yash@yash1_·
@glcst Got the point, but not so great an analogy ig, cause the one who refuses to adopt AI would really be paranoid when the models become much more powerful and fast.
0
0
0
21
Yash
Yash@yash1_·
@ziqi_huang_ It's very useful for robotics, I believe
0
0
1
16
Yash
Yash@yash1_·
@eliebakouch The performance degradation is so much that it's better to use search
0
0
0
21
elie
elie@eliebakouch·
asked 2 questions about the claude desktop app, defaults to haiku 4.5, both wrong answers :(
5
0
13
1.5K
Yash
Yash@yash1_·
@teslaownersSV Pre-optimization is always bad. It's better to build a bad version of something and then optimize it as @elonmusk said.
0
0
0
57
Yash
Yash@yash1_·
@docmilanfar Curious to know what you said to the barista then? xD
0
0
2
1.5K
Peyman Milanfar
Peyman Milanfar@docmilanfar·
can confirm. Google had no category for visiting faculty back then, so we all got (green badge) "interns" status. a barista once asked me if I was too old for an internship 😅
Andrej Karpathy@karpathy

@yash1_ @shreyansj iirc Geoff Hinton’s official title at Google at one point was “intern” :D

8
30
1.5K
151.5K
Yash
Yash@yash1_·
@luke_metro Significantly fewer. It's definitely not a priority, cause the thing that is a priority, a powerful AI, could possibly help solve that problem
0
0
0
259
Luke Metro
Luke Metro@luke_metro·
sometimes I wonder how many TPUs Google Deepmind is currently siccing on solving P vs NP
11
3
176
12.2K
Yash
Yash@yash1_·
@s_batzoglou Surely one of the important ones to date!
0
0
1
29
Yash
Yash@yash1_·
@leafs_s It's really a great paper, I read it the day after it was published.
0
0
0
23
CLaE
CLaE@leafs_s·
From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence

This paper argues that classical information measures such as entropy and Kolmogorov complexity are insufficient for understanding modern AI learning. Entropy mainly measures randomness, while Kolmogorov complexity measures the shortest possible description of data. However, neither fully explains why some datasets are far more useful for training AI systems than others.

To address this, the authors introduce epiplexity, a new concept intended to measure the amount of learnable structured information available to a computationally limited learner.

The key idea is that information is not absolute. Its value depends on:
- the learner’s computational limits,
- the structure and ordering of the data, and
- the transformations used to generate or present the data.

The paper suggests that epiplexity could provide a theoretical framework for:
- understanding why certain training data are more useful,
- designing better datasets,
- improving data generation and augmentation, and
- studying learning efficiency in AI systems.

arxiv.org/abs/2601.03220
6
30
103
5.9K
Yash
Yash@yash1_·
@_aidan_clark_ 3B-4B just for research compute is enough, and almost every big lab has it. I would prefer all of them, and also @ssi xD
1
0
1
2.6K
Aidan Clark
Aidan Clark@_aidan_clark_·
If you want to work on pretraining-for-AGI, join OpenAI, Google, Meta or the Anthropic/XAI/Cursor supergroup. The bitter truth of the widening compute gap is that all the problems which are actually on the critical path to AGI now demand that level of compute.
38
16
777
184.9K
arya
arya@AJakkli·
I really liked Eric's take on why AlphaGo is profound: A 10-layer network can only do 10 sequential steps of thinking, by construction. And yet those 10 steps can "amortize and approximate to very high fidelity a nearly intractable search problem."
Dwarkesh Patel@dwarkesh_sp

Monte Carlo Tree Search training corrects the model move by move, while current LLM training only tells it whether the whole trajectory worked. MCTS is preferable if you can get it. But nobody's managed to get MCTS to work for language models. In his blackboard lecture @ericjang11 talked to me about why:

4
19
376
58.7K