techwarq

370 posts

@TeenCode6

i build products that make things easier

Bengaluru, Karnataka Joined December 2019
219 Following 62 Followers
Ariii Biazzo G
Ariii Biazzo G@AriiiGenua·
@TeenCode6 @rauchg @17 I'm working on something not the same, but similar in some way. We could have a Google Meet and mix ideas. Thank you! Have a nice day :)
1
0
1
26
Guillermo Rauch
Guillermo Rauch@rauchg·
Show me the thing you’ve built with AI you’re most proud of. Reply with a working product URL and what model / agent you primarily used.
2K
153
3K
524.9K
Alp Sariyer
Alp Sariyer@alpsariyer·
@TeenCode6 @rauchg Well, I guess it's a no-brainer to have "no account needed" in the CTA button label that then navigates to the registration page.
1
0
0
20
Priyansh Agarwal
Priyansh Agarwal@Priyansh_31Dec·
Ask me anything. Will try to reply to all.
168
2
206
41.2K
techwarq
techwarq@TeenCode6·
@thrawn01 @rauchg yeah, i’ve actually been working on this problem inside Snips using ideas from research papers to solve real engineering/product challenges, not just basic AI retrieval. do you also mean finding the best research/materials based on the actual problem statement?
1
0
1
24
Derrick Wippler
Derrick Wippler@thrawn01·
@TeenCode6 @rauchg I can typically ask an AI to do this work for me, but what I really would love to have is some sort of agentic search which matches my current problem with several different research papers related to the problem I'm solving.
1
0
0
36
Ariii Biazzo G
Ariii Biazzo G@AriiiGenua·
@TeenCode6 @rauchg It looks great! Let me give some feedback: 1. In my mind your home colors look like a fitness app 2. Also the typo 3. I think that I understood the concept very fast. For example, 1 paper understood easily with flash cards. --- I would change the color, the typo and design
1
0
0
98
Akshay 🚀
Akshay 🚀@akshay_pachaar·
code as agent harness.

a 102-page survey from Stanford, Meta, and UIUC on agent harnesses.

the paper argues that code is no longer just the thing agents produce. it’s the medium through which they reason, act, and represent their environment.

it calls this “code as agent harness” and covers three layers: code as the interface between agents and their tasks; the mechanisms that keep agents reliable over long-horizon execution (planning, memory, tool use, verification); and how multi-agent systems coordinate through shared code artifacts.

core findings:

the paper introduces “evolution agents” that treat the harness itself as the optimization target. they collect telemetry, diagnose failures, propose infrastructure changes, and promote only mutations that pass regression. the harness improves itself.

in multi-agent systems, topology complexity inversely correlates with infrastructure quality. teams with better shared state use simpler coordination. teams without it build increasingly elaborate workarounds.

finally, the paper concludes that future agent systems need four properties:
- executable
- inspectable
- stateful
- governed

read more: arxiv.org/abs/2605.18747

i also published this deep dive (article) on agent harness engineering, covering the orchestration loop, tools, memory, context management, and everything else that transforms a stateless LLM into a capable agent. the article is quoted below.
Akshay 🚀 tweet media
Akshay 🚀@akshay_pachaar

x.com/i/article/2040…

40
110
747
120.1K
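The "promote only mutations that pass regression" step from the survey summary above can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the `Harness` dict, the `Mutation` type, the setting names, and the regression checks are all hypothetical, and the telemetry and diagnosis steps are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical: a harness is modelled as a flat dict of settings.
Harness = Dict[str, int]


@dataclass
class Mutation:
    """A proposed infrastructure change (hypothetical structure)."""
    name: str
    apply: Callable[[Harness], Harness]


def passes_regression(harness: Harness,
                      tests: List[Callable[[Harness], bool]]) -> bool:
    """A candidate harness is acceptable only if every check passes."""
    return all(test(harness) for test in tests)


def evolve(harness: Harness,
           mutations: List[Mutation],
           tests: List[Callable[[Harness], bool]]) -> Tuple[Harness, List[str]]:
    """Try each mutation on a copy; keep it only if regression still passes."""
    promoted: List[str] = []
    for mutation in mutations:
        candidate = mutation.apply(dict(harness))
        if passes_regression(candidate, tests):
            harness = candidate
            promoted.append(mutation.name)
    return harness, promoted


# Illustrative settings and checks (all made up for this sketch).
base = {"max_retries": 1, "context_budget": 4000}
checks = [
    lambda h: h["max_retries"] <= 5,
    lambda h: h["context_budget"] >= 2000,
]
proposals = [
    Mutation("more_retries", lambda h: {**h, "max_retries": 3}),
    Mutation("tiny_context", lambda h: {**h, "context_budget": 500}),
]

final, promoted = evolve(base, proposals, checks)
print(final, promoted)
```

Here the second proposal is rejected because it breaks a regression check, so only the first one changes the harness.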
techwarq
techwarq@TeenCode6·
to continue just go to usesnips.com and understand the whole paper
0
0
0
6
techwarq
techwarq@TeenCode6·
here's how snips made me understand the whole paper
techwarq tweet media
1
0
1
18
Grigory Sapunov
Grigory Sapunov@che_shr_cat·
1/ RLHF practitioners are wasting budget. If you treat dynamics and rewards as a monolithic world model, your data allocation is wrong. Reward models learn ~9x faster than dynamics simulators. 🧵
Grigory Sapunov tweet media
4
26
180
13.2K
λux
λux@novasarc01·
i think this is an interesting direction: combining OPSD-style privileged self-distillation with multi-agent systems...from what i understood SDAR (self-distilled agentic reinforcement learning) is a hybrid method that combines GRPO’s sparse but grounded trajectory-level RL signal with OPSD’s dense but noisy token-level privileged guidance. its key move is to keep RL untouched and add a gated self-distillation loss. imo the useful idea is not distill everything from the teacher but to treat privileged agents, tools, skills or expert branches as conditional sources of local supervision that the policy can selectively trust...use RL to decide what works globally and use self-distillation only where auxiliary agents provide reliable local evidence. but to me the current formulation in the paper still feels like a single-privileged-branch method (not yet a full general recipe for agentic learning). the teacher is basically the same policy with retrieved skills so the quality of the distillation signal depends heavily on whether the privileged context actually contains useful information. also i think the scaling path is to generalize SDAR from one privileged teacher branch to many specialized privileged branches + combining with async RL.
λux tweet media (4 images)
3
20
115
15.6K
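One way to picture "keep RL untouched and add a gated self-distillation loss" from the post above is the sketch below. It is an assumption-heavy illustration, not the SDAR formulation: the KL direction, the 0/1 gates, the averaging over open gates, and the `beta` weight are all hypothetical choices made here.

```python
import math
from typing import List, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary.

    Assumes q is non-zero wherever p is non-zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def gated_distill_loss(teacher: List[Sequence[float]],
                       student: List[Sequence[float]],
                       gates: List[int]) -> float:
    """Average per-token KL, counted only where the gate is open (1)."""
    active = sum(gates)
    if active == 0:
        return 0.0
    terms = [g * kl_divergence(t, s)
             for t, s, g in zip(teacher, student, gates)]
    return sum(terms) / active


def total_loss(rl_loss: float,
               teacher: List[Sequence[float]],
               student: List[Sequence[float]],
               gates: List[int],
               beta: float = 0.1) -> float:
    """RL term is passed through unchanged; distillation is an additive extra."""
    return rl_loss + beta * gated_distill_loss(teacher, student, gates)


# Two tokens over a two-word vocabulary (made-up numbers).
teacher_probs = [[0.5, 0.5], [0.9, 0.1]]
student_probs = [[0.5, 0.5], [0.5, 0.5]]

# Gate closed on the token where teacher and student disagree.
print(total_loss(1.0, teacher_probs, student_probs, gates=[1, 0]))
# Gate open on both tokens.
print(total_loss(1.0, teacher_probs, student_probs, gates=[1, 1]))
```

With the gate closed on the disagreeing token the total equals the RL loss alone; opening it adds a positive distillation penalty.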
Rohan Paul
Rohan Paul@rohanpaul_ai·
Is Grep All You Need?

The surprising result is not that grep is powerful, but that agent design makes it powerful. The paper says not that grep beats vectors, but that agents fail or win through their harness.

That sounds like a small distinction until you look at what was actually tested. The authors compare grep-style search and vector retrieval across LongMemEval tasks, where agents must recover facts from long conversation histories full of distractors. Inline grep beats inline vector across every harness-model pair in their main experiment, sometimes by wide margins.

The tempting headline is that vector databases are overbuilt for coding agents. The better reading is sharper: when the answer is anchored in literal evidence, names, dates, file paths, function names, error strings, user preferences, grep gives the model a clean mechanical advantage.

Embeddings are built to tolerate paraphrase, but tolerance has a cost. They can pull in semantically nearby clutter, especially when a short agent query is vague. Grep has the opposite failure mode. It is dumb, cheap, and narrow, but when the agent knows the right string to hunt for, dumb becomes a feature.

The deeper finding is that retrieval is not a component you can benchmark in isolation. The same search method behaves differently depending on whether results are injected inline, written to files, routed through a CLI, or wrapped in a custom agent loop.

So the question is not “Do we still need vector databases?” The question is whether your agent is solving a semantic discovery problem or an evidence-location problem. For coding agents, a surprising amount of work is evidence-location: find the symbol, trace the call, inspect the diff, read the failing test, recover the exact line.

Vectors still matter at scale and for fuzzy conceptual search, but this paper weakens the lazy default that every serious agent stack begins with embeddings. Sometimes the upgrade is not a smarter index. Sometimes it is giving the model primitive tools, clean files, disciplined context, and a harness that lets exact search do exact work.

----

Paper Link – arxiv. org/abs/2605.15184
Paper Title: "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search"
Rohan Paul tweet media
18
38
242
49.1K
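The contrast between exact search and similarity search described in the post above can be sketched with a toy example. Everything here is an assumption for illustration: the `HISTORY` lines are made up, and a bag-of-words cosine stands in for a real embedding model, which the paper's setup would use instead.

```python
import math
import re
from collections import Counter
from typing import List

# Made-up conversation history with one literal fact and some distractors.
HISTORY = [
    "user: my flight to Lisbon is on 2024-03-14",
    "user: I love travelling and flying to new cities",
    "assistant: noted, travel plans saved",
    "user: the build fails with error E4021 in parse_config",
]


def grep(pattern: str, lines: List[str]) -> List[str]:
    """Exact, case-insensitive substring search: returns only literal hits."""
    rx = re.compile(re.escape(pattern), re.IGNORECASE)
    return [line for line in lines if rx.search(line)]


def bow(text: str) -> Counter:
    """Bag-of-words vector (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query: str, lines: List[str], k: int = 2) -> List[str]:
    """Top-k by similarity: always returns k lines, relevant or not."""
    q = bow(query)
    ranked = sorted(lines, key=lambda line: cosine(q, bow(line)), reverse=True)
    return ranked[:k]


exact_hits = grep("E4021", HISTORY)
similar_hits = vector_search("error E4021", HISTORY, k=2)
print(exact_hits)
print(similar_hits)
```

The exact search returns only the line containing the string, while top-k similarity fills its quota with an unrelated line, which is a small-scale version of the clutter the post describes.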
Rohan Paul
Rohan Paul@rohanpaul_ai·
New Google paper: A forecast needs context, not just history.

Some patterns are caused by events, not time. Nexus reframes forecasting as a reasoning problem, where events and numbers have to explain each other.

Nexus argues that forecasting improves when models read the world around the numbers, not just the numbers themselves. In the Zillow tests, one Claude-based version cut average MAPE by 86.6% versus direct chain-of-thought prompting.

That matters because most time series models are fluent in pattern, but mute about cause. A housing inventory curve can reflect seasonality, mortgage pressure, migration, layoffs, and local supply, while a stock price can be bent by earnings, regulation, hype, and fear.

Nexus separates those jobs instead of asking one prompt to do everything. One agent turns messy historical text into a clean event timeline, one reads the broad regime, another tracks local shocks, and a synthesizer reconciles them with calibration from past errors.

The interesting result is not merely that context helps, but that structure helps the language model use context without losing the time series.

The evidence is still narrow: Zillow counts, seven equities, post-cutoff data, and single-run evaluations, so this is not a universal law of forecasting. But the direction is clear: future forecasters will not only extrapolate curves; they will argue about what made the curve move.

----

Paper Link – arxiv. org/abs/2605.14389
Paper Title: "Nexus : An Agentic Framework for Time Series Forecasting"
Rohan Paul tweet media
22
85
487
61.1K
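For readers unfamiliar with the metric in the post above, MAPE (mean absolute percentage error) and a relative reduction between two MAPE values can be computed as in this small sketch. The function names and all numbers are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def relative_reduction(baseline: float, improved: float) -> float:
    """Percent by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Made-up series: each forecast is off by 10% of the actual value.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
# Made-up MAPE values for a baseline and an improved model.
example_cut = relative_reduction(20.0, 5.0)
print(example_mape, example_cut)
```

A "cut in average MAPE" is a relative figure of this second kind, so its meaning depends on how large the baseline error was to begin with.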