Riya

54 posts

@riyajaiinn

Joined September 2020
27 Following · 36 Followers
Viv
Viv@Vtrivedy10·
@riyajaiinn I’m making you try this later
English
1
0
0
56
Riya
Riya@riyajaiinn·
Good stuff!
Viv@Vtrivedy10

had a blast on the pod (first one, kinda nervous 😅), big shoutout to @himanshustwts, just felt like us riffing on agent engineering, open source & research ❤️

some fun highlights:
- working backwards from the model’s capabilities/flaws and building systems (a harness) around them to accomplish tasks
- Traces! they’re our signal for continual learning and self-improving agents. They capture agent errors + inefficiencies. Pointing compute at understanding traces helps us update our agents and generate evals and training environments so the model doesn’t make these mistakes in the future. This becomes a moat for many teams
- it’s ok to engineer a good harness with opinionated built-in skills, prompts, and execution patterns like task decomposition. you just want to solve your task with the intelligence we have today, go do that and adjust as new models are released
- open harnesses and open models are an on-ramp for teams to own their intelligence. Traces are an improvement signal, we need to use compute to understand them at scale
- I think I need to update my priors on how quickly RLMs and computer use went from interesting to pretty usable, and more research looks like it’s coming. really happy to see, and a note to retry the tools in these spaces more often!
- AI is super broad, we’re all figuring it out. doing stuff that interests you and telling ppl about it is a great way to grow yourself and meet other great ppl builders

all of Himanshu’s pod episodes are awesome, everyone should check them out :)

English
0
1
1
993
Riya
Riya@riyajaiinn·
Quality stuff 💯
Viv@Vtrivedy10

Harness, Memory, Context Fragments, & the Bitter Lesson

this is a work in progress mental dump on interesting intersections between how we use and design a harness, implications for memory being accumulated over long timescales, and the search bitter lesson we can’t escape

this is v30+, HTML diagrams help me iteratively refine + chat to roughly “see” and alter the mental model

Harnesses & Context Fragments: a very important job of the harness is to efficiently & correctly route data within its boundaries into the context window boundary for computation to happen

the context window is a precious artifact. Harnesses make decisions on how to populate, manage, edit, and organize it so agents can do work. Each loaded object can be thought of as a Context Fragment and represents an explicit decision by the user and harness designer of what a model needs to do work at any given time.

many ideas on externalizing objects + loading into the context window are pioneered and very well described by @a1zhang with RLMs

Experiential Memory: we’re in the very early days of deploying agents and agents produce massive amounts of data in every interaction they have. this is akin to humans doing things and remembering things they did. however agent memory has a massive advantage as it can be accumulated across all agents, which are easily forked and duplicated (unlike humans). @dwarkesh_sp does a good job talking about this massive benefit of artificial systems

memory can be treated as an externalized object. the harness is tasked with doing good contextualized retrieval, which means pulling in the right data from accumulated memories across all agent interactions

Search & The Bitter Lesson: As we deploy agents in our world over year timescales, there is going to be a hyper-exponential in the amount of data produced by those agents. We should want to:
1. Own that data for ourselves. Open ecosystems are important here
2. Use that data

This means that we’ll have to search over, distill, and organize massive amounts of data. Our brain is exceptional at doing this, both contextually using prior experience and mostly committing the right stuff to memory with enough intentional practice. Our current infrastructure systems and algorithms will be put to the test and often break as we get used to this new data regime

some open questions:
- how do we efficiently distill experiences (Traces) into higher level memory primitives that capture the important parts? How do we do this over ultra long time horizons?
- How much of the future is Search just-in-time vs Search that gets integrated into model weights?
- How do we make models much better at self-managing their context window? How do we reduce error rates in recursively allowing agents to operate over external objects?

i’ll be expanding on, altering, and adjusting these mental models but these feel like an important subset to me on the future of designing agents practically
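To make the Context Fragment idea above a bit more concrete, here is a minimal sketch of a harness choosing which fragments to load under a token budget. Everything in it is an assumption for illustration: the `ContextFragment` type, the `priority` field, the word-count token estimate, and the greedy selection are hypothetical and not taken from the post or from any particular harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextFragment:
    """One object the harness may load into the context window (hypothetical type)."""
    name: str
    text: str
    priority: int  # higher means more important for the current task


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def pack_context(fragments: list[ContextFragment], budget: int) -> list[ContextFragment]:
    """Greedily keep the highest-priority fragments that still fit in the budget."""
    chosen: list[ContextFragment] = []
    used = 0
    for frag in sorted(fragments, key=lambda f: f.priority, reverse=True):
        cost = estimate_tokens(frag.text)
        if used + cost <= budget:
            chosen.append(frag)
            used += cost
    return chosen


fragments = [
    ContextFragment("system_prompt", "You are a coding agent.", priority=3),
    ContextFragment("old_trace", "step " * 50, priority=1),
    ContextFragment("task", "Fix the failing unit test in utils.py", priority=2),
]
selected = pack_context(fragments, budget=20)
print([f.name for f in selected])  # ['system_prompt', 'task']
```

The large, low-priority trace is left out because it does not fit the budget. A real harness would likely use retrieval or summarization here rather than a fixed priority number.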

English
0
1
13
4.1K
Riya
Riya@riyajaiinn·
@Vtrivedy10 Sure please put some time on my calendar
English
1
0
1
20
Viv
Viv@Vtrivedy10·
@riyajaiinn shall we discuss the details of this breakthrough later?
English
1
0
0
38
Riya
Riya@riyajaiinn·
@Vtrivedy10 Dont forget about the jersey mikes
English
1
0
2
140
Viv
Viv@Vtrivedy10·
in a much needed break from AI yapping I sat down with a big drink, burrito, and popcorn and watched Dhurandhar2 for 4 hours straight first time in an empty theatre, idk why lol but highly recommend 😅 was fully engrossed the whole time, banger movie, def gonna win a ton of awards, Ranveer Singh you goat 🔥🐐 feel like ppl appreciate movies more after covid tldr: movies are great man :)
English
4
0
38
2.2K
Viv
Viv@Vtrivedy10·
entire feed is ai X Londonmaxxing I’m so for it, London’s awesome ❤️ great vibe/energy + something for any type of person/interest over there
English
3
1
36
1.6K
Riya reposted
Viv
Viv@Vtrivedy10·
I started writing about Harness Engineering ~5-6 months ago

here's a blog on the actual recipes we use at LangChain to improve our Agents+Harnesses and get a Top5 score on Terminal Bench 2.0

some highlights:
- Self-verification is a fast ramp for agents autonomously improving themselves
- We use Traces extensively to mine errors and improve the harness
- Context Engineering on behalf of your agents reduces error rates via a good "agent onboarding experience"

Would love to hear thoughts! We'll be publishing more open research and worklogs like this in the coming months

Happy hill climbing :)
Viv@Vtrivedy10

x.com/i/article/2022…
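As a rough illustration of what "mining errors from Traces" could look like, here is a minimal sketch that counts failed steps per tool. The trace shape (`steps`, `tool`, `ok`) and the `mine_errors` helper are hypothetical, made up for this example. They are not the format or code used at LangChain.

```python
from collections import Counter


def mine_errors(traces: list[dict]) -> Counter:
    """Count failed steps per tool across a list of traces.

    Assumed (hypothetical) trace shape:
    {"steps": [{"tool": str, "ok": bool}, ...]}
    """
    failures: Counter = Counter()
    for trace in traces:
        for step in trace.get("steps", []):
            # A step with no "ok" field is treated as successful.
            if not step.get("ok", True):
                failures[step["tool"]] += 1
    return failures


traces = [
    {"steps": [{"tool": "shell", "ok": False}, {"tool": "edit_file", "ok": True}]},
    {"steps": [{"tool": "shell", "ok": False}, {"tool": "search", "ok": False}]},
]
top_failures = mine_errors(traces).most_common()
print(top_failures)  # [('shell', 2), ('search', 1)]
```

The tools that fail most often are a reasonable first place to look when deciding what to change in the harness.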

English
5
12
147
24.9K
Riya reposted
Viv
Viv@Vtrivedy10·
Download+Run agents in seconds with...curl??⚡️

Yup...Deep Agents are just folders so we can:
- package them up
- download them
- run them with just a couple commands & the deepagents-cli

here I download an agent from our deepagents examples that comes with Skills for blogs/social media writing

And in a minute, we get a full blog with our custom voice on the future of agentic coding 🤖

shoutout to @JNYBGR and remotion, edited this intro in minutes with their skill!!
English
9
6
40
10.7K
Riya reposted
Viv
Viv@Vtrivedy10·
This weekend’s side quest…🩸Stranger Code 🥀

Powered by @langchain deepagents

⚠️ semi-spoiler alert…skip last 10 seconds of vid if you don’t wanna think about the ending ⚠️

It’s a full Stranger Things themed coding agent TUI where you can:
- Pair program with the Vecna agent or get a code review from Barb
- Communicate with the upside down with some fun theming
- Just code normally with Opus-4.5…it’s a coding agent!

There’s some fun Easter Eggs dropped in if you’re looking hard :)

This was built on deepagents. It’s fully open, ready for you to hack. Looking forward to seeing what ppl do, check out the repo linked below

some may say it's even "bi**hin"
English
5
5
18
2.5K