danialhasan

10.8K posts


@dhasandev

building agent armies @trysquadhq | manifest your destiny @_buildspace

Toronto, Canada · Joined October 2020
938 Following · 2.5K Followers
danialhasan
danialhasan@dhasandev·
wifi at Scotiabank Theatre looking good today
1
0
3
164
Kelindi
Kelindi@_kelindi·
I'm in NYC for Toronto tech week and back in Toronto during NY tech week. fml
4
0
21
1.9K
Ben Vinegar
Ben Vinegar@bentlegen·
The case for having your agent write its own comments on diffs:
- directs your attention to the good parts
- comes from session context / not just a summary
- lower commitment than code comments
- can make em permanent if you want
12
1
135
7.9K
0xSero
0xSero@0xSero·
I had a chat about context + agentic engineering with Eric, the founder of Repoprompt and a member of the rate limited podcast. I've learned a lot from Eric over the last 6 months; he has a great understanding of how to best utilise AI agents. Enjoy
5
10
110
10.6K
danialhasan
danialhasan@dhasandev·
@cairox100v aussie people be like yall never had spider kangaroo coffee and it shows
0
0
0
82
Cairox100v
Cairox100v@cairox100v·
i feel bad for people from toronto man, you've never had a good coffee in your life, it's crazy
5
1
11
2.2K
gfodor.id
gfodor.id@gfodor·
@yacineMTB it's interesting that this stuff feels like the first genuinely new skillset for software eng since this whole thing started. iteration times are long so not easy to get good at quickly. imo this is gonna be the main skill gap that splits software engineering employability soon
3
0
7
159
gfodor.id
gfodor.id@gfodor·
Random list of tips (ymmv) from trying to get gpt 5.5 to build a big thing over a day or two:
- Use chunkier milestones with acceptance criteria rooted in end-to-end integration testing, not bullshit fake harnesses
- Have the root agent do the work with /goal and delegate to subagents to block on code review before every commit
- Make sure it's clear the plan is *immutable*, keep plan status updates in a separate file, don't let the agent cheat by updating the plan
- Let the agent add features and logging to the harness on demand to be able to test for acceptance, but ensure those also go through subagents for similar review to ensure they're necessary and not cheating
- Each milestone should have a dedicated folder with captured artifacts and distinct runner scripts so it can be fully audited for agent fuckery
- Part of acceptance criteria should be the end-to-end system is still possible to run interactively and be used by a user
- Have a cadence of "cleanup milestones" which force the agent to delete logging, debug code, kill dead code, remove all feature flags that are no longer needed, remove any 'fallback' or 'legacy' handlers, kill pointless error handling, and break up files and DRY up common code. imo don't do this at every milestone, do it every couple of milestones
- Make sure the plan includes detailed information up front of what the state should be at the last milestone
- Include 'anti-patterns' in the plan: things the agent should never do, such as update the plan, or build a new harness with mock objects, or anything else that you discover in past runs as a loophole to not get shit done
- It's been helpful to have an interactive Opus session I can use as it runs to check the plan status and front run future milestones if things veer off track. Instead of interrupting the primary agent, I have 4.7 tweak or insert milestones to course correct
11
11
222
10.7K
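One way to enforce the "plan is immutable, status lives in a separate file" tip above is to fingerprint the plan before the run and re-check it before each commit. This is a minimal sketch, not something from the thread itself: the file names `PLAN.md` and `STATUS.md`, the helper names, and the choice of a SHA-256 digest are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_unchanged(plan_path: Path, expected_digest: str) -> bool:
    """True if the plan file still matches the digest recorded at the start."""
    return fingerprint(plan_path) == expected_digest


# Demo in a throwaway directory (file names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    plan = Path(tmp) / "PLAN.md"
    status = Path(tmp) / "STATUS.md"
    plan.write_text("M1: end-to-end login works\n")
    status.write_text("")
    locked = fingerprint(plan)  # record once, before the agent starts

    # Allowed: the agent records progress in the separate status file.
    status.write_text("M1: in progress\n")
    ok_after_status_update = plan_unchanged(plan, locked)

    # Not allowed: the agent rewrites the plan to make a milestone easier.
    plan.write_text("M1: login page renders\n")
    ok_after_plan_edit = plan_unchanged(plan, locked)

print(ok_after_status_update, ok_after_plan_edit)
```

A check like this could run inside the review step that gates each commit, so a silently edited plan fails the milestone rather than passing it.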
danialhasan
danialhasan@dhasandev·
@kenwuuuu langsmith is neither of those things but it's great for the evals/observability you need to make your agents great
0
0
7
608
Ken Wu
Ken Wu@kenwuuuu·
so what is the current industry standard to build agents? do people still use langchain/langgraph?
123
23
950
250.3K
Kress Franzen
Kress Franzen@kressf·
@dhasandev @OpenAI Thanks for sharing this link, made it easy for me to dive deeper. Very cool moment, even if this isn't perfect - so much progress in the right direction.
1
0
1
47
OpenAI
OpenAI@OpenAI·
Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946. For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids. An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better. This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.
1K
3.8K
26.4K
13.1M
K.
K.@NotOnKetamine·
Polsia raised $30M for "AI that runs your company." We did the diligence, on their own data:
- Fake ARR (real ≈ $0)
- Fake customers (94% dead)
- A human-graded Claude wrapper, not autonomous
- A god-mode kill-switch on every company you build
🧵👇
71
52
1K
165.8K
danialhasan reposted
ramona
ramona@ramonable·
ladies of toronto - this ttw, we're hosting an ai design workshop where incredibly talented designers break down their workflow. join us!
4
4
33
2.5K
danialhasan
danialhasan@dhasandev·
harness optimizers generate candidates for how a harness should work. these are tested against positive/negative datasets and evaluated; winners proceed, losers are eliminated.

i had the idea to generate candidates using llms that receive the traces of past candidates, winners, failures, etc., which is exactly what GEPA is in a nutshell.

this last candidate did well here and badly over here, so let's regenerate it with contextually relevant changes. much better than blind generation.
0
0
3
70
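The loop described in the post above (generate candidates, score them on positive/negative cases, keep winners, regenerate from traces) can be sketched roughly as follows. This is an illustrative toy, not GEPA's actual implementation: the "harness" is just a numeric threshold, and `propose` is a hypothetical stand-in for the LLM call that would read a parent's failure trace and suggest a revised config.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    config: dict
    score: float = 0.0
    trace: List[str] = field(default_factory=list)


def evaluate(cand, run_harness, positives, negatives):
    """Score a candidate and record each failure in its trace."""
    correct, trace = 0, []
    for case in positives:
        if run_harness(cand.config, case):
            correct += 1
        else:
            trace.append(f"missed positive: {case}")
    for case in negatives:
        if not run_harness(cand.config, case):
            correct += 1
        else:
            trace.append(f"accepted negative: {case}")
    cand.score = correct / (len(positives) + len(negatives))
    cand.trace = trace
    return cand


def optimize(seed, propose, run_harness, positives, negatives,
             rounds=5, keep=2, children=2):
    """Trace-guided search: winners proceed, losers are eliminated."""
    population = [evaluate(Candidate(seed), run_harness, positives, negatives)]
    for _ in range(rounds):
        offspring = []
        for parent in population:
            for _ in range(children):
                # The proposer sees the parent's failures, not just its score.
                new_config = propose(parent.config, parent.trace)
                offspring.append(
                    evaluate(Candidate(new_config), run_harness,
                             positives, negatives))
        ranked = sorted(population + offspring,
                        key=lambda c: c.score, reverse=True)
        population = ranked[:keep]
    return population[0]


# Toy harness (assumption): accept a case if it meets the threshold.
def run_harness(config, case):
    return case >= config["threshold"]


# Stand-in for an LLM proposer (assumption): nudge the threshold
# in the direction the failure trace suggests.
def propose(config, trace):
    if any(t.startswith("missed positive") for t in trace):
        return {**config, "threshold": config["threshold"] - 1}
    if any(t.startswith("accepted negative") for t in trace):
        return {**config, "threshold": config["threshold"] + 1}
    return dict(config)


best = optimize({"threshold": 10}, propose, run_harness,
                positives=[7, 8, 9], negatives=[1, 2, 3])
print(best.config, best.score)
```

The point of passing `parent.trace` into `propose` is the contrast with blind generation: each new candidate is a targeted revision of what failed, rather than a fresh random guess.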
danialhasan
danialhasan@dhasandev·
i think i just reinvented GEPA from first principles
1
1
8
799
Ken Wu
Ken Wu@kenwuuuu·
what are the most fun events to do at toronto tech week? i need recommendations
7
0
13
2.1K
danialhasan reposted
Simon Last
Simon Last@simonlast·
1/ Some things I've learned recently running coding agents on large-scale projects. Most of this contradicts advice from 6 months ago!
92
211
3.1K
557.2K
Shreya Shankar
Shreya Shankar@sh_reya·
i'm restarting my blog! i want to kickstart productive conversations around: what should AI agents look like for hard, subjective knowledge work?

a lot of agent setups work well when tasks are objective and easy to verify. but many workflows (e.g., qualitative analysis, strategy, sensemaking) are messy and interpretive.

as a first post, i explore different ways of doing agent-assisted qualitative analysis on tweets, with varying levels of human feedback/intervention. tldr: they all kinda sucked. turns out it’s hard to:
(a) stop agents from converging too quickly on shallow interpretations
(b) get agents to adapt to preferences that emerge gradually across many turns (i.e., evolving context)
(c) capture human judgment without making humans fatigued
24
32
277
49.4K
danialhasan
danialhasan@dhasandev·
@cairox100v it’s like when someone drops their ice cream and you hear them go “oh naurrrrr” so you just know they’re from the land down under
1
0
7
425
Cairox100v
Cairox100v@cairox100v·
how the fuck can you tell someone is from toronto by how they sound
14
0
41
13.9K