Ari

50 posts

@arealka

Joined May 2020
240 Following · 8 Followers
Ari
Ari@arealka·
@levie “Financial engineers”, “logistics engineers”, “legal engineers”
0
0
0
29
Aaron Levie
Aaron Levie@levie·
Right now there’s a temporary mismatch between the jobs that used to be sought after in some fields and the new jobs that are becoming in demand in those fields. For instance, if you studied CS, for years the general direction of travel was often to join a tech company and build customer-facing software in some form. A significant portion of the CS pipeline from college to hire was built for this. When you realize that AI is going to make coding abundant, you realize everyone will need technical talent to implement agentic systems. This means the types of roles engineers should be thinking about radically expand. I was talking to a Fortune 500 pharma CEO a week ago who commented on how much more technical talent they need right now. The job may be different from what it was 5 years ago when thinking about tech, but the demand for the skills is still there. And this is what I’m hearing from every CIO and CEO across nearly every industry right now. We definitely need colleges to wake up to this, but we equally need companies to think about how they craft pipelines into these jobs.
Peter H. Diamandis, MD@PeterDiamandis

If AI now accounts for 25% of corporate layoffs, but 275,000 'AI jobs' are open, what's the real problem? It's not that AI is killing jobs. It's that we're training people for careers that expired five years ago. The education system is the bottleneck—not the technology. Fix that, and abundance follows.

74
57
562
114.7K
Ari
Ari@arealka·
@copyconstruct Still working out how to run e2e tests locally when working on multiple worktrees. The slightest flakiness throws the agent into a spiral :(
0
0
0
288
Cindy Sridharan
Cindy Sridharan@copyconstruct·
end-to-end testing > unit tests, in the vibecoding era. A massive, almost entirely agent-coded refactor passed all unit and pre-merge tests but broke a critical feature. It was only caught due to my own excessive paranoia making me run end-to-end tests before the prod deploy.
29
29
381
125K
Ari
Ari@arealka·
@mattpocockuk I still find it interesting that you don't use browser validation. The agent is coding with its "eyes" closed. Why have you not included this as part of the workflow?
0
0
0
8
Matt Pocock
Matt Pocock@mattpocockuk·
Tons of folks are piling in here saying that AFK agents are a myth. I have been using them to ship these GitHub repos:

mattpocock/evalite
mattpocock/sandcastle
mattpocock/software-factory (might be public by the time you see this)

Here are a few steps to making this work, and some reality checks.

Definitions

Let's split this into the day shift and the night shift. Day shift is planning/review/QA, night shift is AFK implementation.

Day Shift (part 1)

1. Use /grill-me to align with the AI
2. Use /to-prd and /to-issues to create a PRD (the destination) and implementation steps as separate tickets, which can be grabbed in parallel (the journey)
3. The PRD is a ticket, but it's not an actionable step. You just put the user stories there

This is pure requirements gathering shit, same as it ever was.

Night Shift

1. I run a planner agent which looks at all the tickets and sees what can be worked on now, and what's blocked
2. The planner agent then kicks off multiple agents (sandboxed using Sandcastle, my OSS tool) to implement the code
3. I then have an automated reviewer agent look at the commits produced - one agent per implementation. This checks alignment to the original PRD, as well as code quality
4. These commits end up on branches that get PR'd to main
5. The planner agent runs again until all work has been completed

The review is a crucial step - it's saved me MANY times. I am planning to massively increase the amount of review I do, hopefully with multiple agents.

But guess what - AFK agents sometimes produce bad code. This can happen because of:

a. The original plan was bad because the best solution was something different
b. The original plan was bad because it didn't take into account all the unknown unknowns, and the AI had to make some decisions during the coding session which were bad
c. The plan was good, but the AI just shat the bed (twice, once in the review stage, once during implementation)
d. Your codebase is bad and the feedback loops don't tell the agent if it did a good job or not

So... QA:

Day Shift (part 2)

1. QA all of the branches created
2. Create follow-up issues, potentially editing the original PRD to adjust the destination

This will usually take a long time, often as long as planning. But then you kick off the night shift again.

Once QA is all done, you review the important bits of code manually, usually in PR's. There isn't anything better than the PR UI right now, so that's what we're stuck with.

Wake-up Calls

1. If you let the AI run all night unbounded by planning, it's going to produce shit code
2. Mostly, my loops finish before I go to bed, it's just the night shift catching up to the day shift
3. The only reason I do AFK at all is because it allows me to automate review and totally not give a shit about latency
4. I always run night and day shift in parallel. I can't plan that far ahead (skill issue, probably). I need working code to base my plans from, so I'm aggressively QA-ing stuff that lands
Ronan Berder@hunvreus

Talking to smarter folks than me, I'm convinced many of the AI folks in my timeline are full of shit. Nobody is "running 20 agents overnight" and building stuff for actual users. Maybe some are building internal tools or disposable software. Maybe. But building software people like using? That doesn't get hacked on day one or blow up after the 3rd user? Nope. I don't even understand what that's supposed to look like. Do you work out a 57-page document that perfectly describes what you want to build and then summon 14 agents and have them run wild for 6 hours? And what comes out on the other end isn't a broken pile of shit? Nope. Not buying it. PS: it may also be that I have an IQ of 82 and can't figure it out.

62
61
1.2K
159.6K
Elie Steinbock — oss/acc
New video on Flue, Sandcastle, and doing it yourself! It covers: 1️⃣ Flue by @FredKSchott. An agent harness framework 2️⃣ Sandcastle by @mattpocockuk. A TypeScript library for orchestrating coding agents in sandboxes Thanks to @DmytroKrasun's ScreenshotOne for sponsoring!
4
3
14
1.1K
Ari
Ari@arealka·
@elie2222 @FredKSchott @mattpocockuk @DmytroKrasun I really like the agent having access to Vercel agent browser to validate the changes it makes, but this would require a really heavy image with Chromium. How do you get around that? I am always tempted to not use a sandbox at all 🙈
1
0
2
36
Ari
Ari@arealka·
@ChrisHayduk Thanks for this! If I wanted to use goal to implement a feature, should it be broken into smaller instructions (each being a separate goal) or is passing a full “prd” sufficient?
1
0
1
2K
Aaron Levie
Aaron Levie@levie·
The need and opportunity for professional services and FDEs to deploy agents right now is massive. Every tech wave offers a new era of consulting and tech services requirements. Moving from analog to digital led to a massive wave in the 90s. Moving from on-prem to cloud did the same in the 2000s. But this is going to be at a scale far greater than the others. The reason is that agents fundamentally change the underlying workflows of an organization. Unlike most prior eras of technology, where it was a change in medium of the service being delivered (on-prem CRM to cloud CRM), agents rewire the business process itself. And unlike upgrading a tech system, business processes are full of idiosyncrasies. Every industry will have its own variants, and every department within those industries will have variants as well. Not to mention the bespoke differences between firms. Bringing agents to marketing in CPG will look different from marketing in healthcare. Bringing agents to sales in a B2B software company will look different from a car dealership. And none of the change is easy technically. You need to first modernize your infrastructure and data and make sure it’s ready for agents; access controls, entitlements, and permissions need to be mapped in a way that works for agents and people; you need to make sure agents have the right context to work with; you need to consistently eval and maintain the agents when there are model upgrades; and you need to drive the change management of the process itself to figure out which parts the people do and what agents do. That’s an insane amount of technical and domain-specific process work to be done to make this all happen. Huge opportunity for new service providers, as well as for internal teams and roles to emerge, to help drive this change.
OpenAI@OpenAI

Today we’re launching the OpenAI Deployment Company to help businesses build and deploy AI. It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. openai.com/index/openai-l…

73
116
1.1K
226.5K
Ari
Ari@arealka·
@beknabdik @DeepgramAI @Cloudflare @beknabdik speko looks really cool. Abstract the nitty gritty wiring away from me and let me focus on the orchestration layer and high level customisation of the voice model/s implementation 👌
0
0
0
19
Ari
Ari@arealka·
A few weeks ago, I built a voice agent with Cloudflare Durable Objects, the Deepgram agents API and Telnyx. Latency and performance were great as @DeepgramAI is built on Cloudflare infra, but I was frustrated at not having control over orchestration. Keen to check this out 🙏🏽@Cloudflare
Cloudflare@Cloudflare

Stop building laggy voice bots! We’re hosting a hands-on workshop to build a voice agent that actually listens and remembers. With the Cloudflare Agents SDK & Workers AI, we’ll implement streaming STT/TTS (Deepgram Nova + Kimi), interruption handling, and voice-triggered tool calls, all with zero external API keys. Join @fayazara on April 23, 4 PM SGT for a live hands-on workshop. Register here: cfl.re/48XpDuw

2
0
2
88
Ari
Ari@arealka·
@beknabdik @DeepgramAI @Cloudflare BYO agent SDK for orchestration would be golden. @DeepgramAI is trying to create a level of vendor lock-in by making the “think” settings property the only way to control the agent’s logic layer.
0
0
0
22
Bek
Bek@beknabdik·
@arealka @DeepgramAI @Cloudflare that orchestration wall is the real one. deepgram agents api nails latency but mid-turn state lives inside their runtime. once you need to steer it, stitched-loop is the only way.
2
0
2
67
Ari
Ari@arealka·
@mattpocockuk The tangled mess this may create without the correct setup makes it a non-trivial problem. But ultimately markdown and code need to be close. “Things that change together should stay together.”
0
0
0
30
Ari
Ari@arealka·
I’d love to nest my codebases in my Obsidian vault. If nested in the correct way, my coding agent will have access to all the fresh context it needs (amongst other benefits). @obsdmd is not built for this. Does such a platform exist?
1
0
0
36
Ari
Ari@arealka·
I really hope we are nearing the end of multi-tenant software. Single tenancy is far less complex and more secure.
2
0
0
22
Ari
Ari@arealka·
Perhaps there is a future where consumers using Claude/Codex can browse and purchase “base apps” (whether that’s a codebase or skill), tailor them and then deploy them to their own dedicated environment.
0
0
0
14
Ari
Ari@arealka·
@NaomiLGBT @DeepgramAI @Cloudflare Sorry for being vague. Perhaps orchestration wasn’t the correct term. I was referring to being constrained by the “think” argument when designing the logic layer of the voice agent. I’d love to be able to use the agent SDK of my choice while using Deepgram for everything else.
0
0
0
39
Naomi Carrigan
Naomi Carrigan@NaomiLGBT·
@arealka @DeepgramAI @Cloudflare Heya Ari~ I'd love to hear more about what you mean by not having control over orchestration. Is this something you would like to see Deepgram add functionality for?
1
0
2
37
Ari
Ari@arealka·
@RhysSullivan My setup changes every 36 hours 😢
0
0
0
5
Rhys
Rhys@RhysSullivan·
has anyone built "the software factory" thing, misc requirements:
- able to use my subscriptions (codex / claude)
- stacked diffs w/ graphite
- able to go from planning -> lots of small tasks
- closing the review loop

don't love current agent interfaces, want something new
138
5
474
41.1K