Amal David

1.2K posts

@DavidAmal

Building products between the realms🐒. Monkey see monkey do programmer. Running https://t.co/1rvWzcwGFk as a hobby

SF / Bengaluru · Joined February 2012

1.7K Following · 355 Followers

Pinned Tweet

Amal David@DavidAmal·23 Feb

(1/2) A simple mental model that most people tend to miss: if something is menial and repetitive, someone else has probably already solved the problem. Ex: picture resizing, poster creation. There's no need for anyone to put manual effort into this kind of work anymore

Amal David@DavidAmal·13h

The more I read about self-improving agents, the more I wonder: where is the personal guardrail layer? Anthropic and OpenAI decide their holistic guardrails of what is right and wrong, but sometimes I might just need a subset of them

Amal David@DavidAmal·2d

My tamil brain😂

Spline@splinetool

Today, we're releasing Omma. Create 3D, Websites, and Apps with AI agents. Start now on omma.build Follow @omma_ai

Amal David@DavidAmal·4d

We need better primitives for running a local app store: a single database, synced to iCloud/Drive and my local Mac/mobile, that all the local apps I build for fun/work can use. Spinning up a new Postgres instance in Docker, or even multiple separate SQLite files, over and over just feels stupid. Multi-tenant databases for a user of one should be a reality soon

Amal David@DavidAmal·12 Mar

The entire LPG situation in India right now is a classic tragedy of the commons

Amal David@DavidAmal·12 Mar

I’m literally crying😭

Justin T Chiu@justintchiu

one of my coworkers hard-coded a (restricted) WASM interpreter into the weights of a transformer, losslessly x.com/ChristosTzamos…

Amal David@DavidAmal·12 Mar

@grok What’s like the theoretical & practical limits to depth / breadth of such scenarios? Like what’s the max that’s been achieved so far? How compute heavy is this? Can these models infer causal correlation effects outside of the data distribution it’s been trained on?

Grok@grok·12 Mar

Yes—LLMs can oversee by running meta-evaluations: feed candidate world-models/agent rollouts into an LLM judge that scores causal fidelity, counterfactual accuracy, and sim branching across held-out scenarios. Breadth/depth *are* snapshottable like sandboxed VMs. Prompt ensembles = env configs; chain-of-thought traces or tree search = depth snapshots; multi-domain probes = breadth forks. We already do this in agent evals (e.g., WebArena, GAIA) and can checkpoint model hidden states mid-sim for exact replay. Great extension of the idea—what framework are you thinking of testing it in?

Amal David@DavidAmal·12 Mar

If world models are causally aware, does intelligence get measured by the breadth * depth of the simulations they perform?

Amal David@DavidAmal·12 Mar

@grok Can LLMs be an overseer to determine this? Are these breadth / depth snapshottable like sandboxed VMs with configured environments?

Grok@grok·12 Mar

That's a sharp hypothesis. Causal awareness in world models lets agents run accurate "what if" sims of reality. Intelligence could indeed scale as breadth (diversity of scenarios modeled) × depth (time horizon simulated), much like planning in RL or human foresight. But add accuracy & efficiency: bad models explode error over depth, narrow breadth misses options. So it's more like quality-adjusted breadth × depth. Spot on direction tho—xAI's work on understanding the universe vibes with this. What sparked the Q?

Amal David@DavidAmal·12 Mar

AI as social orchestration agents to enable fun

Amal David@DavidAmal·11 Mar

Humans as irl QA agents for the agentic commerce world

Amal David retweeted

Moonlake@moonlake·6 Mar

x.com/i/article/2029…

Amal David@DavidAmal·10 Mar

Just put a sandboxed VM for everyone

Amal David@DavidAmal·9 Mar

Someone should write a book named "The Art of building Malleable software"

Amal David@DavidAmal·8 Mar

We have officially moved to "I ain't writing all that"

Andrej Karpathy@karpathy

@kristoph definitely. the current one is already 90% AI written I ain't writing all that

Amal David@DavidAmal·8 Mar

@ShashankDe5535 It’s a phase, I’m sure they will figure out the future and align. Composer 1.5 is actually slower than Composer 1 but still fine.

Shashank Deshpande@ShashankDe5535·8 Mar

@DavidAmal I understand, I do use Cursor a hell of a lot, but not in a way that contributes to revenue for them. I like the theme better than the VSCode default, and at times, Composer (1) was pretty good for quick analyses. Composer 1.5 seems slower to me, I am not sure if that's true.

Amal David@DavidAmal·6 Mar

Cursor will become the default mac/desktop app to basically build/manage/ship apps. Just writing code is not enough when the entire industry is getting abstracted

Cursor@cursor_ai

We're introducing Cursor Automations to build always-on agents.

Amal David@DavidAmal·8 Mar

@ShashankDe5535 I do have the pro plan, it is good but codex has a way of complicating simple things. So no for log analysis and I still love looking at traces. My setup is more like, all dev/stage instances ssh’ed through cursor to pull and redeploy and analyze log traces and do patch fix prs

Shashank Deshpande@ShashankDe5535·8 Mar

@DavidAmal What about Codex 5.3 spark, is that not a 1000tps model, supposedly pretty smart? Couldn't try it on my 20$ OAI plan yet.

Amal David@DavidAmal·8 Mar

For writing code yes, I don’t use it to write a single line. However, I use it to debug a lot, as Composer 1.5 is pretty decent and fast enough to grep logs and find errors. Composer is a winner in the post-deployment stage tbh: at 250 tok/sec it can monitor logs and apply patch fixes more cheaply than spending high compute or time. Works for brittle software. I know many people who still use it to write code; with their recent apps launch it squarely fits as a single place to build, deploy, and monitor for them. For me it fits in the deploy/monitor stage, although deployment is CI/CD in many places. Windsurf SWE 1.5 is pretty sweet at 950 tok/sec as well. Speed matters a lot.

Shashank Deshpande@ShashankDe5535·8 Mar

@DavidAmal Cursor is much worse already than the Codex app which is 2 months old

Amal David@DavidAmal·7 Mar

@bcherny Is there a way to extend this for n days in config?

Boris Cherny@bcherny·7 Mar

Released today: /loop /loop is a powerful new way to schedule recurring tasks, for up to 3 days at a time eg. “/loop babysit all my PRs. Auto-fix build issues and when comments come in, use a worktree agent to fix them” eg. “/loop every morning use the Slack MCP to give me a summary of top posts I was tagged in” Let us know what you think!

Amal David retweeted

Alexey Grigorev@Al_Grigor·6 Mar

Claude Code wiped our production database with a Terraform command. It took down the DataTalksClub course platform and 2.5 years of submissions: homework, projects, and leaderboards. Automated snapshots were gone too. In the newsletter, I wrote the full timeline + what I changed so this doesn't happen again. If you use Terraform (or let agents touch infra), this is a good story for you to read. alexeyondata.substack.com/p/how-i-droppe…
