Pinned Tweet
PME
1.3K posts

PME
@itsyourcode
savage coder | creator of the data agent @ https://t.co/w8zk05hYUU (@probablydatabot)
San Francisco, CA Joined June 2023
2.2K Following 474 Followers


@ClickHouseDB cloud having issues right now? No status updates but something feels a bit off...


This is also why they cannot do data analysis without a rigorous harness. By definition, none of the input data can have been memorized or "pre-reasoned" over during training.
Lossfunk @lossfunk
🚨 Shocking: Frontier LLMs score 85-95% on standard coding benchmarks. We gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. Presenting EsoLang-Bench. Accepted to the Logical Reasoning and ICBINB workshops at ICLR 2026 🧵

The sum of all this is that data analytics is a deceptively hard use case for the state-of-the-art transformer architecture.
These problems are solvable, but are incredibly hard harness, data infrastructure, and UX engineering problems.
Right now there are probably fewer than 100 engineers in production who deeply understand how fundamentally hard this problem is. There have been a great many failed attempts already, including by Anthropic itself (which sunset its very early attempt at this after only a few months).
If you are going to trust an AI agent for data analytics, you must choose your harness very carefully.
(Or you need to invent a new transformer architecture, whichever you prefer)
If the harness cannot guarantee solutions for ALL of the aforementioned problems (and more), you are going to find yourself in a situation similar to the one described by this Redditor, and that is not a fun place to be.

5) LLMs are "data blind" -- this is a big one.
They can't really "see" the data the way people are assuming they can.
This is arguably one of the hardest problems to solve.
They are looking at data through a series of keyholes, and worse yet, you do not know which ones.
Therefore, the data must be presented to them in a manner that forces them to actually see and consider all the facets of an arbitrary dataset completely. They cannot be relied on to discover it completely on their own.
All this must happen while fitting inside context windows, avoiding attention diffusion (aka noise), and staying coherent over many turns, revisions, and progressively larger and more complex data sources.
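
A minimal sketch of what "forcing the model to see every facet" inside a fixed budget could look like. This is not the harness described in the thread; the function names (`profile_rows`, `render_profile`), the list-of-dicts input shape, and the character budget standing in for a context window are all assumptions made for illustration. It profiles every column rather than showing a few sample rows, then renders one line per column until the budget runs out.

```python
from collections import Counter


def profile_rows(rows, top_n=3):
    """Summarize every column of a list-of-dicts dataset (hypothetical helper).

    Assumes cell values are hashable scalars (str, int, float, None).
    """
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    profile = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        stats = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "top": Counter(present).most_common(top_n),
        }
        # Only report a range when every present value is numeric.
        if numeric and len(numeric) == len(present):
            stats["min"] = min(numeric)
            stats["max"] = max(numeric)
        profile[col] = stats
    return profile


def render_profile(profile, max_chars=2000):
    """Render one line per column, stopping when the next line would not fit.

    max_chars is a crude stand-in for a context budget. The omission note
    itself is not counted against the budget.
    """
    lines = []
    used = 0
    for col, s in profile.items():
        line = (
            f"{col}: rows={s['rows']} missing={s['missing']} "
            f"distinct={s['distinct']}"
        )
        if "min" in s:
            line += f" min={s['min']} max={s['max']}"
        line += " top=" + ", ".join(f"{v!r}x{n}" for v, n in s["top"])
        if used + len(line) + 1 > max_chars:
            # Say explicitly what was left out instead of silently truncating.
            lines.append(f"... {len(profile) - len(lines)} column(s) omitted")
            break
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

The explicit omission note is the point of the design: if something does not fit, the model is told which keyhole it is looking through rather than left to guess.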

@itsyourcode Different parts of a system to build the system for teaching kids based on their individual needs. Middle school teacher

@itsyourcode True, but this was just an example.
My way of doing it is: envision the correct version of your intent. Drill down on exactly what you want to build. The end goal. Is it a CLI, core plus adapter, or whatever.
And then you go in a vsto / htn approach until you are done.

@itsyourcode Also, we now live in a time where we will see faster and faster iterations of cheaper and better LLMs come out. Just have a look at qwen3.5 and minimax 2.7
We are off to the races. Self-improving LLMs based on auto research and improved versions from Karpathy

@itsyourcode Might be, but there are only so many hours in a day.
And imo you are always thinking about the same things:
Tech debt, gaps, code smell, user stories, bugs.
So I could just feed prompts to address these aspects and I am good? Spending my time on the meta or meta meta level.




