dltHub

334 posts

@dltHub

dltHub is the creator of data load tool (dlt)

Berlin · Joined November 2022
20 Following · 494 Followers

dltHub @dltHub ·
Curious what people here are actually using for AI coding 👀 Copilot? Cursor? Claude Code? We put together a super short (1-min) survey. 👉 dlthub.notion.site/3039fb8e23cf80…

dltHub @dltHub ·
Microsoft Fabric is great at compute & storage, but data quality enforcement is on you. Use WAP (write-audit-publish) to validate data before it hits the lakehouse. dlt handles schemas, business rules, uniqueness, PII, and monitoring so bad data never reaches analytics. dlthub.com/blog/microsoft…

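The write-audit-publish (WAP) pattern mentioned above can be sketched in plain Python. This is a stdlib-only illustration of the idea, not dlt's actual API; the specific rules (uniqueness on `id`, non-negative `amount`, an email scan on a `note` field) are hypothetical stand-ins for real business rules:

```python
import re

def audit(rows, key="id"):
    """Audit stage: collect violations instead of loading bad rows."""
    errors = []
    seen = set()
    email_re = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
    for i, row in enumerate(rows):
        if row.get(key) in seen:                       # uniqueness check
            errors.append((i, f"duplicate {key}"))
        seen.add(row.get(key))
        if row.get("amount", 0) < 0:                   # business rule
            errors.append((i, "negative amount"))
        if email_re.search(str(row.get("note", ""))):  # naive PII scan
            errors.append((i, "possible email PII in free text"))
    return errors

def write_audit_publish(rows, publish):
    """Publish to the lakehouse only if the audit passes."""
    errors = audit(rows)
    if errors:
        raise ValueError(f"audit failed: {errors}")
    publish(rows)

published = []
write_audit_publish(
    [{"id": 1, "amount": 10, "note": "ok"}, {"id": 2, "amount": 5, "note": "fine"}],
    published.extend,
)
```

The key design point is that the audit runs on a staged copy of the data, so a failed check blocks the publish step rather than corrupting downstream tables.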
dltHub @dltHub ·
LLMs follow the gravity of your vocabulary, not your business logic. That's why your AI-generated stack looks great in the demo and breaks silently in prod. dlthub.com/blog/unvibe

dltHub @dltHub ·
@Snowflake Next modules:
• Build ingestion workflows with the Snowflake Native App
• Run pipelines in Snowsight notebooks
Module inspired by a great article from Martin Seifert 🙏 sfrt.io/can-you-run-dl…

dltHub @dltHub ·
❄️ Module 2 is live in our course: dlt + @Snowflake. Learn how to run dlt pipelines inside Snowflake using Snowpark Container Services (SPCS), enabling native execution and scheduling with no external infrastructure required. Continue the course: dlthub.learnworlds.com/course/dlt-sno…

dltHub @dltHub ·
Small data teams deserve better tools. We're opening early design partnerships for solo and small data teams to try dltHub Pro before launch: early access, a say in the roadmap, and an early-bird discount. dlthub.com/solutions/for-…

dltHub @dltHub ·
Build a data pipeline locally with @DuckDB → ship it to @ClickHouseDB. Join @elviskahoro & Joshua Lee from @AltinityDB for a live demo using dlt:
• ingest APIs & DBs
• run locally
• promote to ClickHouse
• explore with @marimo_io
• run quality checks
📅 Mar 16 altinity.com/events/using-d…

Quoting Altinity @AltinityDB:
Prototyping on #DuckDB locally is great until you need to ship it. 🙃 On March 16, Josh Lee (OS Dev Advocate) & @elviskahoro (DevX at dltHub) will show how to promote your DuckDB pipeline to #ClickHouse® production with dlt. Free, 8 AM PT, & worth it: hubs.la/Q045kCGW0

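The promote-to-production flow described above works because the pipeline body stays the same and only the destination changes (in real dlt this is roughly `dlt.pipeline(destination="duckdb")` during prototyping versus `destination="clickhouse"` in production). A stdlib-only sketch of that idea; the writer classes and the `extract` source below are hypothetical stand-ins:

```python
def extract():
    """Pretend API extraction; a real pipeline would page through an API."""
    yield {"id": 1, "city": "Berlin"}
    yield {"id": 2, "city": "Paris"}

class LocalWriter:
    """Stand-in for a local DuckDB destination."""
    def __init__(self):
        self.rows = []
    def load(self, rows):
        self.rows.extend(rows)

class ClickHouseWriter(LocalWriter):
    """Stand-in for the production ClickHouse destination."""
    pass

def run_pipeline(destination):
    """Same pipeline body for dev and prod; only the destination differs."""
    writer = {"local": LocalWriter, "clickhouse": ClickHouseWriter}[destination]()
    writer.load(extract())
    return writer

dev = run_pipeline("local")        # prototype locally
prod = run_pipeline("clickhouse")  # promote to production
```

Because both writers expose the same `load` interface, promoting the pipeline is a one-line configuration change rather than a rewrite.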
dltHub reposted
Matthäus Krzykowski @matthausk ·
@mattarderne @tayloramurphy @dltHub We are trying to make this happen with our commercial product dltHub: dlthub.com/blog/llm-nativ…

dltHub @dltHub ·
@TheCesarCross @huggingface Exactly, schema inference is doing a lot of the heavy lifting here. dlt handles it automatically, so you get provenance without the usual setup overhead 🙌

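Schema inference of the kind mentioned here can be illustrated with a small stdlib-only sketch. dlt's real inference also handles nested structures, type variants, and schema evolution; this only shows the core step of deriving column types from sample rows, with a hypothetical rule that conflicting types widen to text:

```python
def infer_schema(rows):
    """Infer a column -> type-name mapping from sample rows."""
    schema = {}
    for row in rows:
        for col, val in row.items():
            t = type(val).__name__
            prev = schema.get(col)
            if prev is None or prev == "NoneType":
                schema[col] = t            # first non-null value wins
            elif prev != t and t != "NoneType":
                schema[col] = "text"       # conflicting types widen to text
    return schema

rows = [
    {"id": 1, "name": "ada", "score": 9.5},
    {"id": 2, "name": "bob", "score": None},
]
print(infer_schema(rows))  # → {'id': 'int', 'name': 'str', 'score': 'float'}
```

Nulls are skipped rather than treated as a type, which is why `score` still infers as `float` even though the second row is missing a value.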
cCross @TheCesarCross ·
@dltHub @huggingface Nice pipeline. Versioned Parquet keeps data provenance clear and reproducible.

dltHub @dltHub ·
Your production traces are gold, but they aren't training data yet. We turned raw agent traces into a specialist model using: dlt → @huggingface → Distil Labs. The result? A 0.6B model beating a 120B one by 28 points on a specific task. 🔥 Here's the pipeline ↓

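The first step of a traces-to-training-data pipeline is mostly reshaping JSON. A minimal stdlib-only sketch of that step; the trace fields (`input`, `output`, `status`) are hypothetical, and the real pipeline also does schema inference and incremental loading via dlt before anything reaches the Hub:

```python
import json

def traces_to_examples(trace_lines):
    """Convert JSONL agent traces into prompt/completion training pairs."""
    examples = []
    for line in trace_lines:
        trace = json.loads(line)
        if trace.get("status") != "success":   # keep only successful runs
            continue
        examples.append({
            "prompt": trace["input"],
            "completion": trace["output"],
        })
    return examples

traces = [
    '{"input": "classify: refund request", "output": "billing", "status": "success"}',
    '{"input": "classify: broken login", "output": "auth", "status": "error"}',
]
print(traces_to_examples(traces))
```

Filtering on outcome before training is the important part: only traces where the agent succeeded become supervision signal for the smaller specialist model.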
dltHub reposted
LanceDB @lancedb ·
@dlthub and @huggingface 🤗 just shipped a clean way to ingest Hugging Face datasets into LanceDB 🚀 Query datasets over hf:// with DuckDB, stream them in batches, and load them into LanceDB with embeddings generated during ingest. The result is a simple Python path from Hub dataset to a searchable, explorable table.

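The stream-in-batches part of that flow is a generic pattern worth seeing on its own. A stdlib-only sketch, assuming the source is just an iterable (in the real integration the rows would come from DuckDB reading over hf://, and each batch would be inserted into a LanceDB table):

```python
from itertools import islice

def batches(rows, size):
    """Yield fixed-size lists from any iterable, without materializing it all."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

loaded = []
for batch in batches(range(7), size=3):   # e.g. rows from a Hub dataset
    loaded.append(len(batch))             # stand-in for a LanceDB table insert
print(loaded)  # → [3, 3, 1]
```

Streaming in bounded batches keeps memory flat regardless of dataset size, which is what makes embedding-during-ingest practical for large Hub datasets.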
dltHub @dltHub ·
@tayloramurphy @mattarderne Thanks for the mention. We do think the Python practitioner should empower themselves with data, but BI also creates canonical models which, separate from reporting, may be very useful to DS. We've previously seen DS spend two months building a user table that already existed in the canonical model.

dltHub @dltHub ·
@huggingface Here's a great example of what you can build with it 👇 dlt → @huggingface → Distil Labs x.com/j_golebiowski/…

Quoting Jacek Golebiowski @j_golebiowski:
Your production LLM agent is already generating the training data for its own replacement. Together with @dltHub we built a pipeline that takes those traces and trains a small specialist model from them. A 0.6B model trained this way beat the 120B teacher by 28 points on exact match. 200x smaller, under 50ms locally. You provide traces + a task description. We handle the rest.

dltHub @dltHub ·
The "pipeline-d hug" 🤗 is here. We just launched our @huggingface integration in dlt, bridging two worlds that have been siloed for too long. Training data lives across production DBs, warehouses, and the HF Hub. Now you can connect them with simple Python pipelines. ↓

dltHub reposted
Matthäus Krzykowski @matthausk ·
We are putting dltHub Pro in the hands of early design partners. Two themes keep coming up:
→ Agentic data engineering changes businesses, e.g. consultants can now offer fixed-price projects
→ When AI makes building easy, you need AI-native tooling to ship to production

Quoting dltHub @dltHub:
The team at @TasmanAnalytics runs data engineering projects for mid-market and enterprise clients. Their biggest challenge? Scoping. Every new client meant figuring out which APIs to connect, how long it would take, and what the data looked like, often before seeing a single row.

dltHub reposted
Matthäus Krzykowski @matthausk ·
This is the pipeline that made it happen. Raw agent traces → @dltHub → schema inference, quality checks, incremental loads → @HuggingFace Hub → fine-tuned specialist. No large data engineering team. Pure Python. And now anyone can replicate it 👇 x.com/j_golebiowski/…

Quoting Jacek Golebiowski @j_golebiowski: (the same tweet quoted above)