Definite (@definiteapp)
57 posts

A data team that never sleeps.

Philly · Joined December 2022
31 Following · 474 Followers

Definite reposted
Mike Ritchie (@thisritchie)
Adding data sources is the most frustrating part of setting up analytics. Fortunately, Opus 4.5 is really good at it:

Definite reposted
Mike Ritchie (@thisritchie)
we're building our agent (@definiteapp) with claude code. We use Skills to tell Claude how to use features within our product. This keeps the main CLAUDE.md small, while still giving the agent all the details it needs. e.g. we have a data-lineage skill and claude knows how and when to use it based on context in the app

Definite reposted
Mike Ritchie (@thisritchie)
The average company buys Snowflake, then Fivetran, then a BI tool, then spends 6 months connecting them. We spin it all up in 30 seconds.

Definite reposted
Mike Ritchie (@thisritchie)
This comment on HN precisely articulates why text-to-sql fails. You need "codification" (i.e. semantic layer) to use agents in analytics. Having an LLM write bespoke SQL to answer every question will fail fast. e.g. if you ask for "revenue by month" against a Snowflake warehouse with hundreds of tables, you are guaranteed to get different answers over multiple attempts. @definiteapp uses an agent to help you build a semantic layer so you get the same MRR every time you ask about MRR.
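
A minimal sketch of the "codification" idea, under stated assumptions: the metric is defined once in a registry, and every request compiles to the same SQL instead of having an LLM write a fresh query. The names `Metric`, `METRICS`, `compile_metric`, and the `subscriptions` table are hypothetical illustrations, not Definite's actual semantic layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A metric definition that is written once and reused for every question."""
    name: str
    table: str        # hypothetical source table
    expression: str   # SQL aggregate that defines the metric
    time_column: str  # column used for time grouping


# Hypothetical registry; in practice this would live in version-controlled config.
METRICS = {
    "mrr": Metric(
        name="mrr",
        table="subscriptions",
        expression="SUM(monthly_amount)",
        time_column="period_start",
    ),
}


def compile_metric(metric_name: str, grain: str = "month") -> str:
    """Compile a metric request into SQL deterministically (no LLM involved)."""
    if grain not in {"day", "week", "month", "quarter", "year"}:
        raise ValueError(f"unsupported grain: {grain}")
    m = METRICS[metric_name]
    return (
        f"SELECT DATE_TRUNC('{grain}', {m.time_column}) AS period, "
        f"{m.expression} AS {m.name} "
        f"FROM {m.table} GROUP BY 1 ORDER BY 1"
    )


if __name__ == "__main__":
    # The same request always yields the same query text.
    print(compile_metric("mrr"))
```

In this sketch an agent would only choose the metric name and grain; the definition of MRR itself never changes between questions.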
[Attached image]

Definite reposted
Mike Ritchie (@thisritchie)
It shouldn’t take 6 months and 5 vendors to make sense of your data. @definiteapp raised $10M to give you an AI-native data platform in an afternoon.

Why does it normally take so long? Because the modern data stack is split across three heavyweight products:
1. Data Warehouse: a place to store all your data
2. ETL: pipelines to get data into your warehouse
3. BI: a place to build reports & dashboards

Each of these comes from a different vendor, each with its own price tag and complexity. On top of that, you need data engineers and analysts to stitch them together and keep the system running.

Let’s look at one of our customers, a B2B SaaS company that raised $20M. Their current tech stack looks like that of many of our customers:
* Postgres for their application database
* HubSpot for CRM
* Stripe for payments
* A slew of other sources (customer success platform, Google Sheets, and event data)

They wanted the same thing every company wants: a single source of truth for how the business is performing. Traditionally, this is done with dashboards. Their plan was to hire a few data people, buy a bunch of tools (Snowflake, Tableau and Fivetran) and hope for the best. Six months and a lot of pain later, maybe they’d have a few dashboards.

Fortunately, they tried Definite one morning and by that afternoon their key metrics were live.

How do we do it? Definite ships a full data stack (warehouse + pipelines + metrics & reports) in a single app. We’re built on the best open source infrastructure in the market with an AI agent that’s eager to make sense of your data. Instead of hiring, buying, stitching, and waiting, you have the answers you need in minutes, running on a data stack that scales. And it's just a chat away.

A huge thank you to our partners at @costanoavc (@JohnCowgill) and @AcrewCapital (@asadkhaliq), to our incredible team, and to our early design partners who have helped shape the product.

And we’re hiring. If you’re a high-agency builder who wants to help reinvent analytics for thousands of operators, let’s talk -> mike@definite.app
[Attached image]

Definite reposted
Mike Ritchie (@thisritchie)
We just open-sourced our @meltanodata target for Ducklake (from @duckdb).
* Type conversion is automatic (timestamps stay timestamps).
* Append or merge: choose at runtime.
* Storage is portable (S3, GCS, or local).
* Works with Postgres, MySQL, SQLite, or DuckDB catalogs.
* Timestamp and categorical partitions are built in.
We're already running it in production here at @definiteapp.

Definite reposted
Mike Ritchie (@thisritchie)
It's a massive pain to be "data driven". You need:
* a place to store all your data (datalake)
* pipelines to get data into the datalake (ETL)
* a workspace to munge data and share analysis (BI)
* a data team to set all this up and answer questions
Skip all of this. Get @definiteapp.
1. Let's start by adding Stripe:

Definite reposted
Mike Ritchie (@thisritchie)
We start you on third base at @definiteapp. You get a base data model for all our most popular data sources (e.g. @stripe, @HubSpot, @attio, Quickbooks, etc.). For example, our Stripe model covers MRR, churn, expansion revenue, and lifetime value. You can use all these metrics instantly after you connect Stripe. If you want to change the way discounts are handled, it's as easy as updating the SQL.
Peer Richelsen@peer_rich

the data you get from @stripe via API is so fucking messy its incredibly hard to replicate something as easy as MRR why don't you give me some data endpoints that just return: month | MRR its ridiculous
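
As a rough illustration of what a "month | MRR" table involves, here is a minimal sketch that derives it from simplified subscription records, with a flag for how discounts are handled. The `Subscription` record shape and the function names are assumptions made for this example; they are neither Stripe's API schema nor Definite's Stripe model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subscription:
    """Simplified, hypothetical subscription record (not Stripe's schema)."""
    monthly_amount_cents: int  # list price per month, in cents
    discount_pct: int          # e.g. 20 for 20% off
    start_month: str           # inclusive, "YYYY-MM"
    end_month: Optional[str]   # exclusive, None if still active


def month_range(start: str, end: str):
    """Yield "YYYY-MM" strings from start (inclusive) to end (exclusive)."""
    year, month = map(int, start.split("-"))
    while f"{year:04d}-{month:02d}" < end:
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1


def mrr_by_month(subs, through_month: str, apply_discounts: bool = True):
    """Return {month: MRR in cents} for months before through_month."""
    totals = defaultdict(float)
    for s in subs:
        amount = float(s.monthly_amount_cents)
        if apply_discounts:
            amount = s.monthly_amount_cents * (100 - s.discount_pct) / 100
        end = min(s.end_month or through_month, through_month)
        for month in month_range(s.start_month, end):
            totals[month] += amount
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    subs = [
        Subscription(10_000, 0, "2024-01", None),
        Subscription(5_000, 20, "2024-02", "2024-03"),
    ]
    for month, mrr in mrr_by_month(subs, "2024-04").items():
        print(month, "|", mrr / 100)
```

Flipping `apply_discounts` is the kind of one-line change the post describes: the metric definition lives in one place, so changing discount handling changes every report consistently.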

Definite reposted
Mike Ritchie (@thisritchie)
@rauchg @definiteapp is v0 for data. spin up a data lake, pipelines (e.g. sync your stripe data) and dashboards all via our agent.

Definite reposted
Mike Ritchie (@thisritchie)
our demos at @definiteapp are a little longer than this, but the vibe is identical

Definite reposted
Mike Ritchie (@thisritchie)
We (@definiteapp) have a PR open to add predicate pushdown to the DuckDB Iceberg extension. This is a 0 to 1 change for many production @ApacheIceberg lakes.

@duckdb currently downloads every parquet file for every table referenced. This is... not ideal. Most queries would just bomb.

With predicate pushdown, we see:
* simple queries: from several seconds --> subsecond
* complicated queries: from OOM errors --> query actually executes
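
To illustrate why pushing a filter down to the scan helps, here is a simplified sketch of file pruning using per-file min/max bounds, the kind of statistics table formats such as Iceberg track for data files. It is not the code from the PR and does not reflect DuckDB internals; `DataFile` and `prune_files` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a manifest entry with per-column bounds."""
    path: str
    min_value: int  # lower bound of the filter column in this file
    max_value: int  # upper bound of the filter column in this file


def prune_files(files, lower=None, upper=None):
    """Keep only files whose [min, max] range can overlap lower <= x <= upper."""
    kept = []
    for f in files:
        if lower is not None and f.max_value < lower:
            continue  # every row in this file is below the requested range
        if upper is not None and f.min_value > upper:
            continue  # every row in this file is above the requested range
        kept.append(f)
    return kept


files = [
    DataFile("part-0.parquet", 1, 100),
    DataFile("part-1.parquet", 101, 200),
    DataFile("part-2.parquet", 201, 300),
]

# Filter: WHERE x BETWEEN 150 AND 180 -> only part-1.parquet must be read.
to_read = prune_files(files, lower=150, upper=180)
print([f.path for f in to_read])
```

Without the filter at scan time, all three files would be fetched and filtered afterwards, which is where the extra latency and memory pressure come from.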

Definite reposted
Mike Ritchie (@thisritchie)
i guess we need light mode to sell to enterprise. fine, here it is ---> definite.app

Definite reposted
Mike Ritchie (@thisritchie)
⚫ Definite 2.0

This is the biggest release since we started Definite and I’m insanely proud of it.

We’ve completely redesigned the product with meticulous attention to every detail. We’ve added scalable storage to @duckdb using @ApacheIceberg. And our AI Assistant (“Fi”) is now ready to help you get shit done.

It was painstaking and took months to get right, but the result is worth every minute. Incredibly proud of the team here for pulling this off.

Mike Ritchie (@thisritchie)
We did a ton of research on query engines for Iceberg while building @definiteapp. @stevowang wrote up our findings on: Snowflake, Trino, Spark and DuckDB. There's a deep dive on @duckdb (obviously) with a notebook you can run yourself. I'd love to see a tighter duck <---> iceberg integration (e.g. predicate pushdowns), but the extension is a great start. Full post here: definite.app/blog/iceberg-q… #DataAnalytics #DataLakehouse #Iceberg #Snowflake #Spark #Trino #DuckDB #DataEngineering
[Attached image]

Definite reposted
Mike Ritchie (@thisritchie)
Good read here on how and why @NotionHQ built a data lake. tldr: they saved over $1M by moving off Fivetran + Snowflake. They also set up CDC (change data capture), so ingestion times dropped from days to hours. If you read this and think "wow, that sounds great, but we'd never build this ourselves", check out @definiteapp (spoiler: we built it for you).
[Attached image]

Definite reposted
Mike Ritchie (@thisritchie)
If you're a SaaS founder or PM, you undoubtedly have "better analytics" somewhere on your roadmap.

Embedded analytics or "customer-facing analytics" is the data you show your customers within your app. It's a "slippery slope feature". Very easy to get started, hard to finish (note: it's never finished).
1. Where do you store the data to power analytics? Your production database? Your data warehouse? What if you don't have a data warehouse?
2. How do you get data your customers would want to see from external apps? e.g. you want to show them some data from HubSpot matched to your own app data.
3. How can you make it simple enough for non-technical users to get value from it?

@definiteapp has very good answers to these questions.
1. We spin up a data warehouse for you
2. We have 500+ connectors to pull in external data, and we support all major SQL databases as sources for your customer data
3. We have an AI assistant to answer ad hoc questions and a simple query builder to generate standard reports

Many of our customers at Definite have been asking to embed analytics from Definite into their own app, so we're making that easier! We've launched our API and Python SDK, and will be rolling out better embedding features (including our AI data assistant) soon!
[Attached image]