Artem Keydunov

994 posts

@keydunov

Co-Founder & CEO @the_cube_dev, Agentic Analytics Platform powered by Semantic Layer https://t.co/vgaKOi7Vmo

San Francisco, CA · Joined August 2010
279 Following · 738 Followers
Pinned Tweet
Artem Keydunov @keydunov
Introducing Cube. The first platform where AI builds your data model for you. Get accurate answers, dashboards and reports automatically. No hallucinations. All grounded in the semantic layer. Cube is free to start at cube.dev
3 replies · 3 reposts · 9 likes · 748 views
Artem Keydunov @keydunov
Three months after GA: Transforming teams at Brex and Drata. 100x faster semantic modeling. 90% fewer AI hallucinations.
0 replies · 0 reposts · 1 like · 72 views
Artem Keydunov @keydunov
Cube is AI that writes your semantic layer so everything downstream is accurate. It scans your warehouse and models your data automatically, writes and adjusts metric definitions on the fly, answers questions grounded in your business context, and generates dashboards.
1 reply · 0 reposts · 1 like · 80 views
Artem Keydunov retweeted
Matt Hartman @MattHartman
If you are building or building *with* open source AI, join the first open-source AI conference, live streamed April 16th with speakers from the top open-source companies in the world
3 replies · 18 reposts · 84 likes · 37.9K views
🦉DVC @DVCorg
🔗 DataChain open-source release
🤖 AI-Driven Data Curation: Local models, LLM APIs
🚀 GenAI Dataset scale: Millions and billions of files
🐍 Python-friendly: Python objects instead of JSON
Try it out github.com/iterative/data… 👇1/7
10 replies · 22 reposts · 65 likes · 7.6K views
Artem Keydunov @keydunov
Having a semantic layer on top of your SQL data warehouse provides 6x higher accuracy for LLM-powered applications.
A group of researchers from data.world has published a paper studying the accuracy of LLMs in answering questions with a text-to-SQL pattern and comparing those results with the accuracy achieved when a knowledge graph is introduced into the process. Text-to-SQL achieved an accuracy of 16% directly. However, the accuracy increased to 54% when questions were posed over a Knowledge Graph representation of the SQL dataset.
In addition to the results, the paper also presents the detailed methodology for evaluating the accuracy of LLMs in answering business questions asked of a database. They provide a schema, sample data, natural language questions, and expected answers in a format that makes their experiment fairly easy to replicate.
Our partners at Delphi, who have built an AI-enabled analytics application on top of semantic layers, quickly realized that this methodology could be adapted to benchmark the accuracy of their own solution: Text-Delphi-Cube-SQL. They have recently published some interesting findings of their own, demonstrating that they were able to achieve an outstanding 100% accuracy with Delphi connected to a subset of the data.world test modeled in @the_cube_dev as the semantic layer.
This incredible result validates the ability of semantic layers to provide the necessary context to improve LLM accuracy. cube.dev/blog/semantic-…
0 replies · 0 reposts · 8 likes · 363 views
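Editor's note: the multipliers quoted in this post and the next one can be checked directly from the percentages given. The short Python sketch below does that arithmetic. Treating the "6x" headline as the ratio of Delphi's 100% to the 16% text-to-SQL baseline is an assumption on the editor's part; the post does not say which two numbers it compares. The dictionary keys and the function name are illustrative only.

```python
# Accuracy figures quoted in the post, in percent of questions answered correctly.
ACCURACY = {
    "text_to_sql": 16.0,        # direct text-to-SQL baseline
    "knowledge_graph": 54.0,    # questions posed over a knowledge graph
    "delphi_with_cube": 100.0,  # reported on a subset of the benchmark only
}


def improvement_factor(improved: float, baseline: float) -> float:
    """Return how many times higher `improved` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return improved / baseline


# Ratio behind the "3x" figure: knowledge graph versus plain text-to-SQL.
kg_factor = improvement_factor(ACCURACY["knowledge_graph"], ACCURACY["text_to_sql"])

# Assumed ratio behind the "6x" figure: Delphi plus Cube versus plain text-to-SQL.
cube_factor = improvement_factor(ACCURACY["delphi_with_cube"], ACCURACY["text_to_sql"])

print(f"knowledge graph vs text-to-SQL: {kg_factor:.3f}x")
print(f"Delphi + Cube vs text-to-SQL:   {cube_factor:.3f}x")
```

Under that reading, 54/16 is a little over 3 and 100/16 is a little over 6, which matches the rounded "3x" and "6x" claims. The 100% result covers only a subset of the benchmark, so the two ratios are not measured on identical question sets.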
Artem Keydunov @keydunov
The usage of semantic layers for LLMs is gaining momentum. Just this week, multiple visualization vendors, including AWS QuickSight, have released semantics support in their products to enhance LLMs' accuracy atop proprietary SQL data warehouses.
The necessity for the semantic layer is backed by a recent research paper from the data.world team. This paper introduces a benchmark comparing the question-answering accuracy of Text-to-SQL versus Text-to-Knowledge-Graph-to-SQL. The results show that using the knowledge graph provides 3x higher accuracy compared to direct text-to-SQL.
I’m excited to speak at the @AirbyteHQ move(data) conference next week about the history of semantic layers and their critical role in the future of AI-powered data stacks. movedata.airbyte.com/event/semantic…
0 replies · 4 reposts · 10 likes · 767 views
Artem Keydunov @keydunov
Yesterday, AWS announced Amazon Q in QuickSight. It uses gen AI to access data with natural language. It's exciting to see, and I’m certain more BI tools will follow suit. But it creates a big problem one might not see on the surface.
Amazon Q requires an extensive setup and data preparation to ensure accurate operation. It makes you describe every column in your database, provide synonyms, and define metrics and business concepts. All this metadata is stored inside QuickSight.
With other BI tools introducing AI-powered capabilities, scattered semantics will become an even bigger problem than it is today. Every BI tool will require data teams to input semantics so its AI can operate correctly. The amount of duplicated semantic metadata across multiple BI tools will grow exponentially.
BI fragmentation will not go anywhere, and organizations will adopt more and more BI and visualization tools over time. The only solution to keep semantics DRY is to use a universal semantic layer that can work across multiple data sources and multiple data visualization tools.
0 replies · 1 repost · 6 likes · 694 views
Artem Keydunov @keydunov
#4 is exactly the feedback we are receiving from our users and community - organizations need a semantic layer across the stack, both for humans and for LLMs now.
Quoting Tomasz Tunguz @ttunguz:

At the IMPACT Summit yesterday, I shared our Top 10 Trends for Data in 2024.
1. LLMs Transform the Stack: Large language models transform data in many ways. First, they have driven an increased demand for data and are causing a complete architecture inside companies. Second, they change the way that we manipulate data. Analysts will use automated data analysis, and it will be an expected tool in every product: notebooks, BI, databases, etc. If you’re curious about the evolution of the LLM stack or the requirements to build a product with LLMs, please see Theory’s series on the topic here called From Model to Machine.
2. Data Teams are Becoming Software Teams: DevOps created a movement within software development that empowers developers to run the software they wrote. The same thing is happening in data. Products have filled those needs by mapping each of the core functions and responsibilities in the dev movement data ops. Most sophisticated data teams run like software engineering teams with product requirement documents, ticketing systems, & sprints.
3. Data Products: The combination of large language models and data teams becoming software teams has led to data products. Whether it’s data being used inside applications, feeding machine learning models, or downstream analysis, companies are increasingly reliant on this data, and that’s not changing. 80% of data is unstructured within organizations. LLMs are fantastic first-pass filters and phenomenal classifiers that extract insight or build machine learning features from unstructured data like customer support conversations or sales calls.
4. The Semantic Model Becomes a Must-Have: Semantic models unify a single definition across an organization for a particular metric. Looker did this within the context of a BI system. But organizations need this layer across the stack. In addition to the reusability of definitions, composability - creating complex analysis with simple building blocks - will define this layer, both for humans who find it easier to understand and for large language models that synthesize semantics.
5. Instrumentation and Governance Enable Many New Use Cases: Today’s data leaders are struggling. Executive teams and boards are demanding innovation with LLMs and data. Meanwhile, regulation and compliance mean the governance burden only increases. Software startups are rising to meet the need. Data contracts encode the data interchange between two different departments (Gable). BI systems marry the centralized control of data teams with the ability to define and promote metrics at the edge of an organization (Omni). Observability systems measure the uptime of pipelines and detect anomalies (Monte Carlo). Semantic understanding of code and ephemeral developer environments enables data engineers to reduce costs and work more fluidly together (SQLMesh).
6. The Pendulum Swings to Small Data: Modern Mac laptops have the same computational power as the AWS servers Snowflake used to launch the company. Since most workloads are small, data teams will use in-process, in-memory/in-process databases to analyze data and move data. They are faster to get started (no account creation), they can scale very quickly, and they can rise to enterprise levels with commercial cloud offerings.
7. Cost Pressures Continue: The dominant theme of 2023 is doing more with less. Looking at Snowflake’s net dollar retention over the last few years, it’s clear exactly when the office of the CFO became an important voice within the data world. This is leading to a trifurcation of workloads: offloading workload from the most expensive queries to less expensive query engines (in-memory & data lakehouses) where slightly higher latencies and different performance characteristics work well.
8. Juggernauts Dueling: Whether it’s Snowflake vs Databricks competing over structured data workloads, or Microsoft Fabric and Databricks competing over unstructured large scale data processing, or Google and Amazon competing over LLM deployments technologies, or Microsoft and OpenAI cooperating/competing in the enterprise, 2024’s data landscape will be shaped by these battles.
9. Consolidation: Data companies have produced a huge amount of consolidation in the last few years, and given the competitive dynamics, the rapid growth rates within the ecosystem, which are significantly faster than overall software spend, higher multiples afforded to these businesses, we should expect to see a lot of M&A in 2024.
10. The Decade of Data Continues: The pace of innovation within the data world continues to accelerate due to data. And so the decade of data continues.
Google Slides: docs.google.com/presentation/d…

0 replies · 3 reposts · 8 likes · 1.2K views
Artem Keydunov @keydunov
At dbt Coalesce last week, I gave a talk on designing the semantic layer and its integration into the data stack. Following the talk, @brianbickell and I delved deeper into this topic, making a case for an open standard for the semantic layer - cube.dev/blog/the-need-…
0 replies · 0 reposts · 1 like · 231 views
Artem Keydunov @keydunov
The Cube team is at the #Coalesce23 conference this week! We're excited to talk about all things semantic layer and showcase our new integration with dbt. If you're here too, come by our booth and say hello!
0 replies · 0 reposts · 6 likes · 345 views
Artem Keydunov @keydunov
Excited to launch our dbt integration today! 🚨 🚨 🚨 In this blog post, we announce the integration and lay out a framework for using transformations and the semantic layer together. cube.dev/blog/introduci…
0 replies · 0 reposts · 0 likes · 116 views
Artem Keydunov @keydunov
I’m excited to finally announce first-class support for Python in Cube 🚀 🚀 🚀 Now, you can fully configure Cube using just YAML and Python; no JavaScript required. cube.dev/blog/introduci…
0 replies · 0 reposts · 5 likes · 249 views
Artem Keydunov @keydunov
🚀 Exciting news! Cube is taking BI tool integration to the next level. Introducing Semantic Layer Sync with Tableau! We're thrilled to support Tableau users in their data journeys. cube.dev/blog/introduci…
0 replies · 1 repost · 4 likes · 168 views
Artem Keydunov retweeted
Stacey Wueste @staceywueste
16,000+ @github stars agree 🔥 @the_cube_dev semantic layer plays a key role in ensuring correctness & predictability when building text-to-sql LLM-based apps. Cube's integrations w/ @langchain and @duckdb make it easy to get started with building high quality AI applications
1 reply · 4 reposts · 10 likes · 693 views
Artem Keydunov @keydunov
Excited to share that Cube has been named a Leader and Fast-Mover in GigaOm's Sonar report for Semantic Layers and Metrics Stores. 🚀🚀🚀 cube.dev/blog/cube-reco…
0 replies · 2 reposts · 6 likes · 291 views
Artem Keydunov @keydunov
Excited to launch our integration with LangChain today! 🚀 The semantic layer plays a key role in ensuring correctness and predictability when building text-to-sql LLM-based applications. cube.dev/blog/introduci…
1 reply · 7 reposts · 16 likes · 1.7K views