Harsha Chintalapani

2.9K posts

@d3fmacro

Co-Founder of Collate, OpenMetadata Committer, PMC of Apache Kafka, Storm. Ex-Uber, Hortonworks, Mozilla, Yahoo! Apache Kafka & Apache Storm Committer & PMC

Joined August 2014
308 Following · 489 Followers
Harsha Chintalapani retweeted
CollateData (@CollateData)
We polled data leaders on AI and governance integration. 100% were in the bottom two tiers. AI that can't read your governance context infers instead, and does so confidently. Here's what fixing that looks like: buff.ly/PYMWHob #DataGovernance #SemanticIntelligence
Harsha Chintalapani retweeted
CollateData (@CollateData)
Collate 1.12 just dropped the most detailed data diff tool I have seen.
Here is the problem: your dbt model runs, the data looks different, and you have no idea why. You export both tables to Excel, create VLOOKUP formulas, and spend an hour manually comparing rows. Or worse, you just hope the transformation worked correctly.
The new data diff in Collate 1.12 ends that nightmare.
What it shows you:
• Column-level differences with color coding (renamed fields, missing columns)
• Row-by-row comparison of matching records
• Character-level differences within cells
The demo compares a raw customer table against a transformed one. You can see that "ID" became "customer ID," spot exactly which rows are missing data, and identify character-level differences at a glance.
No exporting. No spreadsheets. No guesswork.
What surprised me: this level of granularity simply does not exist on other data observability platforms. They tell you tables differ. Collate shows you precisely how they differ.
If you are debugging pipelines or validating transformations, this is a massive time saver.
🎥 Watch and see: youtu.be/4xsM9qpEmvY
#DataQuality #DataObservability #DataEngineering #DBT #DataManagement
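The three comparison levels named in the tweet above (column, row, character) can be illustrated with a minimal Python sketch. This is not Collate's implementation or API: the function names, the sample rows, and the choice of Python's standard `difflib` for the character-level step are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def column_diff(left_cols, right_cols):
    """Return columns only on the left, only on the right, and shared by both."""
    left, right = set(left_cols), set(right_cols)
    return sorted(left - right), sorted(right - left), sorted(left & right)


def char_diff(a, b):
    """List the non-equal edit operations that turn string a into string b."""
    ops = SequenceMatcher(None, a, b).get_opcodes()
    return [(tag, a[i1:i2], b[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def row_diff(left_rows, right_rows, key_left, key_right, shared):
    """Match rows on a key and report per-cell differences for shared columns."""
    right_by_key = {row[key_right]: row for row in right_rows}
    report = {}
    for row in left_rows:
        match = right_by_key.get(row[key_left])
        if match is None:
            report[row[key_left]] = "missing in right table"
            continue
        cells = {
            col: char_diff(str(row[col]), str(match[col]))
            for col in shared
            if row[col] != match[col]
        }
        if cells:
            report[row[key_left]] = cells
    return report


# Hypothetical sample data: the key column "ID" was renamed to "customer ID".
raw = [
    {"ID": 1, "email": "ann@example.com"},
    {"ID": 2, "email": "bob@example.com"},
    {"ID": 3, "email": "cy@example.com"},
]
transformed = [
    {"customer ID": 1, "email": "ann@example.com"},
    {"customer ID": 2, "email": "bob@example.org"},
]

only_raw, only_new, shared = column_diff(raw[0], transformed[0])
report = row_diff(raw, transformed, "ID", "customer ID", shared)
print(only_raw, only_new, shared)
print(report)
```

The sketch matches rows through a key that the caller supplies for each side, which is how it tolerates the renamed column. Detecting that rename automatically is a separate problem that this example does not attempt.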
Harsha Chintalapani retweeted
OpenMetadata (@open_metadata)
Data quality testing has been significantly upgraded in OpenMetadata 1.12.
If you have ever written a custom SQL query for a data quality check, copied it 50 times for different tables, and then tried to explain to business users how to run it... you know the pain.
The new Test Library changes the game. Here is what it does:
• Turns your custom SQL into reusable test templates with dynamic variables
• Makes those tests available to business users through a simple UI (no code required)
• Gives you control over which tests are enabled across your organization
The demo shows a column-level test built with {{table_name}} and {{column_name}} variables that resolve at runtime, along with a custom parameter that users can set when applying the test. Apply it once, scale it everywhere.
Data quality should not require a computer science degree to implement. This is how you make it accessible.
🎥 Watch and see: buff.ly/wPb1Xoj
#DataQuality #DataGovernance #DataEngineering #OpenSource #DataManagement
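To make the idea of runtime-resolved template variables concrete, here is a minimal Python sketch of how a SQL test template with {{table_name}} and {{column_name}} placeholders might be rendered. It is not the OpenMetadata Test Library's code: `render_template`, the example query, and the `min_length` custom parameter are hypothetical names chosen for illustration.

```python
import re

# Matches placeholders such as {{table_name}} or {{ column_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template, variables):
    """Replace each {{name}} placeholder with its value; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for placeholder '{name}'")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical template: counts rows whose column is null or too short.
template = (
    "SELECT COUNT(*) FROM {{table_name}} "
    "WHERE {{column_name}} IS NULL OR LENGTH({{column_name}}) < {{min_length}}"
)

# table_name and column_name would come from the asset the test is applied to;
# min_length stands in for the user-set custom parameter.
query = render_template(
    template,
    {"table_name": "customers", "column_name": "email", "min_length": 5},
)
print(query)
```

Plain text substitution like this is only safe when the values are trusted. A real system would validate identifiers and bind user-supplied values as query parameters.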
Harsha Chintalapani retweeted
CollateData (@CollateData)
Data quality testing has been significantly upgraded in Collate 1.12.
If you have ever written a custom SQL query for a data quality check, copied it 50 times for different tables, and then tried to explain to business users how to run it... you know the pain.
The new Test Library changes the game. Here is what it does:
• Turns your custom SQL into reusable test templates with dynamic variables
• Makes those tests available to business users through a simple UI (no code required)
• Gives you control over which tests are enabled across your organization
The demo shows a column-level test built with {{table_name}} and {{column_name}} variables that resolve at runtime, along with a custom parameter that users can set when applying the test. Apply it once, scale it everywhere.
Data quality should not require a computer science degree to implement. This is how you make it accessible.
🎥 Watch and see: youtu.be/021XrWkrFvM
#DataQuality #DataGovernance #DataEngineering #OpenSource #DataManagement
Harsha Chintalapani retweeted
CollateData (@CollateData)
🎥The new episode of Data 30 with Jo Perez is a great explanation of Knowledge Graphs, what they are, and why you should care. Dive into some real-world use cases on how Knowledge Graphs can impact your organization. 👉 Watch here: buff.ly/UMT2UbX
Harsha Chintalapani retweeted
Kim-Mai Cutler (@kimmaicutler)
Building a software company was a marathon and now it requires sprinting the entire marathon distance.
Harsha Chintalapani retweeted
CollateData (@CollateData)
🚀 Collate version 1.12 has enhanced the auto-classification workflow, helping teams automatically identify PII and sensitive data.
With the newly surfaced auto-classification engine, your team can now see exactly which recognizer flagged a column, what pattern it matched, how many times it matched, and how confident the system is in that tag. It supports human-in-the-loop so you can review tags to make sure they’re as expected.
Key takeaways:
✅ Get full visibility into built-in NLP recognizers (credit card, email, and more) with the ability to enable, disable, or customize them
🎯 Support for your own custom recognizers using regex patterns, keyword terms, or column name matching
💡 An auditable tag feedback loop so governance teams can continuously improve classification accuracy
This is what responsible data governance actually looks like: not just automation, but automation you can understand, customize, and trust.
If your team is managing sensitive data at scale, this video is worth a look. Watch the full demo: youtu.be/CUsY_Zi3kJA
#DataGovernance #DataCatalog #SensitiveData #PIIDetection #DataCompliance #Collate #MetadataManagement
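As a rough illustration of what a custom recognizer could report (which pattern matched, how many times, and with what confidence), here is a minimal Python sketch. It does not use Collate's recognizer API. The `Recognizer` class, the `classify` function, the tag name, and the choice to define confidence as the fraction of sampled values that match are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Recognizer:
    """Hypothetical custom recognizer: one value regex plus one column-name regex."""
    tag: str
    value_pattern: str        # applied to each sampled cell value
    column_name_pattern: str  # applied to the column name


def classify(column_name, samples, recognizer):
    """Report which pattern matched, how often, and a simple confidence score."""
    value_re = re.compile(recognizer.value_pattern)
    match_count = sum(1 for value in samples if value_re.fullmatch(str(value)))
    name_hit = re.search(recognizer.column_name_pattern, column_name, re.IGNORECASE)
    # Assumption: confidence is simply the share of sampled values that matched.
    confidence = match_count / len(samples) if samples else 0.0
    return {
        "tag": recognizer.tag,
        "pattern": recognizer.value_pattern,
        "match_count": match_count,
        "sample_size": len(samples),
        "column_name_matched": name_hit is not None,
        "confidence": round(confidence, 2),
    }


email = Recognizer(
    tag="PII.Email",
    value_pattern=r"[^@\s]+@[^@\s]+\.[^@\s]+",
    column_name_pattern=r"e-?mail",
)
samples = ["ann@example.com", "bob@example.org", "n/a", "cy@example.com"]
result = classify("contact_email", samples, email)
print(result)
```

Returning the evidence alongside the tag is what makes a human review step practical: a reviewer can see why a column was flagged before accepting or rejecting the tag.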
Harsha Chintalapani retweeted
OpenMetadata (@open_metadata)
The following is from OpenMetadata's February Community Meeting, held on Wednesday, February 25, 2026, @ 9:00 AM PST. Catch the next OpenMetadata Community Meetup @ buff.ly/DOUI5Kj In this video, Collate’s Pere Miquel Brull introduced Collate’s AI Studio and OpenMetadata’s new AI SDK, enabling enterprises to build, deploy, customize, and tune AI agents to their unique data environments. 🎥 Watch here: buff.ly/aMaA6n2
Harsha Chintalapani retweeted
OpenMetadata (@open_metadata)
🚀 AI SDK is part of the 1.12 release of OpenMetadata.
Avoid the common struggles of building AI applications that automate work for your data practitioners, especially for agents that need to understand data meaning and business context. AI SDK streamlines agent development in Python, Java, and TypeScript by providing a self-contained agent framework that leverages OpenMetadata's semantic intelligence to ensure your agents take the right action on the right data.
In this demo, we show that the OpenMetadata AI SDK provides an easier way to build AI applications that leverage the OpenMetadata semantic metadata graph to understand data meaning and business context, achieving accurate results.
Key takeaways:
✅ Building agents is easy with the OpenMetadata Semantic Intelligence Platform with no external dependencies
🎯 Develop any type of agent that leverages the intelligence in the OpenMetadata platform
💡 Simplify data management tasks that normally take hours and days down to minutes
🎥 Watch here: buff.ly/h1bev5d
#DataGovernance #AI #DataManagement #OpenMetadata #Collate
Harsha Chintalapani retweeted
CollateData (@CollateData)
🚀 AI SDK is part of the 1.12 release of Collate.
Avoid the common struggles of building AI applications that automate work for your data practitioners, especially for agents that need to understand data meaning and business context. AI SDK streamlines agent development in Python, Java, and TypeScript by providing a self-contained agent framework that leverages Collate's semantic intelligence to ensure your agents take the right action on the right data.
In this demo, we show that the Collate AI SDK provides an easier way to build AI applications that leverage the Collate semantic metadata graph to understand data meaning and business context, achieving accurate results.
Key takeaways:
✅ Building agents is easy with the Collate Semantic Intelligence Platform with no external dependencies
🎯 Develop any type of agent that leverages the intelligence in the Collate platform
💡 Simplify data management tasks that normally take hours and days down to minutes
🎥 Watch here: youtu.be/BwRLKZ8QkT4
#DataGovernance #AI #DataManagement #OpenMetadata #Collate
Harsha Chintalapani retweeted
CollateData (@CollateData)
Most platforms say "AI-powered" but what they really mean is a complex setup for a black box you can't see inside or adjust.
Collate AI Studio is different. You get a visual interface for building and customizing AI agents. No code. No waiting for engineering. Just define, configure, and deploy.
Here's what that actually looks like:
🎯 Customize out-of-the-box agents (documentation, quality, etc.) with your own prompts and styles
🎯 Build completely new agents for your specific use cases. The demo shows a GDPR compliance agent that validates lineage and handles data deletion requests
🎯 Control exactly what each agent can do with bot permissions that work like service accounts
🎯 Build applications with Collate AI SDK in Python, Java, or TypeScript to integrate agents into your existing workflows
The agents understand your actual data because they use Collate's semantic metadata graph, not just generic LLM training data. So when you ask about GDPR compliance, the answer references your real tables, columns, and lineage relationships.
Worth a look if you're tired of AI that sounds smart but doesn't actually know your business.
🎥 Watch and see: youtu.be/bXvj25GBDT4
#DataGovernance #AI #DataManagement #Automation
Harsha Chintalapani retweeted
CollateData (@CollateData)
AI hallucinations aren't a model problem. They're a semantic chaos problem.
When your data lacks unified meaning, AI agents make things up. Every time.
Our new white paper shows how leading enterprises are solving this by building semantic foundations that give AI the context it needs to be trustworthy.
Learn how to:
→ Build an enterprise knowledge graph for consistent AI understanding
→ Automate policy enforcement across distributed data estates
→ Create AI-ready data products through metadata-driven workflows
→ Establish semantic systems of record for human and AI workflows
Download now → getcollate.io/resources/whit…
#AITrust #DataGovernance #SemanticIntelligence
Harsha Chintalapani retweeted
Golden State Warriors (@warriors)
Every angle of THE SHOT. 38 Feet Deep: The Greatest Regular-Season Game in NBA History premieres TODAY at 12 pm PT. 🎬 youtu.be/QiyE4LiKihQ
Harsha Chintalapani retweeted
OpenMetadata (@open_metadata)
Announcing OpenMetadata 1.12: Standardize Quality Rules, Simplify Deployment, Embrace Open Standards 🚀
How many different ways has your organization written the same data quality test? We've released powerful capabilities that centralize test logic, eliminate Airflow complexity, and embrace industry standards, making OpenMetadata more accessible, standardized, and interoperable.
Highlights:
📚 Data Quality Test Library - Define SQL-based test templates with parameters once, apply them everywhere without rewriting queries. Admins create reusable templates through a GUI; users apply them through simple forms, no YAML required. When "ARR must exceed $10k" means the same thing across every table, governance becomes achievable. Ad-hoc testing out, standardized enforcement in.
☸️ Kubernetes Orchestrator - Deploy OpenMetadata without Airflow dependency using Kubernetes as the native scheduler. Infrastructure simplified from four components to three (application server, database, search index). Leverage clusters you already run, eliminate deployment friction. Cloud-native deployment, simpler operations.
🔗 Open Standards Support - Import and export ODCS 3.1 data contracts. Ingest OpenLineage metadata from Flink, Spark, Airflow, and other compatible systems. OpenMetadata's richer semantic model remains a superset while enabling interoperability. Best-of-breed tools, no vendor lock-in.
Even more in OpenMetadata 1.12:
• Human & AI Audit Logs with six-month retention
• Column Bulk Operations for unified governance across asset types
• Lineage improvements with column filtering, edge highlighting, and faster SQL parsing
• Explore Page Sidebar for quick metadata access
• Metadata Exporter for change tracking
• Data Contracts at Data Product level
• New connectors: StarRocks, SFTP, Redshift Serverless
Collate 1.12 managed service additions:
• AI Studio for agent customization
• Collate AI SDK for programmatic access
• Auto Classification with Custom Recognizers
• Data Diff Column/Row Analysis
• GitHub Metadata Sync
• AskCollate enhancements with MS Teams
• Microsoft Fabric, Dremio, and MuleSoft connectors
Ready to standardize quality and simplify deployment? Read the full launch blog: buff.ly/sHYY6sC
⭐ Star the OpenMetadata GitHub repo to support the project!
#DataQuality #OpenSource #DataGovernance #Kubernetes #OpenMetadata #Collate
Harsha Chintalapani retweeted
CollateData (@CollateData)
AI agents don’t fail because the model is weak. They fail because your data has structure… but not shared meaning.
“Regional revenue” means one thing to Finance and another to Sales. An LLM will confidently pick one and move on. That’s how AI pilots stall.
Today, we’re launching Collate’s Semantic Intelligence Graph. Built on OpenMetadata, it transforms metadata into a semantic graph that makes business meaning machine-readable and reusable across your stack.
New capabilities:
🤖 AI Studio – A suite of AI agents for data quality, governance, documentation, and SQL. Tune them to your standards or build new agents grounded in your semantic graph.
🔌 AI SDK – Invoke semantic agents from your custom AI applications built on Python, Java, or TypeScript with no external dependencies, or integrate with external systems like GitHub, CI/CD pipelines, and n8n workflows.
🌐 Open Standards – OpenLineage, Open Data Contract Standard (ODCS)
The impact:
• Fewer hallucinated answers
• Faster AI deployment
• Governance before bad data ships
• Clear auditability
• Less engineering rework
This is the shift from AI experimentation to production-ready, trusted AI systems.
Full announcement: getcollate.io/blog/collate-l…
March 5 webinar: datasciconnect.com/events/webinar…
#SemanticIntelligence #EnterpriseAI #AIAgents #DataGovernance #AITrust
Harsha Chintalapani retweeted
CollateData (@CollateData)
Jo Perez is back with his third Data30 episode of the year, and he's talking about "Quantifying Platform ROI."
🎥 Watch here: buff.ly/Qph8WuM
Quantifying Platform ROI: Beyond “Faster Discovery”
“Show me the business impact” is the question every data leader faces when investing in infrastructure.
In less than 30 minutes:
💰 Real ROI Metrics: Discover how to quantify the business value of metadata intelligence beyond soft benefits like “improved discovery,” from governance hours saved to compliance risk eliminated
⚡ The TCO Framework: Learn how leading organizations measure platform ROI across governance automation, policy enforcement time reduction, and metadata drift elimination
Essential for leaders building business cases and justifying platform investments to executive stakeholders.
#Data30 #DataGovernance #AI #DataCatalog #dataengineering #openmetadata
Harsha Chintalapani retweeted
CollateData (@CollateData)
Please join us next Wednesday, 2/25 @ 9 AM PST, for the OpenMetadata Monthly Community Meeting as we present "Building Semantically Intelligent Agents with AI Studio!" Join us as we introduce Collate’s AI Studio, enabling enterprises to build, deploy, customize, and tune AI agents to their unique data environments. Plus, our community members Alejandro Aboy and Efran Hesami will be sharing a new OpenMetadata demo with MCP! meetup.com/openmetadata-m…