Vinoth Chandar

1.2K posts

@byte_array

Founder @Onehousehq, Creator of @apachehudi, Built the World's first #DataLakehouse, Distributed/Data Systems, Linkedin, Uber, Confluent alum. (views are mine)

Joined April 2009

233 Following · 1.8K Followers

Pinned Tweet

Vinoth Chandar@byte_array·20 May

🔥 Meet Quanton, the new query execution engine from Onehouse.
👍 Same Spark & SQL.
📉 At least half the cost.
📈 1.6x-3.6x better ETL price-performance
📊 2.2x-6.5x better Ingest price-performance
👉 Read the full blog here: onehouse.ai/blog/announcin…
⬇️ Download our free Spark cost analyzer tool: onehouse.ai/spark-analysis…
⁉️ How? Quanton processes only what’s actually needed. Most engines only try to go fast. Quanton goes smart.
✔️ Drop-in replacement for Hudi + Spark jobs
✔️ Compatible with Iceberg, Delta, SQL, dbt
✔️ Benchmarked thoroughly against top cloud runtimes
✔️ No rewrites. No lock-in. Just plug in & save.
For Hudi users hitting support walls with your Spark provider or rising infra bills, it’s time to switch. We've got you even if you’re writing to Iceberg or Delta. Quanton boosts every open table format to its potential.
Onehouse is now the most open, most cost-efficient platform for ETL. Proudly built by a small, gritty team of Davids in a sea of Goliaths.
Quanton is here. It’s open. It’s fast. It’s efficient. We are actively building towards unlocking the next 30-80% efficiency gains by the end of the year.
#ApacheHudi #Spark #SQL #Lakehouse #ETL #OpenData #DataEngineering #Onehouse #Quanton #DataPlatform #Infra #DataOps #CloudData #Efficiency #ApacheIceberg #DataWarehouse #DeltaLake

Vinoth Chandar@byte_array·12 Mar

Spark is still a $15B+ annual spend category 💰 Yet most enterprises treat Spark like a black box.
🧠 TLDR: pip install spark-analyzer
Apache Spark still powers the backbone of lakehouse workloads 🏗️ Yet inside most companies, no one can clearly answer:
❓ Where does the spend actually go?
❓ Why don’t optimizations translate into real savings?
❓ Why is Spark cost so unpredictable?
A huge share of this spend runs on
⚠️ slow runtimes that waste compute cycles (e.g. default EMR setups)
💸 premium platforms charging 2–3× markups for engines like Photon
If you now want to do something about it: pypi.org/project/spark-…

Vinoth Chandar@byte_array·1d

Check out our full blog here for the entire product flow: onehouse.ai/blog/announcin…

Vinoth Chandar@byte_array·1d

Everyone assumes open formats kill lock-in. 🔐 But permissions stuck in proprietary catalogs quietly tie you down.
At Onehouse, we've been breaking these one by one:
1) Open formats → Onetable for interoperability 📊
2) Open compute → OpenEngines for Flink, Trino, Ray ⚙️
3) Portable pipelines → Open Spark APIs on Quanton 🛠️
Last piece: Permissions. Launching OneSync Permission Translation with Azure OneLake. 🚀
Now permissions move freely:
✅ Across engines
✅ Clouds (AWS, Azure, GCP)
✅ Catalogs (Unity, Snowflake, LakeFormation)
No remapping. True open data. 🏗️

Vinoth Chandar@byte_array·3d

Everyone assumes usage-based pricing in cloud data is fair and efficient. ⚖️ But it has a real problem: it can stop vendors from building faster engines. Traditional models priced on value: Oracle earned more for standout features. Now, with EMR or Databricks, bills hinge on compute usage. Customers win from compute efficiency (lower costs), but vendors lose revenue, pushing them to own the compute layer for pricing control. Sure, usage models offer flexibility, but they misalign incentives long-term. What's better? We need outcome-based pricing that rewards real value, like queries executed or data processed. 🚀📊

Vinoth Chandar@byte_array·2d

Bengaluru data engineers: Join our no-fluff meetup at the @onehousehq office on March 25th (next Wed eve) ⚙️
If you're into Spark, Hudi, Iceberg, lakehouses, or AI infra, this is for you.
I'll cover:
- Scaling Spark on K8s: What works, what breaks 🔧
- Next-gen lakehouse with Quanton & LakeBase 🏗️
Real talk on arch, benchmarks, tradeoffs - not a marketing event. Small group for deep chats.
📍 Onehouse Bengaluru
⏰ 4-6:30 PM IST
🚀 Register: docs.google.com/forms/d/1tQECs…

Vinoth Chandar@byte_array·13 Mar

AI agents will handle everything. No more on-call. No more ops. Except: this visual on US 101 yesterday tells a different story. Even robots need on-call.

Vinoth Chandar@byte_array·11 Mar

Still remains the fundamental challenge in large-scale data management on the data lake.

Apache Hudi@apachehudi

Operational data changes row by row. Lake storage is immutable. That mismatch is why old data lake pipelines rewrite entire partitions for small fixes, late events, or CDC updates/deletes.
Apache Hudi matters because it adds two missing primitives:
• row-level upserts/deletes
• incremental reads of what changed
That turns the lake into something you can actually operate.
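The two primitives named in the quoted tweet can be illustrated with a toy in-memory model. This is only a sketch: `ToyTable`, `upsert`, `snapshot`, and `incremental_read` are hypothetical names invented for illustration and are not Apache Hudi's API, and real lake storage involves immutable files, indexes, and a commit timeline that this model ignores.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """Hypothetical stand-in for a keyed lake table (not Hudi's API)."""
    rows: dict = field(default_factory=dict)  # record key -> (commit_time, value)
    commit_time: int = 0

    def upsert(self, records):
        """Apply row-level changes as one commit; a value of None marks a delete."""
        self.commit_time += 1
        for key, value in records.items():
            # Only the touched keys change; nothing else is rewritten.
            self.rows[key] = (self.commit_time, value)
        return self.commit_time

    def snapshot(self):
        """Latest live value per key, with deleted keys hidden."""
        return {k: v for k, (_, v) in self.rows.items() if v is not None}

    def incremental_read(self, since):
        """Only the keys changed after commit `since` (None means deleted)."""
        return {k: v for k, (t, v) in self.rows.items() if t > since}


table = ToyTable()
c1 = table.upsert({"a": 1, "b": 2})    # initial load
c2 = table.upsert({"b": 20, "c": 3})   # late update plus a new row
c3 = table.upsert({"a": None})         # CDC delete
print(table.snapshot())
print(table.incremental_read(c2))
```

A downstream job that remembers the last commit it processed can call `incremental_read` with that commit and handle only the changed rows, instead of rescanning or rewriting a whole partition.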

Vinoth Chandar@byte_array·10 Mar

10/ Excited to finally bring this to the Azure data community. 👉 Read the launch blog : onehouse.ai/blog/bringing-… 👉 If you're running Spark or building lakehouse infra on Azure, reach out — we’d love to chat.

Vinoth Chandar@byte_array·10 Mar

9/ Your data stays inside your Azure VNet and ADLS, governed by your policies. No proprietary storage. No forced platform lock-in.

Vinoth Chandar@byte_array·10 Mar

1/ ✨ Azure just made the list. Not the list you’re thinking of. The list of clouds that Onehouse runs on. With our launch on Microsoft Azure, the only truly modular data lakehouse platform now runs across AWS, GCP, and Azure.

Vinoth Chandar@byte_array·17 Feb

11/ Enterprise AI runs on context. That context already lives in your lakehouse. If AI agents are now the primary consumers of data, the lakehouse must evolve from storage layer to serving layer. #Onehouse #ApacheHudi #ApacheIceberg #DataEngineering #EnterpriseAI #AIInfrastructure

Vinoth Chandar@byte_array·17 Feb

10/ What it enables:
🛡️ Keep existing lakehouse tables, formats, governance
⚡ True point lookups & selective access paths
🔎 O(N) index joins (10x–100x less shuffle)
🧠 Transactional columnar caching tied to commits (<1s latency)
🌐 NIO, event loops, HTTP/2, elastic autoscaling

Vinoth Chandar@byte_array·17 Feb

1/ 🔥 Today we’re announcing Onehouse’s low-latency interactive query engine. Because if AI generates most of your SQL queries, your current engine won’t scale. 🧵👇
