

Saurabh Abhyankar
@saurabha
Chief Product Officer, Strategy ($MSTR). #Bitcoin #Data


I’m coming to the conclusion that the biggest challenge for Enterprise AI, and AI in general, as of now, is that it’s still impossible to make sure that everyone gets the same answer to the same question, every time. Which is a great response to the doomers. AI doesn’t know the consequences of its output. Judgement and the ability to challenge AI output are becoming increasingly necessary, and valuable. Which makes domain knowledge more valuable by the second. Am I wrong?



If you think AI replaces software engineers, here’s a quick thought experiment. Imagine you’re a life sciences company. 10 years ago you want to invest heavily in lab automation, processing data at scale, and other software. You look at the cost of doing so and realize you can’t compete with tech for as many engineers as you need, so you pare down your goals and do what you can. Every new software project has a fixed cost of a certain-sized team, so you can only do so much given budgets, the ability to compete for talent, and other trade-offs.

Now, AI comes along. And all of a sudden you have the *exact same* output tokens as the best tech companies in the world. Your engineers are using the same AI models as the tech industry, which means you have just boosted your engineering team by some meaningful amount, while also neutralizing your differences with tech. Do you continue with your pared-down approach, or do you start to hire more engineers because each engineer is 2X or 5X more capable than before? In almost every company I’m talking to, they’re doing the latter.

Now extrapolate this to every bank, manufacturer, industrial company, retailer, and on and on. And extrapolate it not just to large enterprises, but also to every SMB up and down the stack of these value chains. Oh, and also extrapolate this to other job functions, not just engineers: resource-scarce domains in marketing, legal, finance, design, and so on.

If you’re wondering why new jobs show up because of AI, this is the reason. Any other view of what happens doesn’t contemplate the variety of unmet needs there are in the economy.


Company Brain @t_blom Every company has critical know-how scattered across people's heads, old Slack threads, support tickets, and databases, and AI agents can't operate like that. We think every company in the world is going to need a new primitive: a living map of how the company works that turns its own artifacts into an executable skills file for AI.




Finally! A Text-to-SQL solution that actually works (open-source).

When text-to-SQL fails, the real issue isn't the LLM or the prompt but schema retrieval. Consider a query like "Which publishers received royalty payments above $5,000?" To handle this, vector search can pull "publisher" and "royalty_ledger" based on semantic similarity, but it can completely miss "vendor_agreement", the bridge table that connects them. The LLM writes valid SQL, but the engine still returns zero rows. This is a fundamental issue with vector-based schema retrieval on enterprise databases.

A smarter approach is to treat the schema as a graph instead of a document to embed. Tables become nodes, foreign keys become edges, and join paths are discovered by walking the graph rather than matching semantics (a minimal sketch of this follows the post).

If you want to see this in practice, QueryWeaver implements this approach. It converts the schema into a graph, and when a query comes in, it walks the structure and pulls in every bridge table the join path requires, including multi-hop chains. For instance, on the BIRD benchmark with a superhero database expanded to 60 tables, it resolved a 5-hop query by chaining through: superpower → capability_matrix → stakeholder_registry → resource_requisition → budget_allocation. Vector search found the two endpoints but missed everything in between, because "stakeholder" has zero semantic link to "superpowers." Graph traversal found "stakeholder_registry" simply because it was the only road connecting the entities.

It's fully open-source, and you can easily self-host it. I've shared the GitHub repo in the replies.
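To make the graph-walk idea concrete, here is a minimal sketch of join-path discovery over a foreign-key graph. It uses networkx and a toy schema; the table names and foreign-key list are illustrative, not QueryWeaver's actual internals:

```python
# Minimal sketch: schema-as-graph retrieval over foreign keys.
# The schema below is a toy example, not QueryWeaver's real code.
import networkx as nx

# Tables are nodes; each foreign-key relationship is an edge.
schema = nx.Graph()
schema.add_edges_from([
    ("publisher", "vendor_agreement"),       # vendor_agreement.publisher_id
    ("vendor_agreement", "royalty_ledger"),  # royalty_ledger.agreement_id
    ("publisher", "imprint"),
    ("royalty_ledger", "payment_batch"),
])

def join_path(start: str, end: str) -> list[str]:
    """Walk the graph to recover every table on the join path,
    including bridge tables with no semantic link to the query."""
    return nx.shortest_path(schema, start, end)

# Vector search surfaces only the endpoints for
# "Which publishers received royalty payments above $5,000?"
print(join_path("publisher", "royalty_ledger"))
# -> ['publisher', 'vendor_agreement', 'royalty_ledger']
```

The bridge table falls out of the structure, so the generated SQL can join through it instead of returning zero rows.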

As agents become the biggest users of software, all software has to be available in a headless fashion. Agents won’t be using your UI, they’ll be talking to your APIs. So the question becomes: what is the business model of software and this headless approach in the future? Here are a few thoughts on how everything plays out, based on what we’re seeing and doing at Box, but also conversations with other platforms.

1) Seats don’t go away for *people*. Seats are still a convenient and efficient way to have a customer use technology predictably for a set of users within a baseline set of usage. The key, though, is that when the customer pays for a seat, it has to come with a set of API usage on behalf of that user that the agent can use on their behalf. The user will need to be able to interact with their data and the underlying tool via any agent they work with, and an embedded amount of usage will come with the seat. I would imagine most software -Box included- will enable seats to work with their data at a relatively high volume via systems like ChatGPT, Codex, Claude, Gemini, Cursor, Copilot, Perplexity, Factory, Cognition, et al. quite seamlessly. If you don’t do this, you’re DOA.

2) Agents may have “seats” if they are doing stateful work in the system, but they will be priced very differently than people. Seats (or the equivalent) can make sense when you have an agent that has its own workspace, stores its own data, needs a different set of permissions compared to the user, and so on. If a company wants this agent to be around for a long period of time, that may very well look like another “user” in the system. OpenClaw-style agents highlight what this future could look like. The only issue on pricing here is that one customer could decide to do all their work in 1 agent, and another might split it into 1,000 agents. So pricing like a human seat is nearly impossible and impractical; each company will have a different approach for this, as it’s tricky to perfectly capture all the value within an agent seat.

3) The dominant pricing for headless use that goes above the seat allotment, or when an agent is firmly acting on its own, will be a consumption model. Many enterprise software platforms have previously operated like this with PaaS options, and agents will look like another machine user of their system. In some cases the APIs might get priced just as they did previously, but in other cases there may need to be new types of APIs that represent the work an agent would do in one go -more akin to an outcome- instead of a series of API calls. This is especially germane when the headless software also has an agentic use-case embedded within it, such as orchestrating the process within their own system via AI. Overall, the growth of this usage pattern is effectively unbounded, as the use-cases for agents operating on data in these systems will dramatically exceed what people do with their data and tools today. Every platform that goes headless (which will be anyone that wants to take advantage of agents) will need to adopt a model like this. Some may fight it initially, but it’s an inevitability, as there will always be more agents outside your platform than people. (A toy sketch of the seat-plus-consumption math follows this post.)

Overall, there are a lot of really interesting changes left to come in software due to headless use of these systems. Early days.
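As a back-of-envelope illustration of the hybrid model in points 1 and 3, here is a minimal sketch of seat-plus-consumption billing; the prices, allotments, and names are all hypothetical, not Box’s actual pricing:

```python
# Hypothetical hybrid pricing: a per-seat fee bundles an API-call
# allotment for agents acting on the user's behalf; headless usage
# above the allotment is metered consumption. Numbers are illustrative.
SEAT_PRICE = 25.00                 # $/seat/month
INCLUDED_CALLS_PER_SEAT = 10_000   # agent API calls bundled with each seat
OVERAGE_RATE = 0.001               # $/call beyond the bundled allotment

def monthly_bill(seats: int, api_calls: int) -> float:
    """Seats cover people plus a baseline of agent usage on their
    behalf; anything above the allotment is billed as consumption."""
    included = seats * INCLUDED_CALLS_PER_SEAT
    overage = max(0, api_calls - included)
    return seats * SEAT_PRICE + overage * OVERAGE_RATE

# 50 seats whose agents make 2M calls: 500k included, 1.5M metered.
print(monthly_bill(seats=50, api_calls=2_000_000))  # -> 2750.0
```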


Using the new @Blender MCP to turn @Strategy symbol into a mesh via Claude Code 😍

AI-native software engineering teams operate very differently than traditional teams. The obvious difference is that AI-native teams use coding agents to build products much faster, but this leads to many other changes in how we operate. For example, some great engineers now play broader roles than just writing code. They are partly product managers, designers, sometimes marketers. Further, small teams who work in the same office, where they can communicate face-to-face, can move incredibly quickly.

Because we can now build fast, a greater fraction of time must be spent deciding what to build. To deal with this project-management bottleneck, some teams are pushing engineer:product manager (PM) ratios downward from, say, 8:1 to as low as 1:1. But we can do even better: If we have one PM who decides what to build and one engineer who builds it, the communication between them becomes a bottleneck. This is why the fastest-moving teams I see tend to have engineers who know how to do some product work (and, optionally, some PMs who know how to do some engineering work). When an engineer understands users and can make decisions on what to build and build it directly, they can execute incredibly quickly. I’ve seen engineers successfully expand their roles to include making product decisions, and PMs expand their roles to building software. The tech industry has more engineers than PMs, but both are promising paths. If you are an engineer, you’ll find it useful to learn some product management skills, and if you’re a PM, please learn to build!

Looking beyond the product-management bottleneck, I also see bottlenecks in design, marketing, legal compliance, and much more. When we speed up coding 10x or 100x, everything else becomes slow in comparison. For example, some of my teams have built great features so quickly that the marketing organization was left scrambling to figure out how to communicate them to users — a marketing bottleneck. Or when a team can build software in a day that the legal department needs a week to review, that’s a legal compliance bottleneck. In this way, agentic coding isn’t just changing the workflow of software engineering, it’s also changing all the teams around it.

When smaller, AI-enabled teams can get more done, generalists excel. Traditional companies need to pull together people from many specialties — engineering, product management, design, marketing, legal, etc. — to execute projects and create value. This has resulted in large teams of specialists who work together. But if a team of 2 people is to get work done that requires 5 different specialties, then some of those individuals must play roles outside a single specialty. In some small teams, individuals do have deep specializations. For example, one might be a great engineer and another a great PM. But they also understand the other key functions needed to move a project forward, and can jump into thinking through other kinds of problems as needed. Of course, proficiency with AI tools is a big help, since it helps us think through problems that involve different roles.

Even in a two-person team, to move fast, communication bottlenecks must also be minimized. This is why I value teams that work in the same location. Remote teams can perform well too, but the highest speed is achieved by having everyone in the room, able to communicate instantaneously to solve problems.

This post focuses on AI-native teams with around 2-10 people, but not everything can be done by a small team. I'll address the coordination of larger teams in the future. I realize these shifts to job roles are tough to navigate for many people. At the same time, I am encouraged that individuals and small teams who are willing to learn the relevant skills are now able to get far more done than was possible before. This is the golden age of learning and building! [Original text: deeplearning.ai/the-batch/issu… ]

𝐆𝐞𝐧𝐢𝐞 is now the most important way to do data analysis in Databricks. What's unique about it is its ability to extract semantics from your entire Lakehouse, enabling it to answer complex data questions that stump agents lacking deep data understanding. We've now added a mobile version and unstructured data processing, and enabled it to operate on all your dashboards and notebooks. Check it out: databricks.com/blog/next-gene…





Karpathy told Dwarkesh that a 1 billion parameter model, trained on clean data, could hit the intelligence of today's 1.8 trillion parameter frontier. That is a 1,800x compression claim. The math behind it is more defensible than it sounds.

When researchers at frontier labs look at random samples from their training corpus, they see stock ticker symbols, broken HTML, forum spam, autogenerated gibberish. Not Wikipedia. Not the Wall Street Journal. The actual pretraining dataset is mostly noise, and the model is burning parameters to vaguely remember all of it.

One estimate pegs Llama 3's information compression at 0.07 bits per token. Well-structured English carries around 1.5 bits per token of real information. The trillion-parameter model is holding a roughly 5% resolution image of the internet it trained on.

So when a lab ships a 1.8 trillion parameter model, the overwhelming majority of those weights are handling rough memorization. They are compression overhead for a noisy training set, taking up capacity that could be doing reasoning instead.

Karpathy's proposal is to separate the two. Build a cognitive core: a small model that contains only the algorithms for reasoning and problem-solving, stripped of encyclopedic memorization. Pair it with external memory the model queries when it needs a fact. A 1 billion parameter reasoner plus retrieval beats a 1.8 trillion parameter model trying to do both.

The data already supports this direction. GPT-4o runs at roughly 200 billion parameters and outperforms the original 1.8 trillion GPT-4. Inference costs for GPT-3.5 level performance fell 280x between 2022 and 2024, driven almost entirely by smaller, cleaner, better-architected models. The trend line is pointing where Karpathy says it should.

The real implication for anyone tracking the AI trade: data quality is the actual constraint. The companies winning the next phase will be the ones who figured out what to train on, and what to throw away.
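For what it's worth, the post's back-of-envelope numbers are internally consistent; here is the arithmetic, using the post's own estimates (the bits-per-token figures are estimates from the post, not measured values):

```python
# Back-of-envelope check of the compression claims in the post.
# Inputs are the post's own estimates, not measured values.
frontier_params = 1.8e12  # ~1.8T parameter frontier model
core_params = 1e9         # hypothetical 1B "cognitive core"
print(frontier_params / core_params)  # -> 1800.0 (the 1,800x claim)

llama3_bits_per_token = 0.07  # estimated information retained per token
english_bits_per_token = 1.5  # estimated information in clean English
ratio = llama3_bits_per_token / english_bits_per_token
print(ratio)  # -> ~0.047, i.e. the "roughly 5% resolution image"
```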


OPENAI WORKING WITH CONSULTING FIRMS, INCLUDING ACCENTURE, CAPGEMINI AND PWC, TO HELP SELL CODEX TO BUSINESSES- WSJ







