David Fields

2.5K posts

@DavFields

Founder @readyai_ | Strong Beliefs, Loosely Held | ex @disney, @Harvard econ/incentive design $TAO

Los Angeles, CA · Joined March 2009
2.8K Following · 3K Followers
David Fields@DavFields·
Our recent breakthrough with enrichment tasks on the subnet has completely opened the floodgates. We can now create structured datasets from nearly any source, from llms.txt to deep coding data. Will be sharing benchmark improvements with this coding data shortly
ReadyAI@ReadyAI_

x.com/i/article/2034…

David Fields retweeted
水镜@alexz7371·
If the number of agents grows explosively and each one has to crawl the web independently, the resulting compute consumption will be enormous. The hidden opportunity here is to convert “webpages → AI-readable structured data,” which avoids massive redundant compute usage. That’s why I’m bullish on what SN33 @ReadyAI_ & @DavFields are building — The Data Layer
const@const_reborn

Bittensor will be run by agents. They will feed the mining, resist the exploits, manage the fleets, build the subnets and consume the commodities

David Fields@DavFields·
The web wasn't built for AI agents. We're fixing that. First 1,000 domains live now, millions coming. Open source, decentralized, and free. Frontend coming shortly to request llms.txt for any site
ReadyAI@ReadyAI_

🚀 llms.txt are live on SN33

The llms.txt repository is now live. 🔗 github.com/afterpartyai/l…

SN33 has processed the first batch with over 1,000 websites crawled, cleaned, and converted into structured llms.txt files by the subnet. Semantic summaries ready for any LLM agent, MCP server, or AI app to consume instantly. No scraping. No parsing raw HTML. Just clean, machine-readable intelligence.

New batches will be pushed as the subnet keeps processing. The repo grows every week.

What's in the dataset:
→ Structured semantic summaries per domain
→ Named entities: people, orgs, products, technologies, concepts
→ Topic classification and key themes
→ Deterministic O(1) lookup by domain with no index file needed
→ Git-friendly structure that scales to millions of domains

This initial release covers ~1,000 domains as a pilot, but the pipeline scales to millions.

📍 Roadmap: 10K → 100K → 1M domains → continuous updates from new Common Crawl releases and soon from requests.

🌍 And the frontend is coming. Any domain. You request it, the subnet processes it, you get an llms.txt back. We're putting the finishing touches on the public UI and it drops soon.

SN33 is becoming infrastructure. The web, made readable for machines and open to anyone, powered by decentralized infra.

Star the repo. Share it. And stay close. The next drop is right around the corner.
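The "deterministic O(1) lookup by domain with no index file" claim is easiest to see with a path-derivation sketch. The snippet below assumes a hypothetical hash-prefix shard layout; the actual structure of the afterpartyai repository may differ.

```python
# Hypothetical sketch of an index-free, deterministic domain -> file lookup.
# The shard scheme (two 2-hex-character levels) is an assumption, not the
# layout of the actual afterpartyai repository.
import hashlib
from pathlib import Path

def llms_txt_path(domain: str, repo_root: str = "llms-txt") -> Path:
    """Derive where a domain's llms.txt would live without reading any index."""
    normalized = domain.lower().strip().strip("/")
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # 256 * 256 = 65,536 leaf directories keeps each directory small and
    # git-friendly even at millions of domains.
    return Path(repo_root) / digest[:2] / digest[2:4] / f"{normalized}.llms.txt"

print(llms_txt_path("example.com"))
# -> llms-txt/<2-hex>/<2-hex>/example.com.llms.txt (shards depend on the hash)
```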

David Fields@DavFields·
@DallasAptGP If they are doing $200 mil ARR with 100k users, that is $2k per user PER year, or $166 per user per month. Much closer to the Claude premium plan. Do you know which it is?
Barrett Linburg@DallasAptGP·
My attorney and I were just comparing notes on AI. His comment: “HarveyAI (considered best in class) is what I used at my large firm. $2,500/seat/month. Claude CoWork (with legal plugin)…..$500/month for 5 seats (2 premium). Claude is just as good as Harvey.”
David Fields@DavFields·
The generic data race is over. The teams that win the next 3 years are the ones building deep, vertical-specific pipelines that scraping can't replicate. That's exactly what we're doing at @ReadyAI_. Phase 1 is just the start.
ReadyAI@ReadyAI_

x.com/i/article/2029…

David Fields retweeted
ReadyAI@ReadyAI_·
SN33 — Organizing the Spoken Web

Our podcast conversations dataset has been downloaded over 300,000 times on HuggingFace. That demand told us something: the market is starving for structured conversation data.

Written content represents a fraction of human knowledge online. The real depth lives in spoken conversation, with experts explaining their craft, founders breaking down strategy, researchers debating methodology. Millions of hours of it happen in public every day across podcasts, interviews, panels, and debates. It's the highest-signal data on the internet, and almost none of it is structured, tagged, or accessible to AI agents. We've been calling it the web's dark matter.

This week, we're making it visible.

We're launching SN33's agentic transcription system with an autonomous pipeline that discovers, retrieves, and processes public conversations across the web at scale. It doesn't wait for input. It finds the conversations that matter, converts them into structured data, and feeds them directly into the subnet for enrichment.

Every conversation enters the system tagged to a category from the start. That means category-specific task routing for miners, and more importantly, it unlocks something we've been building toward: customer-requested categories. Need every meaningful AI conversation from the last 90 days transcribed, enriched, and delivered as structured data? That's not a hypothetical. That's the infrastructure we're standing up right now.

With site enrichment, we organized the written web. With agentic transcription, we're organizing the spoken web. Together, these systems are building what we think of as llms.txt for the entire web: not just pages, but conversations. Structured, categorized, and ready for the next generation of AI agents to consume. Not just what people write, but what they say.

Rolling out TOMORROW, February 26th.
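For readers who think in code, here is a heavily simplified sketch of the discover, transcribe, tag, route loop described above. The function names and category list are placeholders, not SN33's actual pipeline.

```python
# Simplified sketch of the described loop: discover public conversations,
# transcribe them, tag a category up front, and route them to category-specific
# enrichment tasks. All callables are injected placeholders, not SN33 code.
CATEGORIES = ["ai", "real_estate", "finance", "science"]  # illustrative only

def process_spoken_web(discover_sources, transcribe, classify, enqueue_for_miners):
    """One autonomous pass over public audio sources.

    discover_sources() yields audio URLs, transcribe(url) returns text,
    classify(text, categories) picks a category, and enqueue_for_miners(task)
    hands the tagged transcript to the subnet for enrichment.
    """
    for audio_url in discover_sources():
        transcript = transcribe(audio_url)
        category = classify(transcript, CATEGORIES)
        enqueue_for_miners({
            "source": audio_url,
            "category": category,   # tagged from the start -> category routing
            "transcript": transcript,
        })
```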
David Fields@DavFields·
We started SN33 @ReadyAI_ with a simple thesis: the web has all the information agents need, but none of it is structured for them. Webpage Metadata v2 is the inflection point: we're not just tagging pages anymore, we're enriching entire sites. The goal is to become the largest producer of llms.txt files in the world. On testnet now, mainnet 2/23 🫡
ReadyAI@ReadyAI_

SN33 -- Enriching the Data of the World

SN33 just shipped Webpage Metadata v2, and the best way to explain what we’re building is this: an llms.txt version of Common Crawl.

Our partnership with Common Crawl began with the simple but daunting task of tagging web pages to make semantic web data widely available. Generating this data would break down the barriers preventing web organization. This week we are taking a giant step in broadening that goal to AI-enabling the world wide web by launching the enrichment process for entire web sites.

Search engines atomize the web by surfacing individual pages. That's great for finding individual facts. It does almost nothing to give agents the holistic information they need to actually complete tasks. Simple example: an agent searching for "best skis" gets quality-for-price rankings from individual pages. It completely misses how waist width affects your ability to float in powder, navigate tight spaces, or carve on groomed trails. That information exists across an entire site, but no one is structuring it that way.

This week we shipped the technology to change that. SN33 is now enriching entire web sites, not just individual pages. Our new high-volume API pushes full sites through the subnet, collecting enriched data, from tags and NER to similar pages and summarization, across every page on a site, grouped together.

Why llms.txt matters

The llms.txt standard summarizes an entire web site's contents in a single meaningful text file. Agents and MCP tools can understand what a site contains without processing every page. It's the missing layer between the open web and the agent economy. Adoption has been stymied by one problem: nobody is generating these files at scale. There hasn't been a broad effort to create llms.txt for the whole web — until now.

Once SN33 reaches tipping-point volume of enriched site data, we begin publishing llms.txt files at scale. We believe SN33 will become the largest producer of llms.txt files in the world.

The demand for structured web data is already proven. Our first open-source dataset, the 5000 Podcast Conversations, has crossed 300,000+ downloads on HuggingFace. That was conversations. This is the entire open web. More open-source releases are coming.

v2.28.63 is on testnet and goes mainnet February 23rd.
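To make the "enrich every page, then group it per site" idea concrete, here is a minimal sketch that folds per-page enrichment records into one llms.txt-style file. The record fields and section layout are assumptions based on the public llms.txt convention, not ReadyAI's actual schema.

```python
# Minimal sketch: fold per-page enrichment (summary, tags, named entities)
# into a single site-level llms.txt-style document. Field names and sections
# are illustrative assumptions, not SN33's output format.
from typing import TypedDict

class PageRecord(TypedDict):
    url: str
    title: str
    summary: str
    tags: list[str]
    entities: list[str]

def build_llms_txt(site: str, site_summary: str, pages: list[PageRecord]) -> str:
    lines = [f"# {site}", "", f"> {site_summary}", "", "## Pages", ""]
    for page in pages:
        lines.append(f"- [{page['title']}]({page['url']}): {page['summary']}")
    topics = sorted({tag for page in pages for tag in page["tags"]})
    entities = sorted({ent for page in pages for ent in page["entities"]})
    lines += ["", "## Key topics", "- " + ", ".join(topics),
              "", "## Named entities", "- " + ", ".join(entities)]
    return "\n".join(lines) + "\n"
```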

gmoney.eth@gmoneyNFT·
just set up my first miner on @bittensor subnet 33 with claude code. will check in a few hours to see if i made any money and then scale if i think it's worth it. i'm looking for opportunities i can apply inference towards. please comment below if you have any.
David Fields@DavFields·
SN33 Technical Paper: Elastic Protocol Surface (live now on SN33 mainnet)

We've rearchitected our backend to support non-linear scaling both in volume and content diversity.

What this enables:
→ Common Crawl-scale ingestion without pre-provisioned capacity
→ Domain-specific execution profiles (RE, finance, code, technical docs)
→ Horizontal throughput scaling based on demand
→ Fault-isolated containerized tasks

Infrastructure as a first-class subnet primitive. Cross-subnet collaboration pathways outlined for Chutes (inference) and Ridges (code annotation).

Full paper ↓
ReadyAI@ReadyAI_

x.com/i/article/2016…
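As a rough illustration of the "domain-specific execution profiles" idea from the post above, here is what such a profile could look like as configuration. Every field name and value is invented for the example; it is not the protocol's actual schema.

```python
# Hypothetical execution-profile config for domain-specific task handling.
# All names and values here are illustrative assumptions, not SN33's schema.
from dataclasses import dataclass

@dataclass
class ExecutionProfile:
    domain: str               # e.g. "real_estate", "finance", "code", "docs"
    task_types: list[str]     # enrichment steps enabled for this domain
    max_concurrency: int      # ceiling for horizontal, demand-based scaling
    container_image: str      # fault-isolated runtime for the tasks
    timeout_seconds: int = 120

PROFILES = {
    "real_estate": ExecutionProfile("real_estate",
                                    ["tagging", "ner", "summarization"],
                                    max_concurrency=64,
                                    container_image="example/task-runner:re"),
    "code": ExecutionProfile("code",
                             ["summarization", "symbol_extraction"],
                             max_concurrency=32,
                             container_image="example/task-runner:code"),
}
```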

David Fields retweeted
ReadyAI@ReadyAI_·
SN33 v2.25.60 Release - Non-Deterministic Enrichment

Testnet live now. Mainnet next week.

Most subnet tasks are deterministic by design. Input X → Output Y. Validators score against expected output. Clean and verifiable. But this breaks down for dataset generation. There's no single "correct" way to enrich a source document with supplemental research. The value is in generating relevant data, not identical data.

We solved this with persona-driven enrichment: Source document → persona prompt → contextual search queries → structured output

Same 60k-character transcript through an "Opportunistic Investor" lens → queries about rezoning and distressed assets. Same transcript through a "Real Estate Lender" lens → queries about refinancing risk and sponsor strength. Both valid. Both useful. Neither "correct" in an exclusive sense.

Validation shifts from "match expected output" to "verify quality within constraints"—relevance, persona alignment, search executability, structural compliance.

The scaling math:
→ 1 source doc × 20 personas × 10 queries × 10 results = 2,000 data points
→ 1,000 docs = 2,000,000 data points
→ Add personas → output multiplies without changing source material

20+ personas deployed for Real Estate. Expanding to llms.txt for training data synthesis and M&A.

Required upgrade for validators and miners.

Technical paper: x.com/ReadyAI_/statu…
PR: github.com/afterpartyai/b…
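Two things in the post are easy to sanity-check in code: the scaling arithmetic and the persona-to-query step. The sketch below assumes a generic LLM client exposing a complete(prompt) method; the prompt wording and any scoring are illustrative, not ReadyAI's.

```python
# Check the post's scaling math, and sketch the persona -> query step.
# The LLM client interface and prompt wording are assumptions for illustration.
def enrichment_data_points(docs: int, personas: int,
                           queries_per_persona: int, results_per_query: int) -> int:
    return docs * personas * queries_per_persona * results_per_query

assert enrichment_data_points(1, 20, 10, 10) == 2_000          # 1 source doc
assert enrichment_data_points(1_000, 20, 10, 10) == 2_000_000  # 1,000 docs

def persona_queries(document: str, persona: str, llm) -> list[str]:
    """Ask an LLM (any client exposing complete(prompt) -> str) for the search
    queries the given persona would run after reading the document."""
    prompt = (f"Acting as {persona}, read the document below and list 10 web "
              f"search queries you would run to enrich it.\n\n{document}")
    return [line.strip("-• ").strip()
            for line in llm.complete(prompt).splitlines() if line.strip()]
```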
David Fields retweeted
ReadyAI@ReadyAI_·
SN33 v2.24.59 is live

Modular Extension System for miners:
→ Custom logic sandboxed from core
→ Extensions persist through upgrades
→ Failures isolated, core keeps running

Automated testing on every PR:
→ Expanded coverage
→ Faster release velocity

Safe experimentation + stability. Ship faster without breaking things.

PR: github.com/afterpartyai/b…
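A minimal sketch of the fault-isolation idea (custom miner logic that can fail without taking down the core loop); the real extension interface is defined in the linked PR and likely differs.

```python
# Minimal sketch of sandboxed miner extensions: a failing extension is logged
# and skipped so the core pipeline keeps running. The interface is an
# assumption; the actual one lives in the linked PR.
import logging

logger = logging.getLogger("miner.extensions")

class Extension:
    name = "base"

    def process(self, task: dict) -> dict | None:
        raise NotImplementedError

def run_extensions(task: dict, extensions: list[Extension]) -> dict:
    for ext in extensions:
        try:
            result = ext.process(task)
            if result is not None:
                task = result           # an extension may transform the task
        except Exception:
            logger.exception("extension %r failed; core task continues", ext.name)
    return task
```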
David Fields@DavFields·
New from SN33: Non-Deterministic Enrichment Tasks

We solved a structural problem in subnet design: how to generate infinite valid outputs from finite inputs while maintaining verifiable quality.

Same source doc + different persona = different valid queries. Infinitely scalable dataset generation.

Launch: week of 1/19. Full technical paper ↓
ReadyAI@ReadyAI_

x.com/i/article/2011…

David Fields@DavFields·
New Revenue Generating Partnerships for ReadyAI

We're launching Named Entity Recognition (NER) as our next task type as we continue to enable the full universe of structured data functionality on the subnet.

NER is one of the largest use cases for structured data in AI — extracting and classifying key information from unstructured text at scale. It's foundational infrastructure that powers everything from search to compliance to market intelligence.

We're launching it with a real estate focused use case: Regulatory Radar. We process regulatory information from city council meetings, zoning decisions, permit approvals, and municipal records — extracting the entities, relationships, and signals that move real estate markets. Information that used to take teams hours to find is now structured and actionable.

This is part of our AcquiOS platform, and we already have 2 enterprise customers using it: Gelt Venture Partners and Archer Equities — with a strong pipeline of additional RE firms behind them.

Massive task type. Real customers using it already. More to come. Onward!
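As a generic illustration of the NER task type (not ReadyAI's production pipeline), spaCy's stock English model can pull organizations, dates, and locations out of municipal-record text like the kind described above; the sample sentence is invented.

```python
# Generic NER illustration with spaCy's stock English model; this is not
# ReadyAI's pipeline, and the sample record text is invented.
# Setup: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

record = ("The Dallas City Council approved a rezoning request from Example "
          "Equities LLC for a 220-unit multifamily project on Main Street on March 12.")

doc = nlp(record)
print([(ent.text, ent.label_) for ent in doc.ents])
# Expect ORG, DATE, CARDINAL, and GPE/LOC spans; exact boundaries depend on the model.
```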