Diffbot 🤖

2.1K posts

@diffbot

Never write another web scraper. Diffbot structures information from the web, so you don't have to.

The Future · Joined September 2009
7.7K Following · 8.1K Followers
Diffbot 🤖 @diffbot
@dambuildshit
Step 1: Build a free DNS provider
Step 2: Collect tolls for access to sites behind their DNS
Step 3: Build a crawler that doesn't need to pay their own toll
Step 4: Profit
0
0
1
10
Diffbot 🤖 reposted
Massive @joinmassive
The web isn't a database. @diffbot makes it one. 10B+ entities and 1T facts extracted from 60B+ pages, rebuilt every 4-5 days. DuckDuckGo, Snapchat, and Dow Jones run on it. Massive powers the proxy infra behind their continuous crawl.
1
1
3
178
Diffbot 🤖 reposted
Cheng Lou @_chenglou
Ever wondered what your white name should have been? Introducing: whatismywhitename.com
Upload a picture of you, and let the puppy guess your name! Let's test out nominative determinism 🫡
(Immigrants who named themselves will correlate more highly. Give us feedback plz)
Our thanks to:
- @modal for their generous credits toward training this meme model
- @diffbot for the clean, diverse dataset!
- @leannch86920 for the training research!
- Everyone NOT named David (biggest & noisiest dataset ever)
7
8
62
6.7K
Diffbot 🤖 @diffbot
@devanshu_twt Sorry! It’s not ideal but it’s the easiest way to weed out 99% of abusers. When the product makes it easy to crawl the web, you get a lot of bad actors. Still thinking of a better way to solve this!
0
0
1
12
Devanshu @devanshu_twt
@diffbot Ok, it finally worked with a college email address
1
0
0
18
Devanshu @devanshu_twt
Hey guys, how am I supposed to sign up on @diffbot? It keeps asking for a valid work email. Is there no way to sign up using a personal email?
1
0
0
21
Diffbot 🤖 @diffbot
@groby Sorry for the late reply (and happy New Year!). It's not on the immediate horizon, but implementing a credit balance model with a low minimum is something we've discussed. I personally prefer it. Would you mind emailing me at jerome[@]diffbot?
0
0
0
18
Rachel Blum @groby
@diffbot It's a bit more than I was planning, but yes, I'll be able to pile up a few hobby projects to squeeze them into a $50 month. (I prefer the extra credit model *because* my interests are bursty, but I also get it's not exactly your core audience :)
1
0
0
22
Rachel Blum @groby
One wish for 2026: If you count usage in credits/tokens & you offer several subscription layers with vastly different prices, *please* allow folks to buy extra credits. (@diffbot, today - would love to use it, willing to spend $20, but $0 to $229 is a bit much of a jump)
1
0
0
69
Diffbot 🤖 reposted
Jason Grad 🇺🇦 @mrjasongrad
State of E-commerce Data Providers - Q4 2025
E-commerce runs on constant measurement: prices, promos, availability, seller changes, and "what the shelf actually looks like" across retailers and marketplaces. The challenge is stable collection at scale, retries when sites break, anti-bot evasion, clean geo signals, and then turning messy HTML into usable structured data.
In preparation for the holiday season, we mapped the landscape of e-commerce data providers:
- Competitive intel + digital shelf: @dataweavein, @Price2Spy, @bigdataNODE, @Profitero, @WiserInc
- Marketplace intelligence + data: @junglescout, @H10Software, @datahawkco, @SellerSprite_EN
- Trade, Supply Chain, Imports / Exports: @Trademo1, @ImportYeti, @datamyne
- Scraper APIs & Extraction Platforms: @zytedata, @diffbot, @Stratalis, (AutoScraping handle?), @serpapi
- Managed Data Extraction & Services: @groupBWT, @Data_Ox, @epctex, @MrScraper_
- Retail Media & Ad Platforms: @Pacvue, @PerpetuaLabs, @Teikametrics
- Network & runtime infra for e-com scraping: @playwrightweb, Puppeteer, @browserless
0
1
0
381
Diffbot 🤖 @diffbot
@groby Wish granted. Will a $50 starting plan work?
1
0
0
16
Rachel Blum @groby
Yes, I know, it adds engineering overhead and might open doors to abuse if not carefully planned. But it's Christmas, I get to make a wish, right? :)
1
0
0
53
Diffbot 🤖 reposted
Matthew Cassinelli @mattcassinelli
YouTube, TikTok, Mastodon, & Threads are mostly there but need optimizing. Diffbot goes incredibly far with articles & that’s also moving along well. Reddit & Bluesky are readily available but I haven’t spent the time. X is finished but the endpoint gets rate limited 😞
Matthew Cassinelli @mattcassinelli

I am in love with scraping using Shortcuts. I have about 8 sets of shortcuts for everyday social media sites that I'm developing in tandem. I'll be releasing them as I finish them – thread starts here.
1
2
8
2.9K
Diffbot 🤖 @diffbot
A datacenter story...
0
1
4
776
Diffbot 🤖 reposted
Unstructured @UnstructuredIO
San Diego developers, join us and our technical partners @neo4j, @Intuit, Eyepop.ai, @Replit, and @diffbot at our HackNight next week!
Unstructured @UnstructuredIO

We're excited to join @neo4j, @Intuit, EyePop.ai, @Replit and @diffbot as technical partners for the upcoming Startup San Diego - FirstWave Innovator HackNight happening Wednesday, February 19th at the Intuit San Diego Campus. This is going to be an epic night where 100+ developers will come together to create innovative solutions for five select startups. 🚀 Join as a developer for free or get tickets: lu.ma/4brcg3lz
1
4
7
2K
Diffbot 🤖 @diffbot
#Perplexity Sonar Pro API launched last week as the best performing model on factuality. 24 hours later, it's the 2nd best performing model (and it's not because of #DeepSeek). Why? 👇
1
0
2
560