Diffbot 🤖

2.1K posts


@diffbot

Never write another web scraper. Diffbot structures information from the web, so you don't have to.

The Future · Joined September 2009
7.7K Following · 8.1K Followers
Diffbot 🤖
Diffbot 🤖 @diffbot ·
@dambuildshit Step 1: Build a free DNS provider Step 2: Collect tolls for access to sites behind their DNS Step 3: Build a crawler that doesn't need to pay their own toll Step 4: Profit
English
0
0
1
10
Diffbot 🤖 retweeted
Massive
Massive @joinmassive ·
The web isn't a database. @diffbot makes it one. 10B+ entities and 1T facts extracted from 60B+ pages, rebuilt every 4-5 days. DuckDuckGo, Snapchat, and Dow Jones run on it. Massive powers the proxy infra behind their continuous crawl.
English
1
1
3
178
Diffbot 🤖 retweeted
Cheng Lou
Cheng Lou @_chenglou ·
Ever wondered what your white name should have been? Introducing: whatismywhitename.com
Upload a picture of you, and let the puppy guess your name! Let's test out nominative determinism 🫡
(Immigrants who named themselves will correlate more highly. Give us feedback plz)
Our thanks to:
- @modal for their generous credits toward training this meme model
- @diffbot for the clean, diverse dataset!
- @leannch86920 for the training research!
- Everyone NOT named David (biggest & noisiest dataset ever)
English
7
8
62
6.7K
Diffbot 🤖
Diffbot 🤖 @diffbot ·
@devanshu_twt Sorry! It's not ideal but it's the easiest way to weed out 99% of abusers. When the product makes it easy to crawl the web, you get a lot of bad actors. Still thinking of a better way to solve this!
English
0
0
1
12
Devanshu
Devanshu @devanshu_twt ·
@diffbot Ok, it finally worked with clg email address
English
1
0
0
18
Devanshu
Devanshu @devanshu_twt ·
Hey guys, how am I supposed to sign up on @diffbot? It keeps asking for a valid work email. Is there no way to sign up using a personal email?
English
1
0
0
21
Diffbot 🤖
Diffbot 🤖 @diffbot ·
@groby Sorry for the late reply (and happy new years!) It's not on the immediate horizon, but implementing a credit balance model with a low minimum is something we've discussed. I personally prefer it. Would you mind emailing me at jerome[@]diffbot?
English
0
0
0
18
Rachel Blum
Rachel Blum @groby ·
@diffbot It's a bit more than I was planning, but yes, I'll be able to pile up a few hobby projects to squeeze them into a $50 month. (I prefer the extra credit model *because* my interests are bursty, but I also get it's not exactly your core audience :)
English
1
0
0
22
Rachel Blum
Rachel Blum @groby ·
One wish for 2026: If you count usage in credits/tokens & you offer several subscription layers with vastly different prices, *please* allow folks to buy extra credits. (@diffbot , today - would love to use it, willing to spend $20, but $0 to $229 is a bit much of a jump)
English
1
0
0
69
Diffbot 🤖 retweeted
Jason Grad 🇺🇦
Jason Grad 🇺🇦 @mrjasongrad ·
State of E-commerce Data Providers - Q4 2025
E-commerce runs on constant measurement: prices, promos, availability, seller changes, and "what the shelf actually looks like" across retailers and marketplaces. The challenge is stable collection at scale, retries when sites break, anti-bot evasion, clean geo signals, and then turning messy HTML into usable structured data.
In preparation for the holiday season, we mapped the landscape of e-commerce data providers:
- Competitive intel + digital shelf: @dataweavein, @Price2Spy, @bigdataNODE, @Profitero, @WiserInc
- Marketplace intelligence + data: @junglescout, @H10Software, @datahawkco, @SellerSprite_EN
- Trade, Supply Chain, Imports / Exports: @Trademo1, @ImportYeti, @datamyne
- Scraper APIs & Extraction Platforms: @zytedata, @diffbot, @Stratalis, (AutoScraping handle?), @serpapi
- Managed Data Extraction & Services: @groupBWT, @Data_Ox, @epctex, @MrScraper_
- Retail Media & Ad Platforms: @Pacvue, @PerpetuaLabs, @Teikametrics
- Network & runtime infra for e-com scraping: @playwrightweb, Puppeteer, @browserless
English
0
1
0
381
Rachel Blum
Rachel Blum @groby ·
Yes, I know, it adds engineering overhead and might open doors to abuse if not carefully planned. But it's Christmas, I get to make a wish, right? :)
English
1
0
0
53
Diffbot 🤖 retweeted
Matthew Cassinelli
Matthew Cassinelli @mattcassinelli ·
YouTube, TikTok, Mastodon, & Threads are mostly there but need optimizing. Diffbot goes incredibly far with articles & that's also moving along well. Reddit & Bluesky are readily available but I haven't spent the time. X is finished but the endpoint gets rate limited 😞
Matthew Cassinelli @mattcassinelli

I am in love with scraping using Shortcuts. I have about 8 sets of shortcuts for everyday social media sites that I'm developing in tandem. I'll be releasing them as I finish them – thread starts here.

English
1
2
8
2.9K
Diffbot 🤖 retweeted
Unstructured
Unstructured @UnstructuredIO ·
San Diego developers, join us and our technical partners @neo4j, @Intuit, Eyepop.ai, @Replit, and @diffbot at our HackNight next week!
Unstructured @UnstructuredIO

We're excited to join @neo4j, @Intuit, EyePop.ai, @Replit and @diffbot as technical partners for the upcoming Startup San Diego - FirstWave Innovator HackNight happening Wednesday, February 19th at the Intuit San Diego Campus. This is going to be an epic night where 100+ developers will come together to create innovative solutions for five select startups. 🚀 Join as a developer for free or get tickets: lu.ma/4brcg3lz

English
1
4
7
2K
Diffbot 🤖
Diffbot 🤖 @diffbot ·
#Perplexity Sonar Pro API launched last week as the best performing model on factuality. 24 hours later, it's the 2nd best performing model (and it's not because of #DeepSeek). Why? 👇
English
1
0
2
560