Diffbot 🤖

2.1K posts

@diffbot

Never write another web scraper. Diffbot structures information from the web, so you don't have to.

The Future · Joined September 2009
7.7K Following · 8.1K Followers
Diffbot 🤖@diffbot·
@dambuildshit
Step 1: Build a free DNS provider
Step 2: Collect tolls for access to sites behind their DNS
Step 3: Build a crawler that doesn't need to pay their own toll
Step 4: Profit
0
0
1
10
Diffbot 🤖 retweeted
Massive@joinmassive·
The web isn't a database. @diffbot makes it one. 10B+ entities and 1T facts extracted from 60B+ pages, rebuilt every 4-5 days. DuckDuckGo, Snapchat, and Dow Jones run on it. Massive powers the proxy infra behind their continuous crawl.
1
1
3
178
Diffbot 🤖 retweeted
Cheng Lou@_chenglou·
Ever wondered what your white name should have been? Introducing: whatismywhitename.com

Upload a picture of you, and let the puppy guess your name! Let's test out nominative determinism 🫡 (Immigrants who named themselves will correlate more highly. Give us feedback plz)

Our thanks to:
- @modal for their generous credits toward training this meme model
- @diffbot for the clean, diverse dataset!
- @leannch86920 for the training research!
- Everyone NOT named David (biggest & noisiest dataset ever)
7
8
62
6.7K
Diffbot 🤖@diffbot·
@devanshu_twt Sorry! It’s not ideal but it’s the easiest way to weed out 99% of abusers. When the product makes it easy to crawl the web, you get a lot of bad actors. Still thinking of a better way to solve this!
0
0
1
12
Devanshu@devanshu_twt·
@diffbot Ok, it finally worked with clg email address
1
0
0
18
Devanshu@devanshu_twt·
Hey guys, how am I supposed to sign up on @diffbot? It keeps asking for a valid work email. Is there no way to sign up using a personal email?
1
0
0
21
Diffbot 🤖@diffbot·
@groby Sorry for the late reply (and happy New Year!). It's not on the immediate horizon, but implementing a credit balance model with a low minimum is something we've discussed. I personally prefer it. Would you mind emailing me at jerome[@]diffbot?
0
0
0
18
Rachel Blum@groby·
@diffbot It's a bit more than I was planning, but yes, I'll be able to pile up a few hobby projects to squeeze them into a $50 month. (I prefer the extra credit model *because* my interests are bursty, but I also get it's not exactly your core audience :)
1
0
0
22
Rachel Blum@groby·
One wish for 2026: If you count usage in credits/tokens & you offer several subscription layers with vastly different prices, *please* allow folks to buy extra credits. (@diffbot, today: would love to use it, willing to spend $20, but $0 to $229 is a bit much of a jump)
1
0
0
69
Diffbot 🤖 retweeted
Jason Grad 🇺🇦@mrjasongrad·
State of E-commerce Data Providers - Q4 2025

E-commerce runs on constant measurement: prices, promos, availability, seller changes, and "what the shelf actually looks like" across retailers and marketplaces.

The challenge is stable collection at scale, retries when sites break, anti-bot evasion, clean geo signals, and then turning messy HTML into usable structured data.

In preparation for the holiday season, we mapped the landscape of e-commerce data providers:

Competitive intel + digital shelf: @dataweavein, @Price2Spy, @bigdataNODE, @Profitero, @WiserInc
Marketplace intelligence + data: @junglescout, @H10Software, @datahawkco, @SellerSprite_EN
Trade, Supply Chain, Imports / Exports: @Trademo1, @ImportYeti, @datamyne
Scraper APIs & Extraction Platforms: @zytedata, @diffbot, @Stratalis, (AutoScraping handle?), @serpapi
Managed Data Extraction & Services: @groupBWT, @Data_Ox, @epctex, @MrScraper_
Retail Media & Ad Platforms: @Pacvue, @PerpetuaLabs, @Teikametrics
Network & runtime infra for e-com scraping: @playwrightweb, Puppeteer, @browserless
0
1
0
381
Diffbot 🤖@diffbot·
@groby Wish granted. Will a $50 starting plan work?
1
0
0
16
Rachel Blum@groby·
Yes, I know, it adds engineering overhead and might open doors to abuse if not carefully planned. But it's Christmas, I get to make a wish, right? :)
1
0
0
53
Diffbot 🤖 retweeted
Matthew Cassinelli@mattcassinelli·
YouTube, TikTok, Mastodon, & Threads are mostly there but need optimizing. Diffbot goes incredibly far with articles & that’s also moving along well. Reddit & Bluesky are readily available but I haven’t spent the time. X is finished but the endpoint gets rate limited 😞
Matthew Cassinelli@mattcassinelli

I am in love with scraping using Shortcuts. I have about 8 sets of shortcuts for everyday social media sites that I'm developing in tandem. I'll be releasing them as I finish them – thread starts here.

1
2
8
2.9K
Diffbot 🤖@diffbot·
A datacenter story...
0
1
4
776
Diffbot 🤖 retweeted
Unstructured@UnstructuredIO·
San Diego developers, join us and our technical partners @neo4j, @Intuit, Eyepop.ai, @Replit, and @diffbot at our HackNight next week!
Unstructured@UnstructuredIO

We're excited to join @neo4j, @Intuit, EyePop.ai, @Replit and @diffbot as technical partners for the upcoming Startup San Diego - FirstWave Innovator HackNight happening Wednesday, February 19th at the Intuit San Diego Campus. This is going to be an epic night where 100+ developers will come together to create innovative solutions for five select startups. 🚀 Join as a developer for free or get tickets: lu.ma/4brcg3lz

1
4
7
2K
Diffbot 🤖@diffbot·
#Perplexity Sonar Pro API launched last week as the best performing model on factuality. 24 hours later, it's the 2nd best performing model (and it's not because of #DeepSeek). Why? 👇
1
0
2
560