Martin Cervantes
55 posts

Martin Cervantes
@MartCervt
Senior Data Engineer → SaaS Founder 🚀 Building https://t.co/Gcz6JXC85E | Extract tables from PDFs without the copy-paste nightmare
Joined February 2022
32 Following, 10 Followers

@elgermerlo dataextractio.com
A PDF table extractor that I'm aiming to integrate into Gmail and Outlook in future releases.

@NoahKingJr Really stressful 🫠. What works for me is starting in an interviewer mode to give enough context and see what Claude is trying to do.

Try it here → dataextractio.com
Drop feedback in replies! What scanned documents are you struggling with?

Just shipped OCR to Data Extractio after hours debugging coordinate systems 🔥
Scanned PDFs → Spatial extraction → Table selection. All for 1 credit per PDF (not per page!)
Tested on invoices, receipts, bank statements. Works with JPG/PNG too.
Free to try (15 credits) — link in replies 👇
#buildinpublic
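
The post does not say which coordinate issue took hours to debug. One common source of this kind of bug, offered here only as an assumption, is that OCR engines usually report word boxes in image pixels with the origin at the top-left, while PDF user space is measured in points (1/72 inch) with the origin at the bottom-left. A minimal sketch of the conversion, where `pixel_box_to_pdf` and its parameters are hypothetical names and not part of Data Extractio:

```python
def pixel_box_to_pdf(box, dpi, page_height_pt):
    """Convert an OCR box to PDF user-space coordinates.

    box            -- (left, top, right, bottom) in pixels, origin top-left
    dpi            -- resolution the page image was rendered at (assumed known)
    page_height_pt -- page height in PDF points (1 pt = 1/72 inch)

    Returns (x0, y0, x1, y1) in points, origin bottom-left.
    """
    scale = 72.0 / dpi            # pixels -> points
    left, top, right, bottom = box
    x0 = left * scale
    x1 = right * scale
    # Flip the vertical axis: the pixel "bottom" edge becomes the lower y value.
    y0 = page_height_pt - bottom * scale
    y1 = page_height_pt - top * scale
    return (x0, y0, x1, y1)
```

For a US Letter page (792 pt tall) rendered at 300 DPI, a box one inch from the top-left corner lands one inch from the left edge and near the top of the page in PDF space, which is an easy sanity check when the selection overlay looks vertically mirrored.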

✅ worked on the next major feature
✅ created 7 new tasks for backlog (including 2 bugs)
✅ considering A/B testing for paywall CTA
an important thing to consider about your mobile app is not only the position of your paywall CTA but also the copy and overall visuals
I'm going to A/B test between copies and visuals for
- upgrade
- support
- new features
- remove ads
will share my journey as I iterate on all of these things
🌿 this was day 121 of 365
👥 followers: 1495 (+16)


@picoito Congratulations!! Any strategy you use to get it?

HOLY FUCKING SHIT!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
IT HAPPENED!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
FIRST INTERNET MONEY!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
IM SO HAPPY!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Picoito@picoito
iOS app live!!!!!! 🎊🎊🎊🎊 First time adding subscriptions I was afraid they'd push back, but I got lucky I guess! Then it honestly took me more than 1 hour to figure out how to do offer codes...

OCRmyPDF is great, but its text layer extraction often breaks table structures in scanned PDFs. 📉
I'm planning to add restructuring logic for the text layer. Notice how the Table Preview does not respect the column separators.
Any tips on handling complex 2D structures?
🛠️ Building: dataextractio.com
#BuildInPublic #OCR
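
One possible approach to the restructuring question above, sketched under assumptions: if each OCR word comes with a bounding box, rows can be recovered by clustering words on their vertical centre, and column separators by looking for vertical whitespace gaps that no word crosses. The `Word` type, the function names, and the tolerance values are all hypothetical, and this is not OCRmyPDF's API or the author's implementation.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x0: float  # left edge
    x1: float  # right edge
    y: float   # vertical centre of the word box


def group_rows(words, y_tol=5.0):
    """Cluster words into rows when their vertical centres are within y_tol."""
    rows = []
    for w in sorted(words, key=lambda w: w.y):
        if rows and abs(rows[-1][-1].y - w.y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    return [sorted(r, key=lambda w: w.x0) for r in rows]


def column_cuts(words, min_gap=10.0):
    """Return x positions of whitespace gaps at least min_gap wide that no word crosses."""
    spans = sorted((w.x0, w.x1) for w in words)
    if not spans:
        return []
    cuts = []
    reach = spans[0][1]  # right-most edge covered so far
    for x0, x1 in spans[1:]:
        if x0 - reach >= min_gap:
            cuts.append((reach + x0) / 2)  # cut in the middle of the gap
        reach = max(reach, x1)
    return cuts


def to_table(words, y_tol=5.0, min_gap=10.0):
    """Assign every word to a (row, column) cell and join the text per cell."""
    cuts = column_cuts(words, min_gap)
    table = []
    for row in group_rows(words, y_tol):
        cells = [[] for _ in range(len(cuts) + 1)]
        for w in row:
            centre = (w.x0 + w.x1) / 2
            col = sum(centre > c for c in cuts)  # number of cuts left of the word
            cells[col].append(w.text)
        table.append([" ".join(c) for c in cells])
    return table
```

This simple version assumes an axis-aligned page and columns separated by clear whitespace. Cells that span several columns, or text that overflows into a neighbouring column, close the gap and would need extra handling.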


@Samaytwt I don't think so, but you can make it better with the superpowers plugin
English

boys WTF!!
two new subscribers yesterday I didn't even know

Lukasz@woocassh
Btw the streak was extended 🚀 we got the 3rd sale, 3rd day in a row 🥹 this one was an annual sub!

After a detailed investigation, I'm choosing OCRmyPDF over Docling for OCR, even though Docling is "better."
Why: Docling is fully automated (no coordinate API). Would kill my interactive table selection + template system = my entire competitive moat.
Sometimes worse tech that preserves differentiation > better tech that commoditizes you.
Building: dataextractio.com
#buildinpublic
