Tensorlake

270 posts

@tensorlake

Scalable AI Infrastructure for Generative Models

San Francisco, CA · Joined November 2023
61 Following · 1.2K Followers

Pinned Tweet
Tensorlake @tensorlake
Document parsing benchmarks have been measuring the wrong thing. We tested every major parser on real enterprise documents. The results will change how you think about OCR accuracy 🧵
Tensorlake tweet media
Replies 4 · Reposts 4 · Likes 14 · Views 4.7K

Tensorlake retweeted
Diptanu Choudhury @diptanu
We’re working on Sandboxes @tensorlake. We built a high-performance file system to help agents run tool calls faster for coding and data analytics workloads. The goal was near-SSD speeds inside sandboxes. To get there, we forked Firecracker and built a custom, block-based overlay filesystem with dirty bitmap tracking for fast snapshots. Here’s a benchmark measuring raw SQLite performance across sandbox providers. Tensorlake is 1.2–1.3x faster than Vercel, 1.5–1.9x faster than Modal, 1.6–1.7x faster than E2B, and 1.8–2.2x faster than Daytona.
Diptanu Choudhury tweet media
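
The tweet above mentions a block-based overlay with dirty bitmap tracking but gives no implementation details. As a rough illustration of the general idea only (not Tensorlake's or Firecracker's code), the sketch below keeps one bit per block, sets it on write, and lets a snapshot copy only the blocks whose bit is set. The class name `OverlayDisk`, its methods, and the 4096-byte block size are all hypothetical.

```python
from typing import Dict


class OverlayDisk:
    """Hypothetical copy-on-write overlay on top of a read-only base image."""

    def __init__(self, base: bytes, block_size: int = 4096) -> None:
        if len(base) % block_size != 0:
            raise ValueError("base image must be a whole number of blocks")
        self.base = base
        self.block_size = block_size
        self.num_blocks = len(base) // block_size
        self.overlay: Dict[int, bytes] = {}  # block index -> modified block
        self.dirty = bytearray((self.num_blocks + 7) // 8)  # one bit per block

    def is_dirty(self, idx: int) -> bool:
        return bool(self.dirty[idx // 8] & (1 << (idx % 8)))

    def read_block(self, idx: int) -> bytes:
        # Reads prefer the overlay and fall through to the base image.
        if idx in self.overlay:
            return self.overlay[idx]
        start = idx * self.block_size
        return self.base[start:start + self.block_size]

    def write_block(self, idx: int, data: bytes) -> None:
        if not 0 <= idx < self.num_blocks:
            raise IndexError("block index out of range")
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block")
        self.overlay[idx] = bytes(data)
        self.dirty[idx // 8] |= 1 << (idx % 8)  # mark the block dirty

    def snapshot(self) -> Dict[int, bytes]:
        """Return only blocks written since the last snapshot, then reset the bitmap."""
        delta = {i: self.overlay[i] for i in range(self.num_blocks) if self.is_dirty(i)}
        self.dirty = bytearray(len(self.dirty))
        return delta
```

Under these assumptions the snapshot cost scales with the number of blocks written since the previous snapshot rather than with the size of the whole disk.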
Replies 6 · Reposts 13 · Likes 93 · Views 6.3K

Tensorlake retweeted
Diptanu Choudhury @diptanu
I called this a couple of months ago! Based on this screenshot, Anthropic is barking up the wrong tree if they are using Kubernetes to build a PaaS for agents. Reconciliation loops like K8s controllers are not designed for high-throughput stateful MicroVM scheduling. No one in our business has built something like this on K8s; it was not designed for this problem. Stay tuned to my feed: we have been working on the problem of internet-scale agents in sandboxes.
AprilNEA @AprilNEA

🧵 I just reverse-engineered the binaries inside Claude Code's Firecracker MicroVM and found something wild: Anthropic is building their own PaaS platform called "Antspace" (Ants + Space). It's a full deployment pipeline — hidden in plain sight inside the environment-runner binary. Here's what I found 👇

Replies 1 · Reposts 1 · Likes 22 · Views 3.2K

Tensorlake retweeted
Diptanu Choudhury @diptanu
github.com/adammiribyan/z… The part about using KVM to get clones of VMs is easy. The challenging part at the moment is cloning sandboxes 100x across nodes. The bottleneck is moving bytes across machines. We developed a hybrid approach that moves some data across machines directly and some through blob stores. It also requires tuning the network/RPC stack to move as much data as the NIC allows.
Replies 7 · Reposts 11 · Likes 83 · Views 8.1K

Tensorlake retweeted
David Boskovic @dboskovic
Don’t sleep on @tensorlake: this kind of optimization leads to a lot of new possibilities for massively distributed workloads at close to bare-metal costs.
Diptanu Choudhury @diptanu

github.com/adammiribyan/z… The part about using KVM to get clones of VMs is easy. The challenging part at the moment is cloning sandboxes 100x across nodes. The bottleneck is moving bytes across machines. We developed a hybrid approach that moves some data across machines directly and some through blob stores. It also requires tuning the network/RPC stack to move as much data as the NIC allows.

Replies 0 · Reposts 3 · Likes 10 · Views 1.7K

Tensorlake @tensorlake
Filing taxes. Onboarding a customer. Submitting a permit application. All of them involve the same painful loop: hunt down information from 5 different sources → carefully match it to form fields → hope nothing breaks. We just automated that loop. Tensorlake now does agentic form filling: send a prompt, get back a filled PDF. Works on digital and scanned forms alike. tensorlake.ai/blog/agentic-f…
Replies 0 · Reposts 1 · Likes 2 · Views 175

Tensorlake @tensorlake
Manual data entry is a productivity killer, especially when you’re dealing with complex tax forms, insurance applications, or scanned documents.

We are excited to share a major update from Tensorlake: Agentic Form Filling. Unlike traditional rule-based tools, this "agentic" approach uses LLM-based reasoning to understand the meaning of form fields, not just their names.

Key highlights:
✅ Works on anything: Supports both digital AcroForms and scanned/image-based PDFs.
✅ Semantic Mapping: It can map a prompt like "Applicant Name" to a field labeled "Name of Borrower" automatically.
✅ Vision-Grounded: Uses layout-aware widget detection to identify checkboxes, radio buttons, and text fields with precision.
✅ Audit-Ready: Returns a filled PDF along with detailed metadata for validation and retries.

Whether you're automating tax returns (like the f1040) or high-volume enterprise documents, this is a game-changer for document AI. Check out the full breakdown and see the SDK in action here: tensorlake.ai/blog/agentic-f…

#DocumentAI #GenerativeAI #Automation #LLMs #Tensorlake #AgenticWorkflows
Replies 0 · Reposts 1 · Likes 4 · Views 167

Tensorlake @tensorlake
Powered by @Tensorlake's Agentic Platform, @novis_ai is the AI knowledge worker agent that takes you from research all the way to polished deliverables — end-to-end.
Replies 0 · Reposts 1 · Likes 4 · Views 144

Tensorlake @tensorlake
Iterating on a structured extraction schema is often more expensive than it should be. Check out the latest blog post by Shanshan Wang to learn how you can iterate on your extraction task with Tensorlake. linkedin.com/posts/shanshan…
Tensorlake tweet media
Replies 0 · Reposts 1 · Likes 4 · Views 177

Tensorlake @tensorlake
Sharing some lessons we learned from using VLMs to parse complex tables in production. Our algorithm for detecting repetition during streaming inference and building blocks improves extraction of hard tables by roughly 30-40% on average. tensorlake.ai/blog/vlm-pipel…
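
The tweet does not spell out how repetition is detected, so the following is only a generic sketch of the idea, not the algorithm from the linked post. It assumes the model output arrives as a stream of string tokens and stops consuming the stream once the trailing `window` tokens have repeated `max_repeats` times in a row. The function name and both parameters are hypothetical.

```python
from typing import Iterable, Iterator, List


def stop_on_repetition(
    tokens: Iterable[str], window: int = 3, max_repeats: int = 3
) -> Iterator[str]:
    """Yield tokens until the last `window` tokens repeat `max_repeats` times back to back."""
    seen: List[str] = []
    needed = window * max_repeats
    for tok in tokens:
        seen.append(tok)
        yield tok
        if len(seen) < needed:
            continue
        tail = seen[-needed:]
        chunk = tail[:window]
        # The tail is a loop if every window-sized slice equals the first one.
        if all(tail[i:i + window] == chunk for i in range(0, needed, window)):
            return  # stop pulling from the stream; the caller can trim the repeats


if __name__ == "__main__":
    stream = ["a", "b"] + ["x", "y", "z"] * 10
    print(len(list(stop_on_repetition(stream))))
```

Because the check runs while tokens are still arriving, a caller could cancel a runaway generation early instead of waiting for the full output.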
Replies 0 · Reposts 0 · Likes 3 · Views 186

Tensorlake @tensorlake
Documents with tables spanning multiple pages are very common in financial services. Tensorlake's Document Ingestion API can now merge tables across pages. This improves chunking and structured extraction from tables. Take a quarterly report from Berkshire Hathaway: the consolidated statement of income spans multiple consecutive pages. If you select the "merge tables" option in our OCR API, the tables on pages 5 and 6 are merged into one! No additional heuristics or post-processing are needed in your document ingestion workflow to handle long tables.
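
For context, this is roughly the kind of post-processing heuristic a pipeline would otherwise need to write by hand. The sketch is an illustration only and says nothing about how Tensorlake merges tables internally. It assumes each table fragment carries a page number and a header row, and it merges a fragment into the previous one only when it sits on the next page and has an identical header. The `Table` type and `merge_across_pages` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Table:
    page: int
    header: List[str]
    rows: List[List[str]] = field(default_factory=list)


def merge_across_pages(tables: List[Table]) -> List[Table]:
    """Merge fragments that continue on the next page with the same header."""
    merged: List[Table] = []
    last_page: Optional[int] = None  # page of the most recently processed fragment
    for t in sorted(tables, key=lambda tbl: tbl.page):
        prev = merged[-1] if merged else None
        continues = (
            prev is not None
            and last_page is not None
            and t.page == last_page + 1
            and t.header == prev.header
        )
        if continues:
            prev.rows.extend(t.rows)
        else:
            # Copy so the caller's input tables are not mutated.
            merged.append(Table(t.page, list(t.header), [list(r) for r in t.rows]))
        last_page = t.page
    return merged
```

An exact header match is a deliberately strict rule; real documents often omit the header on continuation pages, which is one reason hand-written heuristics like this tend to be brittle.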
Replies 0 · Reposts 1 · Likes 5 · Views 534

Tensorlake @tensorlake
Announcing Chart Extraction capability in @tensorlake's Document Ingestion API! Extracting data points from charts makes reading them more token-efficient, and agentic applications can generate and run code on the fly to transform the data based on what the user is actually trying to learn. Our financial services and life sciences customers parse tens of thousands of documents for report generation, summarization, and research. Chart extraction adds more capabilities to their applications built on Tensorlake. Blog: tensorlake.ai/blog/agentic-c…
Replies 0 · Reposts 1 · Likes 3 · Views 136

Tensorlake @tensorlake
built with tensorlake
Diptanu Choudhury @diptanu

Claude Agent SDK can manage your personal finances! I built an agent that can ingest credit card and bank statements, categorize expenses, and track subscriptions. The agent runs on @tensorlake and can continuously ingest new statements over time and write them to @neondatabase. It uses code sandboxes to draw charts on the fly! The code is open source and fully hackable! You can clone it, improve it, and deploy the agent API on @tensorlake and the UI on @vercel. A 🧵 on what I learnt:

Replies 0 · Reposts 0 · Likes 2 · Views 200

Tensorlake retweeted
Diptanu Choudhury @diptanu
Claude Agent SDK can manage your personal finances! I built an agent that can ingest credit card and bank statements, categorize expenses, and track subscriptions. The agent runs on @tensorlake and can continuously ingest new statements over time and write them to @neondatabase. It uses code sandboxes to draw charts on the fly! The code is open source and fully hackable! You can clone it, improve it, and deploy the agent API on @tensorlake and the UI on @vercel. A 🧵 on what I learnt:
Replies 3 · Reposts 2 · Likes 6 · Views 1.1K

Tensorlake retweeted
Diptanu Choudhury @diptanu
You can now build email ingestion agents on @tensorlake. Pair @SendGrid or @twilio with Tensorlake to receive emails in an agentic application, classify them, and orchestrate work.
Replies 1 · Reposts 1 · Likes 6 · Views 577

Tensorlake @tensorlake
Since execution is durable, the workflow can retry and resume through failures without losing where it left off. That’s what makes long-running “agent” workflows reliable in production. Check out the cookbook: github.com/tensorlakeai/c…
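
To make "retry and resume" concrete, here is a minimal sketch of the general pattern, assuming each step's result is JSON-serializable and checkpointed to a local file. It is not the Tensorlake SDK or its API: `run_durable`, `state_path`, and `max_attempts` are hypothetical names chosen for the example.

```python
import json
import os
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_durable(steps: List[Step], state_path: str, initial: Any, max_attempts: int = 3) -> Any:
    """Run steps in order, checkpointing each result so a rerun resumes where it stopped."""
    done: Dict[str, Any] = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            done = json.load(f)

    value = initial
    for name, fn in steps:
        if name in done:  # finished in an earlier run: reuse the saved result
            value = done[name]
            continue
        for attempt in range(1, max_attempts + 1):
            try:
                value = fn(value)
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # give up; completed steps stay checkpointed
        done[name] = value
        tmp = state_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(done, f)
        os.replace(tmp, state_path)  # atomic swap so a crash cannot leave a partial file
    return value
```

If the process dies partway through, calling `run_durable` again with the same `state_path` skips the steps that already finished and picks up at the first unfinished one.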
Replies 0 · Reposts 0 · Likes 1 · Views 128

Tensorlake @tensorlake
It’s just Python. Define the steps, add the libraries you need, and deploy it as an HTTP endpoint. Hook it up to an email provider that posts inbound messages to you, and it scales without you wiring up queues, workers, or infra.
Replies 1 · Reposts 0 · Likes 1 · Views 138

Tensorlake @tensorlake
Inbound email shouldn’t be a manual triage queue. With @tensorlake you can build a code-first, serverless Python agent that classifies emails, extracts structured fields, and parses attachments into structured outputs (or markdown chunks for RAG).
Replies 1 · Reposts 0 · Likes 1 · Views 210