kreuzberg

88 posts

@kreuzberg_dev

Document intelligence for AI engineering workflows. Kreuzberg Cloud waitlist: https://t.co/8hvgTAOdtV · Discord community: https://t.co/SUYsy6Ma9Q

Berlin, Germany · Joined December 2025
50 Following · 18 Followers
kreuzberg@kreuzberg_dev·
"The decision isn't about which languages to support; it's about what to build with the structured output." tree-sitter-language-pack on GitHub: github.com/kreuzberg-dev/…
kreuzberg@kreuzberg_dev·
In our newest article, learn why plain text chunking fails code-aware AI agents, how AST-aware chunking fixes it, and how one dependency can replace your entire parser infrastructure. medium.com/@kreuzberg/why-ai-agents-need-structured-code-intelligence-and-how-to-stop-managing-parsers-4b59a44d5dc0
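To make the contrast between plain text chunking and AST-aware chunking concrete, here is a minimal sketch. It is an illustration only: it uses Python's standard-library `ast` module rather than tree-sitter-language-pack, and the names `SOURCE`, `fixed_size_chunks`, `ast_chunks`, and `parses` are hypothetical helpers, not part of any Kreuzberg API.

```python
import ast

# Hypothetical sample input; any valid Python source would do.
SOURCE = '''import os

def load(path):
    with open(path) as f:
        return f.read()

class Parser:
    def parse(self, text):
        return text.split()
'''


def fixed_size_chunks(text, size):
    """Plain text chunking: cut every `size` characters, ignoring syntax."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def ast_chunks(source):
    """AST-aware chunking: one chunk per top-level statement."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        # Start at the first decorator, if any, so decorated definitions stay whole.
        start = min([node.lineno] + [d.lineno for d in getattr(node, "decorator_list", [])])
        # lineno is 1-based and end_lineno is inclusive (Python 3.8+).
        chunks.append("\n".join(lines[start - 1:node.end_lineno]))
    return chunks


def parses(chunk):
    """Return True if the chunk is syntactically valid Python on its own."""
    try:
        ast.parse(chunk)
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    for label, chunks in [("fixed-size", fixed_size_chunks(SOURCE, 40)),
                          ("AST-aware", ast_chunks(SOURCE))]:
        valid = sum(parses(c) for c in chunks)
        print(f"{label}: {len(chunks)} chunks, {valid} parse on their own")
```

Fixed-size cuts can land in the middle of a statement, while the AST-aware version keeps each function or class intact, so every chunk is a complete unit that can be handed to an agent.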
kreuzberg@kreuzberg_dev·
Introducing Alef⚡️ You write a Rust library; Alef makes it usable in 16 languages with one command. Python, Node, Go, Ruby, Java, C#, PHP, Elixir, WASM, R, Kotlin, Gleam, Zig, C, Swift, Dart. It handles the full pipeline. No manual bindings or glue code. github.com/kreuzberg-dev/…
kreuzberg@kreuzberg_dev·
kreuzberg-txtai is live 🎉 Drop-in replacement for txtai's Textractor. Swap Apache Tika + Java for Kreuzberg's Rust-powered extraction: a wide range of formats, stable metadata, zero JVM. pip install kreuzberg-txtai → github.com/kreuzberg-dev/…
kreuzberg@kreuzberg_dev·
Introducing Kreuzcrawl, our high-performance web crawling engine. Built for AI agents from day one, with MCP server integration, real-time streaming, batch operations, and browser rendering for JS-heavy SPAs. 11 language bindings. One core engine.🚀 github.com/kreuzberg-dev/…
kreuzberg@kreuzberg_dev·
@springcentral The kreuzberg-spring-ai DocumentReader handles over 100 formats, has built-in OCR for more than 80 languages, keeps headings when splitting, lets you break content down by elements, and provides detailed metadata. Everything runs locally.
kreuzberg@kreuzberg_dev·
In this article, learn why agentic AI raises the stakes on document quality, what data readiness requires at scale, and how Kreuzberg Cloud will fill this infrastructure gap 💡 medium.com/@kreuzberg/beyond-the-model-why-document-intelligence-is-the-next-ai-infrastructure-layer-3ca0a7d18fb9?postPublishedType=repub
kreuzberg@kreuzberg_dev·
🔴 Live now: twitch.tv/namihirschfeld Want to see how @kreuzberg_dev gets built? Now's your chance. Our co-founder is streaming live: come ask questions and watch content intelligence take shape in real time. We'll be doing this a few times a week, so bookmark this channel ;)
kreuzberg@kreuzberg_dev·
Our tree-sitter-language-pack (v1.6) now supports 305 languages 🔥. Agents using it can process source code across 305 languages with the same structured output. No per-language setup required. MIT licensed. Open source. GitHub: github.com/kreuzberg-dev/…
kreuzberg@kreuzberg_dev·
Flawed document extraction is one of the biggest bottlenecks in RAG, and most pipelines don't see it coming. Find KreuzbergConverter in @Haystack_AI's core integrations. 91+ formats, local OCR, one component. Read more: medium.com/@kreuzberg/the-haystack-converter-that-handles-91-file-formats-without-a-cloud-api-0505b51e49fb
Haystack@Haystack_AI·
Most document parsing pipelines still rely on cloud APIs, external services, or brittle format-specific libraries stitched together. @kreuzberg_dev takes a different approach: a Rust-core document intelligence engine that extracts text, tables, and metadata from 91+ file formats entirely locally. No API calls. No data leaving your infrastructure. We've now integrated it into Haystack as a converter component. Drop in KreuzbergConverter to transform PDFs, DOCX, PPTX, scanned images, emails, archives, notebooks, and more into Haystack Document objects. 🐍 pip install kreuzberg-haystack 🔗 Documentation: haystack.deepset.ai/integrations/k…
kreuzberg@kreuzberg_dev·
KreuzbergConverter sits at the entry point of any indexing pipeline, turning raw files into clean Haystack Documents. Tables come out as structured output, languages are detected automatically, and each document carries a quality score. pip install kreuzberg-haystack 💥
kreuzberg@kreuzberg_dev·
@Haystack_AI Awesome, thank you for the shoutout! Excited to see Kreuzberg in the Haystack ecosystem: local-first document intelligence across the full format breadth.
kreuzberg@kreuzberg_dev·
KreuzbergConverter is now part of Haystack's core integrations and is managed upstream by deepset, makers of Haystack: github.com/deepset-ai/hay…
kreuzberg@kreuzberg_dev·
Kreuzberg now has integrations with three of the most widely used frameworks for building AI applications: @llama_index, Haystack by @deepset_ai, and @crewAIInc. Whichever stack you use, you can connect it to Kreuzberg's document intelligence engine.
kreuzberg@kreuzberg_dev·
If you're building RAG pipelines with LlamaIndex, two new packages give you structure-aware document ingestion out of the box. Try it out on GitHub: github.com/kreuzberg-dev/…