Haystack

663 posts

@Haystack_AI

Open-source AI orchestration framework by @deepset_ai. Build context-engineered agents & RAG systems in Python. Discord for support → https://t.co/19wuHcilYP

Joined August 2023
51 Following · 2.1K Followers
Pinned Tweet
Haystack@Haystack_AI·
One Name. One Product Family. One Look 💙 We’re unifying the Haystack ecosystem at @deepset_ai under one name and a new logo, reflecting its role as a framework, a community, and the foundation of our enterprise platform. 👉 Read the announcement: haystack.deepset.ai/blog/announcin…
Haystack@Haystack_AI·
Presidio is now available as a PII detection and anonymization integration in Haystack. Use PresidioEntityExtractor to identify sensitive information like email addresses, phone numbers, names, and credit card numbers in your documents and text. Pair it with PresidioDocumentCleaner or PresidioTextCleaner to automatically redact or mask that data before it flows through your pipelines. This is essential for applications handling user data at scale - financial systems, healthcare platforms, customer support tools, or any RAG pipeline that processes documents with personal information. Configure entity types and confidence thresholds to tune detection precision for your use case. Presidio, built and maintained by @Microsoft, has become the industry standard for PII detection and is now deeply integrated into Haystack workflows. Multi-language support means your data governance works across global datasets.
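The detect-then-mask flow described above, with selectable entity types and a confidence threshold, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Presidio's or Haystack's API: the `PATTERNS` table, the `Finding` class, the fixed scores, and the `detect` and `redact` helpers are all hypothetical, and two regexes stand in for a real recognizer stack.

```python
import re
from dataclasses import dataclass

# Hypothetical recognizer table: entity type -> (regex, fixed confidence score).
# Real detectors combine many recognizers; these two patterns are illustrative only.
PATTERNS = {
    "EMAIL_ADDRESS": (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), 0.9),
    "PHONE_NUMBER": (re.compile(r"\+?\d[\d\s().-]{7,}\d"), 0.6),
}


@dataclass
class Finding:
    entity_type: str
    start: int
    end: int
    score: float


def detect(text, entity_types, min_score):
    """Return findings for the requested entity types at or above min_score."""
    findings = []
    for entity_type in entity_types:
        pattern, score = PATTERNS[entity_type]
        if score < min_score:
            continue  # this recognizer is not confident enough for the caller
        for match in pattern.finditer(text):
            findings.append(Finding(entity_type, match.start(), match.end(), score))
    return sorted(findings, key=lambda f: f.start)


def redact(text, findings):
    """Mask each finding with a <TYPE> tag, working right to left so offsets stay valid."""
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + f"<{f.entity_type}>" + text[f.end:]
    return text


sample = "Contact Jane at jane.doe@example.com or +1 555 010 9999."
types = ["EMAIL_ADDRESS", "PHONE_NUMBER"]
strict = redact(sample, detect(sample, types, min_score=0.8))
loose = redact(sample, detect(sample, types, min_score=0.5))
print(strict)
print(loose)
```

Raising `min_score` trades recall for precision: with the assumed scores, the strict run masks only the email address, while the loose run also masks the phone number.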
Haystack retweeted
Bilge@bilgeycl·
WHAT AN EVENT!! 🎉 Last week we hosted an unconference as @deepset_ai with @NVIDIA at the @balderton office in London.
🎙️ We kicked off with two short talks. @LukawskiKacper demoed a multi-agent post-event report creator built with @Haystack_AI and Nemotron 3 Super, and @KarinSevegnani walked us through Nemotron 3's performance and benchmarks.
💬 Then we split into three discussion groups, and I led a session on memory and long-running agents. It was one of the most interesting conversations I've had in a while. We started with memory and second-brain concepts. People shared how they're building personal agents by transcribing YouTube videos and creating knowledge bases following @karpathy's wiki idea. We talked RAG vs. filesystems, MCPs vs. skills, AI sovereignty, and security. At the end of the day, you don't want to lock your whole identity to an API; privacy and freedom of choice matter.
Thank you to every attendee for your energy, openness, and contributions. I'm already looking forward to coming back to London 💙 See you at Unconference #4 👀
Haystack@Haystack_AI·
@Haystack_AI just crossed 25,000 stars on GitHub! This number means a lot to us, but what it really represents is you. Every contributor who opened a pull request. Every community member who answered a question on Discord. Every developer who filed an issue, wrote a notebook, gave a talk, or simply built something incredible with Haystack and shared it with the world. It's a true community effort 💙 When we first started Haystack, we believed that the journey to building great AI applications should be open, composable, and community-driven. 25,000 stars later, that belief has never felt more validated. Thank you to every single one of you: contributors, users, advocates, and builders. Let’s keep going! 🚀
Haystack@Haystack_AI·
Haystack pipelines need strong retrieval primitives: embedders, rerankers, extractors. Today we're announcing the native Haystack and SIE integration from @superlinked: drop-in Haystack components for every stage of the pipeline, self-hosted on 85+ open-source models.
Haystack@Haystack_AI·
Together, Thunderbolt + Haystack Platform deliver a complete sovereign AI stack. Thunderbolt and Haystack are both open-source, and our community has already shown that the two work together. Check out the integration docs! 👇 haystack.deepset.ai/integrations/t…
Haystack@Haystack_AI·
⚡ Thunderbolt is trending on GitHub today - and we're proud to be a @mozilla launch partner. MZLA (Mozilla's subsidiary behind Thunderbird) just launched Thunderbolt: an open-source, self-hostable AI client for enterprises and public sector organizations that need sovereignty over their AI stack. Native apps across web, desktop, and mobile.
Haystack retweeted
Stefano Fiorucci@theanakin87·
local Gemma 4 agent: drop in a map, get the location, live weather, and top spots to visit put together a notebook with @googlegemma + @Haystack_AI covering the above and 1. GitHub Agent: discovers the right tools from MCP on the fly, keeping context lean (h/t @vladblagoje) ...
Haystack@Haystack_AI·
Haystack 2.28 is here 🚀 This release makes agent tool development more flexible and document processing more precise. There are many interesting updates this time, but one highlight:
🔗 Pass Agent State Directly to Tools
No more manual wiring. Tools and components can now declare a state parameter and automatically receive the live agent State at runtime, giving them full access to conversation history and context without extra connections.
💙 Big thanks to our contributors to this release!
👇 Full release notes in the next post
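The idea of a tool declaring a state parameter and receiving it at call time can be illustrated with signature inspection. The following is a minimal sketch of the pattern only, not Haystack's implementation: the `State` class, `invoke_tool`, and both example tools are hypothetical stand-ins, and the real Agent and State classes may behave differently.

```python
import inspect


class State:
    """Hypothetical stand-in for an agent's shared state (not Haystack's State class)."""

    def __init__(self, data=None):
        self.data = dict(data or {})

    def get(self, key, default=None):
        return self.data.get(key, default)


def invoke_tool(func, llm_args, state):
    """Call a tool with the model-provided arguments.

    The live state is injected only when the tool declares a `state` parameter,
    so tools that do not need it keep their plain signature.
    """
    kwargs = dict(llm_args)
    if "state" in inspect.signature(func).parameters:
        kwargs["state"] = state
    return func(**kwargs)


def add(a: int, b: int) -> int:
    # A stateless tool: never sees the agent state.
    return a + b


def summarize_history(max_messages: int, state: State) -> str:
    # A stateful tool: reads conversation history from the injected state.
    messages = state.get("messages", [])
    return " | ".join(messages[-max_messages:])


state = State({"messages": ["hi", "what's the weather?", "sunny in Berlin"]})
plain_result = invoke_tool(add, {"a": 2, "b": 3}, state)
stateful_result = invoke_tool(summarize_history, {"max_messages": 2}, state)
print(plain_result)
print(stateful_result)
```

The design point is that the model only supplies `max_messages`; the runtime fills in `state`, so no explicit wiring between the agent and the tool is needed.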
Haystack@Haystack_AI·
Bigger context windows don't mean better agents - they mean more competition for the model's attention. New post: context engineering for agentic systems by @LukawskiKacper. What fills the context window, why bloat hurts quality and cost, and how Haystack gives you full control over your agent harness - the infrastructure layer that keeps it all under control. haystack.deepset.ai/blog/context-e… Part 1 of a series - what would you most want us to cover next?
Haystack retweeted
Afiz ⚡️@itsafiz·
How to build an Agentic RAG Application in simple steps? In this thread, I’ll show you how to build an Agentic RAG with automatic web search fallback using @Haystack_AI A step-by-step guide 👇🧵
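The routing decision behind "RAG with automatic web search fallback" can be sketched without any framework. This is a toy illustration under stated assumptions, not the thread's code or Haystack's API: `retrieve` uses word overlap in place of an embedding retriever, `fake_web_search` is a stub for a real search tool, and the `min_score` threshold is arbitrary.

```python
def tokenize(text):
    return set(text.lower().split())


def retrieve(query, documents, top_k=1):
    """Rank documents by the fraction of query words they contain.

    A toy stand-in for an embedding retriever; returns (score, document) pairs.
    """
    query_tokens = tokenize(query)
    if not query_tokens:
        return []
    scored = [(len(query_tokens & tokenize(doc)) / len(query_tokens), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def fake_web_search(query):
    # Hypothetical stub; a real system would call a search API or tool here.
    return [f"[web result for: {query}]"]


def answer_context(query, documents, min_score=0.5, web_search=fake_web_search):
    """Use local documents when retrieval is confident, otherwise fall back to the web."""
    hits = retrieve(query, documents)
    if hits and hits[0][0] >= min_score:
        return "local", [doc for _, doc in hits]
    return "web", web_search(query)


docs = [
    "haystack pipelines connect components",
    "agents call tools in a loop",
]
route_a, ctx_a = answer_context("haystack pipelines", docs)
route_b, ctx_b = answer_context("weather in london today", docs)
print(route_a, ctx_a)
print(route_b, ctx_b)
```

In an agentic setup the threshold check is often replaced by the model itself deciding whether the retrieved context is sufficient; the fixed score here only makes the fallback path easy to see.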
Haystack@Haystack_AI·
New tutorial: compress your LLM's KV cache with TurboQuant + Haystack 🗜️ by @LukawskiKacper We used turboquant-vllm (a community TurboQuant implementation, based on the recent @GoogleResearch paper) hooked into HuggingFaceLocalChatGenerator, but since Haystack is provider-agnostic, anything compatible with HuggingFace, Ollama, or vLLM works just as well. 👉 haystack.deepset.ai/tutorials/49_t…
Haystack@Haystack_AI·
MarkItDown by @Microsoft quickly became one of the most talked-about document conversion libraries in the Python ecosystem. The reason is simple: it just works. PDF, DOCX, PPTX, XLSX, HTML, images - all converted to clean Markdown, locally, with no external API calls. For RAG pipelines, that matters more than people realize. The quality of your conversions directly shapes the quality of your retrieval. Use MarkItDownConverter to convert virtually any file format into Haystack Document objects with Markdown content - standalone or wired directly into your indexing pipelines. Everything runs locally, so your data stays yours. 🐍 pip install markitdown-haystack 🔗 Documentation: haystack.deepset.ai/integrations/m…
Haystack@Haystack_AI·
A few things worth highlighting:
📄 91+ formats: office docs, images (OCR), markup, eBooks, email with attachments, archives processed recursively
🔒 Fully local: no external API dependencies
⚡ Parallel batch extraction out of the box
🧠 Rich metadata: quality scores, detected languages, keywords, table data, PDF annotations
🎛️ Flexible config: per-page extraction, token chunking, token reduction, Markdown output, and more
Haystack@Haystack_AI·
Most document parsing pipelines still rely on cloud APIs, external services, or brittle format-specific libraries stitched together. @kreuzberg_dev takes a different approach: a Rust-core document intelligence engine that extracts text, tables, and metadata from 91+ file formats entirely locally. No API calls. No data leaving your infrastructure. We've now integrated it into Haystack as a converter component. Drop in KreuzbergConverter to transform PDFs, DOCX, PPTX, scanned images, emails, archives, notebooks, and more into Haystack Document objects. 🐍 pip install kreuzberg-haystack 🔗 Documentation: haystack.deepset.ai/integrations/k…
Haystack@Haystack_AI·
PII in RAG pipelines is one of those problems that's easy to ignore. A name, an email, a social security number slipping through into a vector store or an LLM prompt is the kind of thing that keeps compliance teams up at night. @tonicfakedata Textual has been tackling PII detection and transformation with transformer-based NER models that cover 46+ entity types across 50+ languages. We've now integrated it into Haystack with two components. TonicTextualDocumentCleaner sanitizes documents before they ever reach your index - swapping real PII with realistic synthetic data or reversible placeholder tokens. TonicTextualEntityExtractor detects PII and attaches it as structured metadata directly onto your documents, ready for hybrid retrieval, auditing, or compliance workflows. 🐍 pip install textual-haystack 🔗 Documentation: haystack.deepset.ai/integrations/t…
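To make "reversible placeholder tokens" concrete, here is a small plain-Python sketch. It is not Tonic Textual's API or token format: the `EMAIL` regex, the `[EMAIL_n]` token shape, and the `tokenize_pii` and `restore` helpers are assumptions for illustration, and only email addresses are handled.

```python
import re

# Illustrative pattern only; production systems use NER models, not a single regex.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def tokenize_pii(text):
    """Replace each distinct email with a stable placeholder and return the mapping."""
    token_for_value = {}
    mapping = {}

    def substitute(match):
        value = match.group(0)
        if value not in token_for_value:
            token = f"[EMAIL_{len(token_for_value) + 1}]"
            token_for_value[value] = token
            mapping[token] = value
        return token_for_value[value]

    return EMAIL.sub(substitute, text), mapping


def restore(text, mapping):
    """Reverse the substitution using the mapping kept outside the index."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


original = "Mail ana@example.org, cc bob@example.org, then ana@example.org again."
masked, mapping = tokenize_pii(original)
print(masked)
print(restore(masked, mapping))
```

Because the same value always maps to the same token, the masked text still supports joins and retrieval on the placeholder, while the mapping needed to reverse it can be stored separately from the vector store.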