🚀🚀 Super excited to share the latest benchmark results for our quantized BGE models.
A few weeks ago, we introduced these models with the aim of improving performance and efficiency for generating embeddings. We've since run thorough comparisons between PyTorch SentenceTransformers and our DeepSparse-optimized models on both a 10-core laptop and a 16-core AWS instance.
The benchmarks show significant improvements in processing speed. For example, the quantized bge-small model achieves up to a 3X speedup on the 10-core laptop. Even better, on the 16-core AWS instance these models achieved up to a 5X speedup.
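For context, a speedup figure like "3X" is just the ratio of per-batch latencies between the two backends. A minimal, backend-agnostic timing sketch — where `encode_baseline` and `encode_optimized` are hypothetical stand-ins for the SentenceTransformers and DeepSparse encode calls, not real APIs — might look like:

```python
import time
import statistics

def median_seconds_per_batch(encode_fn, batches, warmup=2, repeats=5):
    """Median wall-clock time per batch for an embedding function."""
    for _ in range(warmup):           # warm up caches / JIT before timing
        for batch in batches:
            encode_fn(batch)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for batch in batches:
            encode_fn(batch)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings) / len(batches)

# Hypothetical stand-ins for the two backends being compared.
def encode_baseline(batch):       # e.g. PyTorch SentenceTransformers
    return [[len(s)] for s in batch]

def encode_optimized(batch):      # e.g. DeepSparse pipeline
    return [[len(s)] for s in batch]

batches = [["a sentence", "another sentence"]] * 4
t_base = median_seconds_per_batch(encode_baseline, batches)
t_opt = median_seconds_per_batch(encode_optimized, batches)
speedup = t_base / t_opt
print(f"speedup: {speedup:.2f}x")
```

Warming up and taking the median (rather than a single run) keeps one-off system hiccups from skewing the reported multiplier.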
🤗 Updated model cards:
bge-small-quant: huggingface.co/zeroshot/bge-s… 6K+ downloads
bge-base-quant: huggingface.co/zeroshot/bge-b… 2K+ downloads
bge-large-quant: huggingface.co/zeroshot/bge-l… 2K+ downloads (#1 model for STS datasets on the MTEB leaderboard)
Don't forget to check out the DeepSparse repo github.com/neuralmagic/de… for more information on benchmarking and running these models on the MTEB leaderboard. 💥
cc @neuralmagic
I love the #ChatGPT Cheat Sheet by Ricky Costa (@Quantum_Stat)
which includes
🔹NLP Tasks
🔹Code
🔹Structured Output Styles
🔹Unstructured Output Styles
🔹Media Types
🔹Meta ChatGPT
🔹Expert Prompting
Get your hands on this amazing resource at: i.mtr.cool/ehyhxpfexx
Excited to announce that our Web AI Agent
@MultiON_AI is now 10X faster than human finger speeds 🚀🚀
Here's our very first SloMo video shot in 0.2x 🔥:
Watch it with the audio on 🔊 to hear the difference!
Judge for yourself which is faster: human typing or our AI agent 🤩!
⚡IT HAPPENED!⚡
There's a new state-of-the-art sentence embeddings model for the semantic textual similarity task on Hugging Face's MTEB leaderboard 🤗!
Bge-large-en-v1.5-quant is the model I quantized in under an hour with a single CLI command using Neural Magic's open-source library Sparsify! Not only is it exported to ONNX and quantized to INT8 (faster and lighter), it also runs on CPUs using DeepSparse! 💥
Check the image below for an example of what I'm describing 👇 We are soon releasing a notebook with an end-to-end example so anyone can replicate the compressed bge models, which achieve great accuracy results on the MTEB leaderboard.
cc @neuralmagic
Model: huggingface.co/zeroshot/bge-l…
🎂It's LangChain's First Birthday!
Today, we're feeling especially thankful for our community.
Some reflections and highlights from our first year building together. Thank you 🙏🏽
💚 The LangChain Team
blog.langchain.dev/langchains-fir…
⚡Getting to Know the NPZ File Format to Compress BGE Embedding Models⚡
For one-shot quantization (INT8), Sparsify relies on the .npz format for data storage, a file format rooted in the mighty NumPy library.
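As a rough illustration of the .npz format mentioned above in connection with one-shot quantization: it is simply a zip archive of named NumPy arrays. The array names used here (`input_ids`, `attention_mask`) are illustrative assumptions for a transformer calibration sample, not a schema Sparsify is documented to require:

```python
import numpy as np

# Hypothetical calibration sample for one-shot INT8 quantization.
input_ids = np.random.randint(0, 30522, size=(1, 128), dtype=np.int64)
attention_mask = np.ones((1, 128), dtype=np.int64)

# savez bundles the named arrays into a single .npz archive on disk.
np.savez("sample_0000.npz", input_ids=input_ids, attention_mask=attention_mask)

# Reading it back: the loaded object acts like a dict of arrays.
loaded = np.load("sample_0000.npz")
print(sorted(loaded.files))  # ['attention_mask', 'input_ids']
assert (loaded["input_ids"] == input_ids).all()
```

Because each array keeps its name, dtype, and shape, a directory of such files is a convenient, framework-neutral way to ship calibration data.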
Exciting News! 🚀 DeepSparse is now integrated with @langchain, opening up a world of possibilities in Generative AI on CPUs. LangChain, known for its innovative design paradigms for large language model (LLM) applications, has often been constrained by expensive APIs or cumbersome GPUs.
But with Neural Magic's DeepSparse integration, developers can now accelerate their models on CPU hardware, making it a breeze to create powerful LangChain applications.
LangChain docs: python.langchain.com/docs/integrati…
DeepSparse LangChain blog: neuralmagic.com/blog/building-…
cc @hwchase17 @neuralmagic
🌟First, I want to thank everyone for pushing this model past 1,000 downloads in just a few days!! I've also added the bge-base models to MTEB.
Most importantly, code snippets for running inference have been added to the model cards for everyone to try out!
huggingface.co/zeroshot/bge-s…
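Whatever backend the model-card snippets use, the output is embedding vectors that you typically compare with cosine similarity. A minimal NumPy helper — with made-up 4-d vectors standing in for real model outputs — looks like:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up "embeddings" standing in for real model outputs.
emb_query = [0.1, 0.3, 0.5, 0.1]
emb_doc = [0.1, 0.3, 0.5, 0.1]
emb_other = [0.9, -0.2, 0.0, 0.1]

print(cosine_similarity(emb_query, emb_doc))    # ≈ 1.0 for identical vectors
print(cosine_similarity(emb_query, emb_other))  # lower for dissimilar vectors
```

Scores near 1.0 indicate semantically similar texts, which is exactly what the STS tasks on the MTEB leaderboard measure.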