Andrej Baranovskij

5.3K posts

@andrejusb

Sparrow Creator: Open-Source AI Doc Extraction 🚀 | ML/Oracle Dev | @katana_ml | Try: https://t.co/V0h9FMJzKb | https://t.co/nRgXgLL0mO

Katana ML 👉 Joined March 2010

153 Following · 6.7K Followers

Pinned Tweet

Andrej Baranovskij@andrejusb·Nov 12

We launched Katana ML katanaml.io in 2018 and now it is time to update the website, to explain where we are now and what we do with #MachineLearning, #MLOps, and #opensource 🚀🚀🚀

Katana@katana_ml

We have a new website - katanaml.io It explains what we do with ML in a simple and straightforward way. It is featuring our open source product Skipper, we are using it to run #MLOps. #MachineLearning #MLOps

Andrej Baranovskij retweeted

Minghao Wu@WuMinghao_nlp·1d

@natolambert I don't know how this dude got this conclusion. As far as I know, we are going to keep cooking and open-sourcing SOTA LLMs for the community.

Andrej Baranovskij@andrejusb·3d

@sophiamyang Small, but not so small :)

Sophia Yang, Ph.D.@sophiamyang·3d

Mistral AI for Developers@MistralDevs

🔥 Meet Mistral Small 4: One model to do it all. ⚡ 128 experts, 119B total parameters, 256k context window ⚡ Configurable Reasoning ⚡ Apache 2.0 ⚡ 40% faster, 3x more throughput Our first model to unify the capabilities of our flagship models into a single, versatile model.

Andrej Baranovskij retweeted

vLLM@vllm_project·3d

🎉 Congrats to @MistralAI on releasing Mistral Small 4 — a 119B MoE model (6.5B active per token) that unifies instruct, reasoning, and coding in one checkpoint. Multimodal, 256K context. Day-0 support in vLLM — MLA attention backend, tool calling, and configurable reasoning mode, verified on @nvidia GPUs. 🔗 huggingface.co/mistralai/Mist…

Mistral AI for Developers@MistralDevs

Andrej Baranovskij retweeted

Prince Canuma@Prince_Canuma·3d

Day-0 support on MLX for Mistral Small 4🚀 Congratulations to the @MistralAI team on the release.

Mistral AI for Developers@MistralDevs

Andrej Baranovskij retweeted

𝗭𝗲𝗻 𝗠𝗮𝗴𝗻𝗲𝘁𝘀@ZenMagnets·3d

@andrejusb Was interested in content but didn't have time. Here's a summary. Will give a thumbs up on the video though.

Andrej Baranovskij retweeted

vLLM@vllm_project·4d

vLLM Production Stack now has an end-to-end deployment guide on @OracleCloud OKE 🚀 Self-hosted LLM inference on OCI bare metal GPUs (A10, A100, H100) — from provisioning to first request. OCI deployment scripts are contributed and maintained in the official production-stack repo. Great option for teams that need full control over GPU drivers, CUDA versions, and model configs while keeping cloud elasticity. Thanks @OracleDevs!

Oracle Developers@OracleDevs

This tutorial walks you through deploying the vLLM Production Stack on OKE—from infrastructure provisioning to running your first inference request. social.ora.cl/6012hNgEp

Andrej Baranovskij@andrejusb·3d

@MoonliteTechLLC Sparrow can process pdf out of the box, multipage pdf too, as well as png and jpg

moonliteTech@MoonliteTechLLC·3d

@andrejusb you used a png, is it able to grab from a pdf as well, or would you convert it to png first?

Andrej Baranovskij@andrejusb·4d

Qwen 3.5 Test for JSON Structured Data Extraction. Quick test of the new Qwen 3.5 models on JSON structured data extraction from images. Testing and comparing results for 9B FP16, 27B Q8, and A3B 35B Q8. The 35B Q8 model wins in terms of both speed and accuracy. The test was run on MLX-VLM using a Mac Mini M4 Pro with 64GB RAM. Video: youtube.com/watch?v=zCoBF1… Code: github.com/katanaml/sparr… Sparrow UI: sparrow.katanaml.io

Andrej Baranovskij@andrejusb·4d

@runsonai I think the same, yes

Thanh Pham@runsonai·4d

@andrejusb I’m guessing it shows that MoE works well with 35B. Same memory as 27B (dense) and yet faster too.

Andrej Baranovskij@andrejusb·4d

@ivanfioravanti Yes, always need to believe in success and keep pushing, this is how it works :)

Ivan Fioravanti ᯅ@ivanfioravanti·4d

@andrejusb Thanks Andrej and keep pushing with Sparrow, you have built something great there!

Ivan Fioravanti ᯅ@ivanfioravanti·4d

x.com/i/article/2033…

Andrej Baranovskij retweeted

Igor Lessio - Robots/acc - AIFlow Labs@AIFlow_ML·6d

DGX Spark is slow as fuck

Andrej Baranovskij@andrejusb·5d

@ivanfioravanti LinkedIn keeps sending profile viewer info. Totally useless. What am I supposed to do, contact the people who are viewing my profile, or what? lol :)

Ivan Fioravanti ᯅ@ivanfioravanti·6d

LinkedIn is so terrible! It’s beyond cringe! Why most people love to appear so dumb publicly???

Andrej Baranovskij retweeted

Valeriy M., PhD, MBA, CQF@predict_addict·6d

Why Dividing by 5 Is Easy A simple arithmetic observation. To divide by 5, multiply by 2 and divide by 10. Example: 85 ÷ 5 Multiply by 2 → 170 Divide by 10 → 17 Why does this work? Because 5 = 10 / 2 So dividing by 5 is the same as multiplying by 2 and then dividing by 10.
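
The same observation as a minimal Python sketch. The function name `divide_by_five` is made up for illustration; it simply doubles the input and then divides by 10, which works because 5 = 10 / 2.

```python
def divide_by_five(n: int) -> float:
    """Divide n by 5 by doubling it and then dividing by 10."""
    doubled = n * 2      # multiply by 2
    return doubled / 10  # divide by 10


# The example from the post: 85 -> 170 -> 17.0
print(divide_by_five(85))
```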

Andrej Baranovskij@andrejusb·6d

@julien_c @huggingface Already a pro user for a long time :)

Julien Chaumond@julien_c·12 Mar

get PRO on @huggingface and instantly 10x your storage to 1 TB private + 10 TB public ...for $9 a month 😮 a deal this good should be illegal

Andrej Baranovskij retweeted

Robert Scoble@Scobleizer·12 Mar

49 years ago my dad bought an Apple II and my junior high, Hyde, in Cupertino, became one of the first schools to get an Apple II. I was one of five kids in its first computer club. 48 years ago my mom got a job building Apple II motherboards. She paid me and my brothers to help make them. Learned how to solder on them. Since then my life has always been affected by Apple. Siri was launched in my house. Was the first to buy an iPhone at Steve Jobs store in Palo Alto. Wrote two books about spatial computing because it kept buying startups I interviewed. Studied every morning for a semester with Apple cofounder @stevewoz. Great friend Andy Grignon was one of first 12 to build the iPhone. It has brought me so much magic. It is why I am still in love with new things and the people who build them today. Happy 50th!

Tim Cook@tim_cook

April 1st marks 50 years of Apple. Thank you to everyone who’s been a part of our journey. apple.com/50-years-of-th… #Apple50

Andrej Baranovskij@andrejusb·12 Mar

Fast Large Table Extraction: Sparrow + dots.ocr to JSON. Sparrow provides a table processing mode. It is optimized to handle large tables and comes with a separate template script (new templates can be easily added) that processes dots.ocr markdown output into structured JSON with field mapping. Video: youtube.com/watch?v=BJKCq_… Code: github.com/katanaml/sparr… Sparrow UI: sparrow.katanaml.io
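
To make the idea of turning markdown table output into structured JSON with field mapping concrete, here is a minimal sketch in plain Python. It is not Sparrow's template script or the dots.ocr API: the function names `parse_markdown_table` and `map_fields`, the sample table, and the `FIELD_MAP` keys are all hypothetical, and it assumes a simple pipe-delimited table with one header row and one separator row.

```python
import json


def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Parse a simple pipe-delimited markdown table into a list of row dicts."""
    lines = [ln.strip() for ln in markdown.strip().splitlines() if ln.strip()]
    rows = [[cell.strip() for cell in ln.strip("|").split("|")] for ln in lines]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator row
    return [dict(zip(header, row)) for row in body]


def map_fields(records: list[dict[str, str]], field_map: dict[str, str]) -> list[dict[str, str]]:
    """Rename column headers to target JSON field names; unmapped keys are kept."""
    return [{field_map.get(key, key): value for key, value in rec.items()} for rec in records]


# Hypothetical OCR output and mapping, for illustration only.
SAMPLE_TABLE = """
| Item | Qty | Unit Price |
|---|---|---|
| Widget | 2 | 4.50 |
| Gadget | 1 | 12.00 |
"""

FIELD_MAP = {"Item": "description", "Qty": "quantity", "Unit Price": "unit_price"}

records = map_fields(parse_markdown_table(SAMPLE_TABLE), FIELD_MAP)
print(json.dumps(records, indent=2))
```

Keeping the mapping in a separate dictionary means a new table layout only needs a new mapping, not new parsing code.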

Andrej Baranovskij retweeted

vLLM@vllm_project·7 Mar

🚀 vLLM v0.17.0 is here! 699 commits from 272 contributors (48 new!) This is a big one. Highlights: ⚡ FlashAttention 4 integration 🧠 Qwen3.5 model family with GDN (Gated Delta Networks) 🏗️ Model Runner V2 maturation: Pipeline Parallel, Decode Context Parallel, Eagle3 + CUDA graphs 🎛️ New --performance-mode flag: balanced / interactivity / throughput 💾 Weight Offloading V2 with prefetching 🔀 Elastic Expert Parallelism Milestone 2 🔧 Quantized LoRA adapters (QLoRA) now loadable directly

Andrej Baranovskij retweeted

Prince Canuma@Prince_Canuma·7 Mar

mlx-vlm v0.4.0 is here 🚀 New models: • Moondream3 by @vikhyatk • Phi-4-reasoning-vision by @MSFTResearch • Phi4-multimodal-instruct by @MSFTResearch • Minicpm-o-2.5 (except tts) by @OpenBMB What's new: → Full weight finetuning + ORPO h/t @ActuallyIsaak → Tool calling in server → Thinking budget support → KV cache quantization for server → Fused SDPA attention optimization → Streaming & OpenAI-compatible endpoint improvements Fixes: • Gemma3n • Qwen3-VL • Qwen3.5-MoE • Qwen3-Omni h/t @ronaldseoh • Batch inference, and more. Big shoutout to 7 new contributors this release! 🙌 Get started today: > uv pip install -U mlx-vlm Leave us a star ⭐️ github.com/Blaizzy/mlx-vl…

Andrej Baranovskij retweeted

Steve the Beaver@beaversteever·4 Mar

incredible that we built all this RAG and vector database stuff and it turns out that grep from 1973 works better than all that
