Ivan Chan

80 posts

@ivanchanavinah

CTO/Cofounder at @RunLocalAI (YC S24). Helping engineering teams ship better on-device AI faster and without the hassle. 🇭🇰 / 🇬🇧

Joined December 2019

524 Following · 127 Followers

Ivan Chan retweeted

Ismail Salim@IssySalim·Feb 20

🔥 VLMs on mobile devices with world-facing cameras key for proactive, intelligent computing. Local/on-device inference key for real-time, private experiences. Great to see an emphasis on smaller VLMs. Excited to see where @huggingface, @moondreamai, etc. take things 🚀

Miquel Farré@micuelll

Holy shit! Did we just open-source the smallest video-LM in the world? SmolVLM2 is running natively on your iPhone 🚀 huggingface.co/blog/smolvlm2

Ivan Chan retweeted

Ismail Salim@IssySalim·Feb 17

Snap's announcement about their on-device text-to-image model seems to have slipped under the radar… Apparently, it generates 1024x1024 images with quality that's comparable to cloud-oriented models like Stable Diffusion XL. But it can do that locally on an iPhone 16 Pro Max in <1.5 seconds! 🤯 Snap are planning to ship it soon to their ~450m daily active users, and I wouldn't be surprised if it's free. I wonder how all these subscription-driven, cloud-based image generation apps will respond… Announcement: newsroom.snap.com/ai-text-to-ima… Paper: arxiv.org/abs/2412.09619

Ivan Chan retweeted

Ismail Salim@IssySalim·Feb 16

Short but sweet talk about the WebNN API: youtube.com/watch?v=FoYBWz… Def worth checking out the YouTube playlist from @jason_mayes WebAI Summit last year. It's packed with great talks! Looking forward to the next summit!

Ivan Chan retweeted

Ismail Salim@IssySalim·Feb 12

Awesome work from @soldni and team at @allen_ai! If you're interested in shipping on-device AI language features in your app, I highly recommend checking out their demo app to get a sense of what's possible these days on an iPhone: apps.apple.com/us/app/ai2-olm…

Ai2@allen_ai

We took our most efficient model and made an open-source iOS app📱but why? As phones get faster, more AI will happen on device. With OLMoE, researchers, developers, and users can get a feel for this future: fully private LLMs, available anytime. Learn more from @soldni👇

Ivan Chan@ivanchanavinah·Jan 26

@flat github.com/kermitt2/grobid

Stephen Panaro@flat·Jan 25

Is there a “research paper → LLM text” tool? Something that doesn’t mangle formulas, weird text layouts, images for VLMs, etc

Ivan Chan@ivanchanavinah·Jan 14

@flat fantastic!

Stephen Panaro@flat·Jan 14

Read about it here: stephenpanaro.com/blog/modernber…

Stephen Panaro@flat·Jan 14

Two ways to run ModernBERT on Apple Neural Engine: A: Slow, inaccurate, but easy. B: Fast, accurate, but a lil’ tricky. Wrote about how to get from A to B.

Ivan Chan@ivanchanavinah·Jan 12

Will it ever be feasible to run a 10M+ token prompt on a local LLM, even if the LLM supports it?

Ivan Chan@ivanchanavinah·Jan 12

@awilkinson NotebookLM

Andrew Wilkinson@awilkinson·Jan 11

Has anyone made a simple web app where you can upload huge amounts of PDF docs/text files/images and have it convert them into a lightweight text/CSV format to maximize LLM context window?

Ivan Chan@ivanchanavinah·Jan 12

@Andreadful_ @ivanfioravanti @cognitivecompai This is great! Finally some detailed evaluations

Ivan Fioravanti ᯅ@ivanfioravanti·Jan 6

Incredible difference in accuracy between 4bit and 8bit with 8B models like Dolphin3.0-Llama3.1-8B 🤯 An 18.6-point difference! 🤯 Tested with MLX on French wine reviews:
- 4bit accuracy: 40.60%
- 8bit accuracy: 59.20%
4bit is pretty bad on 8B models, no?

Ivan Chan@ivanchanavinah·Jan 10

@omarsar0 I think NotebookLM does CAG, and it’s far better than any RAG solution I’ve ever used

elvis@omarsar0·Jan 7

Don't do RAG

Proposes cache-augmented generation (CAG) to eliminate retrieval latency and minimize retrieval errors.

What is CAG? CAG aims to leverage the capabilities of long-context LLMs by preloading the LLM with all relevant docs in advance and precomputing the key-value (KV) cache. The preloaded context helps the model to provide contextually accurate answers without the need for additional retrieval during runtime.

When to apply CAG? It's a useful alternative to RAG for cases where the documents/knowledge for retrieval are of limited, manageable size.

My thoughts: As LLMs advance in capabilities, I suspect that what we know as RAG today could change significantly, either architecturally or in how it's optimized. CAG is one in a growing list of developments and new ideas that have emerged recently to address limitations like poor retrieval relevancy and latency. There could also be hybrid methods that combine preloading with selective retrieval. Don't sleep on long-context LLMs. They are here to stay.

Ivan Chan@ivanchanavinah·Jan 6

@ivanfioravanti @cognitivecompai arxiv.org/pdf/2407.09141 this might be a relevant paper!

Ivan Fioravanti ᯅ@ivanfioravanti·Jan 6

@cognitivecompai Yes and it was slightly lower than 8B that sounded strange to me. I’ll retry tomorrow with more reviews. It’s local, virtually 0 costs 😎

Ivan Chan@ivanchanavinah·Nov 23

@flat Sorry I meant *best* inference time (not highest)

Ivan Chan@ivanchanavinah·Nov 23

Let me get back to you once I’m with my computer :) but I think you’re right, because we do compression “sweeps” all the time. And usually, the pruning configs with the lowest accuracy have the highest inference time. So yes, it sounds like higher ratios lead to faster inference times

Stephen Panaro@flat·Oct 16

Lengthening Llama 3.2 1B’s context slows it down. What about speeding it up? Swept ~all quantization options on Apple Neural Engine. Takeaways:
→ Under 4 bit not worth it on M1
→ Some bottleneck >6 bit on iPhone 15 Pro
Confused by the second one, but measured twice.

Ivan Chan retweeted

Sam@SamuelCrombie·Aug 29

Hello, world!

Ivan Chan retweeted

David Lieb@dflieb·Aug 29

Everything I wish I knew about retention before building Google Photos and Bump.

Y Combinator@ycombinator

How do you actually know if you’re making something people want? One of the best ways to measure successful growth is with cohort retention, which tracks the fraction of new users that come back time and time again to use your product.
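As a rough illustration of the cohort retention idea described above, here is a minimal Python sketch. It assumes a hypothetical activity log of `(user_id, day)` pairs with integer day indices, groups users by the week they first appear, and reports the fraction of each cohort active in later weeks. The function name, the weekly bucketing, and the sample data are all assumptions made for illustration, not anything from the linked talk.

```python
from collections import defaultdict


def cohort_retention(events, period_days=7):
    """Compute retention per cohort from (user_id, day) activity pairs.

    Returns {cohort: {offset: fraction}}: for users first seen in period
    `cohort`, the fraction who were active `offset` periods later.
    """
    # Collect the set of periods (e.g. weeks) in which each user was active.
    periods_by_user = defaultdict(set)
    for user_id, day in events:
        periods_by_user[user_id].add(day // period_days)

    cohort_sizes = defaultdict(int)
    active = defaultdict(lambda: defaultdict(int))
    for periods in periods_by_user.values():
        cohort = min(periods)  # first period in which the user appears
        cohort_sizes[cohort] += 1
        for period in periods:
            active[cohort][period - cohort] += 1

    return {
        cohort: {
            offset: count / cohort_sizes[cohort]
            for offset, count in sorted(offsets.items())
        }
        for cohort, offsets in sorted(active.items())
    }


# Hypothetical activity log: three users join in week 0, one in week 1.
events = [
    ("a", 0), ("a", 8), ("a", 15),
    ("b", 1), ("b", 9),
    ("c", 2),
    ("d", 7), ("d", 14),
]
retention = cohort_retention(events)
```

In this made-up log, two of the three week-0 users return in the following week and one returns the week after, so the week-0 curve decays while the single week-1 user stays active. A curve that flattens above zero is the usual sign that some users keep coming back.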

Ivan Chan@ivanchanavinah·Aug 25

@juliarturc this is amazing!

Ivan Chan retweeted

Julia Turc@juliarturc·Aug 24

There’s so much OSS out there, and I’m like a kid in the candy store. I want to try it all, but there are only so many hours in the day. Sometimes I just want to ask some high-level questions before deciding whether to invest in a repo. We made Code Sage to scratch that itch: fresh knowledge about OSS repos, with Perplexity-like references. DM me if you want a particular repo supported!

Ivan Chan retweeted

David Lieb@dflieb·Aug 24

I already take @Waymo for granted, but I just found this photo from 2012 of the first time I saw one on the road in Mountain View. 12 years of persistence and hard work.
