Moez AI

1.1K posts

@wizardai_m

AI Engineer | AI Enthusiast | Urban Photographer | 🚀✨📸

New York · Joined September 2022
374 Following · 28 Followers
Moez AI retweeted
Daniel Han @danielhanchen
New Unsloth Studio update!
1. 10x faster via pre-compiled llama.cpp + mamba binaries
2. 6x faster installs with 50% less disk space via bun, uv
3. Studio is now in PATH + `unsloth studio update` works
4. Lots of UI/UX improvements
And my fav: Desktop + launch shortcuts for Studio!
Unsloth AI @UnslothAI

You don’t need to manually set LLM parameters anymore! llama.cpp uses only the context length + compute your local setup needs. Unsloth also auto-applies the correct model settings Try in Unsloth Studio - now with precompiled llama.cpp binaries. GitHub: github.com/unslothai/unsl…

Replies 5 · Retweets 15 · Likes 131 · Views 11.7K
Moez AI retweeted
DailyPapers @HuggingPapers
MinerU-Diffusion
A 2.5B diffusion-based OCR model that replaces slow autoregressive decoding with parallel block-wise diffusion, achieving up to 3.2x faster inference while improving robustness on complex documents with tables, formulas, and layouts.
Replies 2 · Retweets 31 · Likes 158 · Views 9.9K
Moez AI retweeted
Wildminder @wildmindai
Vibecoded TurboQuant looks really promising:
- 3.25 bits, 4.9x compression
- 4.25 bits, 3.8x compression
Just waiting for llama.cpp to fully support this beast... I’ll hand off all simple agentic tasks to Qwen3.5 27B. github.com/TheTom/turboqu…
Google Research @GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

Replies 9 · Retweets 14 · Likes 94 · Views 10.8K
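As a quick sanity check on the compression figures in the tweet above: both ratios are consistent with a 16-bit baseline per value. The tweet does not state its baseline, so 16 bits (for example fp16) is an assumption here, and `compression_ratio` is a hypothetical helper for illustration, not part of any TurboQuant code.

```python
def compression_ratio(bits_per_value: float, baseline_bits: float = 16.0) -> float:
    """Ratio of baseline storage to compressed storage per value.

    baseline_bits=16 is an assumption (e.g. fp16); the tweet does not say.
    """
    return baseline_bits / bits_per_value


for bits in (3.25, 4.25):
    print(f"{bits} bits -> {compression_ratio(bits):.1f}x")
# 3.25 bits -> 4.9x
# 4.25 bits -> 3.8x
```

Under that assumption, the quoted 4.9x and 3.8x are simply 16 / 3.25 and 16 / 4.25, rounded to one decimal place.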
Moez AI retweeted
DeepManim @manimable
TurboQuant
AI models waste massive memory on vectors. Compressing them usually adds overhead, defeating the purpose. Google's new paper uses just 1 extra bit to eliminate that overhead. Result: same accuracy, way less memory. Accepted at ICLR 2026. The trick? Random rotations + a 50-year-old math theorem. Here is a deepmanim.com overview of the paper. #manim
Replies 0 · Retweets 2 · Likes 1 · Views 23
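To give a rough feel for why a random rotation helps before quantizing, here is a minimal sketch. It is not the TurboQuant algorithm: it swaps in a randomized Hadamard transform (random sign flips followed by a normalized Walsh-Hadamard transform) as the rotation and a plain uniform scalar quantizer, and every function name below is hypothetical. The idea it illustrates is only that an orthogonal rotation preserves the vector's norm while spreading a single large outlier across all coordinates, so a low-bit quantizer has a much smaller range to cover.

```python
import math
import random


def fwht(vec):
    """Normalized fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in out]


def rotate(vec, signs):
    # Random sign flips, then Hadamard: an orthogonal stand-in for a random rotation.
    return fwht([s * x for s, x in zip(signs, vec)])


def unrotate(vec, signs):
    # The normalized Hadamard transform is its own inverse; then undo the sign flips.
    return [s * x for s, x in zip(signs, fwht(vec))]


def quantize(vec, bits):
    # Uniform scalar quantizer over the observed [min, max] range.
    levels = 2 ** bits - 1
    lo, hi = min(vec), max(vec)
    step = (hi - lo) / levels or 1.0  # avoid a zero step for constant vectors
    codes = [round((x - lo) / step) for x in vec]
    return codes, lo, step


def dequantize(codes, lo, step):
    return [lo + c * step for c in codes]


rng = random.Random(0)
n = 64
vec = [0.0] * n
vec[3] = 10.0  # a single large outlier coordinate
signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]

rotated = rotate(vec, signs)  # the outlier's energy is now spread over all 64 entries
codes, lo, step = quantize(rotated, bits=4)
restored = unrotate(dequantize(codes, lo, step), signs)
error = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, restored)))
```

In this toy case the largest magnitude drops from 10 to 1.25 after rotation, which is the effect that makes low-bit quantization of the rotated vector easier. The single-spike input is a best case chosen for clarity; real vectors will not reconstruct this cleanly.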
Moez AI retweeted
Hasan Toor @hasantoxr
Finally. A native memory plugin is exactly what OpenClaw needs. A hierarchical, file-based architecture lets the agent actually structure its thoughts accurately instead of just doing similarity matching.
andy nguyen @kevinnguyendn

x.com/i/article/2036…

Replies 11 · Retweets 4 · Likes 39 · Views 14.3K
Moez AI retweeted
Jack Pertschuk @jack_pertschuk
Interesting result. Based on the blog, it looks only marginally better than traditional product quantization at the same bit width, but with no need for codebooks or expensive memory lookups. Curious to see the e2e results on ANN bench vs. PQ, etc.
Google Research @GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

Replies 0 · Retweets 1 · Likes 8 · Views 1.1K
Moez AI retweeted
Rohan Paul @rohanpaul_ai
Google DeepMind has unveiled a browser powered by its Gemini 3.1 Flash-Lite model that generates entire websites in real time as users browse.

Google’s Flash-Lite Browser treats the web like something an LLM can write live, not something humans must fully pre-build first. A normal site serves stored pages and templates, but this system uses Gemini 3.1 Flash-Lite to generate fresh HTML and CSS from your prompt, clicks, and navigation context almost instantly.

The technical shift is simple: instead of fetching a finished page, the browser asks the model what page should exist right now, then streams that answer as interface code. That makes personalization much deeper, because the page can change for each user, each step, and each goal without keeping a huge library of prewritten screens.

It also fits agentic workflows, where an AI assistant may need to create a temporary tool, dashboard, or reference page on the fly while working through a task.

IMO, the catch is reliability: once page layout and content are model outputs, bugs, hallucinations, style drift, and serving cost all become concerns.
Google DeepMind @GoogleDeepMind

Watch how fast Gemini 3.1 Flash-Lite can generate websites. ⚡ This browser creates each page in real-time as you click, search, and navigate. Give it a try → goo.gle/4t9In1R

Replies 20 · Retweets 49 · Likes 331 · Views 53K
Moez AI retweeted
plannotator @plannotator
Compound Planning: if you've been using plannotator consistently, there is an opportunity to improve how your agents plan for you. We're going to introduce a skill that lets you see your own insights and will eventually create an automated feedback loop. The point is to consistently refine and optimize the planning that works best for you. @pyrons_ thank you for the good idea; this is similar to the analysis I did when looking into insights for the quick label feature. Preview of what the MVP skill outputs
Replies 4 · Retweets 4 · Likes 30 · Views 2.4K
Moez AI retweeted
Picassio @ocbieuvang
I have created my own Agent Board for multiple agents working together using @badlogicgames' pi in the background. It features a fun 3D office where the agents work, and it utilizes @LakshyAAAgrawal's GEPA to auto-optimize the agent system prompts and system prompt templates
Replies 5 · Retweets 5 · Likes 64 · Views 6.5K
Moez AI retweeted
plannotator @plannotator
Plannotator 0.15.0 is here. The War of the Code Review continues. We're fighting clankers with clankers now.
• Live AI chat in code review (Claude, Codex, Pi, OpenCode). This is the MVP, with a lot to refine from here.
• Folder-based file viewer for easy doc reference (superpowers, specs, etc.)
• Browse all your previous plans
• Full feature parity for the Pi.dev extension
Plus resizable diff panes and various bug fixes! (11 PRs merged)
Replies 4 · Retweets 4 · Likes 33 · Views 3.1K
Moez AI retweeted
Rui Carmo ☯️ @rcarmo
People of pi, I have begun incorporating pi-autoresearch into github.com/rcarmo/piclaw. @davebcn87 did an amazing job, and it "just works" with the built-in terminal (launched inside the web UI after defining the experiment). Will look at integrating it as a fully graphical UX...
Replies 3 · Retweets 3 · Likes 37 · Views 2.2K
Moez AI retweeted
SUN YOUNG HWANG ᯅ 🇰🇷
Guys... this model is just crazy. If you have just under 48 GB of VRAM, try the Q8 GGUF format. Feels just like Opus! Tool calling is working smoothly!! Much appreciated! (HF and Qwen!!) huggingface.co/Jackrong/Qwen3…
Replies 87 · Retweets 223 · Likes 2.7K · Views 173.5K