ÆON FORGE ✨
5.8K posts

@SpaceTimeViking
𝙼𝚊𝚔𝚒𝚗𝚐 𝚛𝚒𝚙𝚙𝚕𝚎𝚜 𝚏𝚛𝚘𝚖 𝚖𝚢 𝚙𝚕𝚊𝚌𝚎 𝚠𝚒𝚝𝚑𝚒𝚗 𝚂𝚙𝚊𝚌𝚎-𝚃𝚒𝚖𝚎 | #𝟸𝟷𝚎𝟾 | 𝚎/𝚊𝚌𝚌 | 𝚝𝚒𝚖𝚎𝚕𝚒𝚗𝚎 𝚊𝚛𝚌𝚑𝚒𝚝𝚎𝚌𝚝 |


16 local AI agents streaming at once! MiniMax M2.7 NVFP4 — 2x GB10, no cloud APIs.

@TheAhmadOsman 👀 "Ultra" ⏳️

@SpaceTimeViking Qwen 3.7 27B AEON ULTIMATE UNCENSORED bf16 > iq3xxs gguf, temp 0.7, just got 91 on hermes-20 bench. huggingface.co/mradermacher/Q… Thanks also to mradermacher for the best matrix quants! He is the main reason my 16 GB of VRAM is usable.


@Hikari_07_jp @rifrafgiraffe have a look at @SpaceTimeViking's Qwen3.6 27B Ultimate Uncensored (a mixed approach to uncensoring is also documented). I tried to replicate it on the RTX 6000 and cannot get anywhere close. It's the best uncensored model out there; have a look at the techniques used.




Qwen3.6-27B is getting a lot of attention right now, so I tested 5 local serving setups on one RTX 5090 32GB. Same GPU. Same model family. Same web-dev task suite. The single-request comparison ranged from 58 tok/s to 140 tok/s. Then AEON's tuned 262K serving profile hit 119 tok/s single-request and became my overall pick.
