DisguisedScholar
@docczeus · 1.4K posts
Founder, former | @iiit, @NTU, and @USouthFlorida | Researcher | Accessibility Warrior. Moot thought to be neutral since my wife is Liberal 🫣.

Codex for (almost) everything. It can now use apps on your Mac, connect to more of your tools, create images, learn from previous actions, remember how you like to work, and take on ongoing and repeatable tasks.

Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.




Only MoEs should be used on DGX Sparks. Unified memory is bandwidth-constrained, and MoEs help a lot because only a small subset of parameters is processed per token. In practice, MoEs are the difference between triple-digit tokens/sec under concurrent load and single-digit tokens/sec.
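The bandwidth claim can be sanity-checked with back-of-envelope arithmetic: at decode time, each generated token must stream every touched parameter from memory, so throughput is roughly bandwidth divided by active bytes per token. A minimal sketch; the numbers (≈273 GB/s unified memory bandwidth, 8-bit weights, 120B total vs 5B active parameters) are illustrative assumptions, not measured figures:

```python
def tokens_per_sec(bandwidth_gb_s, params_read, bytes_per_param=1):
    """Memory-bandwidth ceiling on decode throughput: every decoded
    token streams the touched weights from memory once."""
    bytes_per_token = params_read * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Dense model: all 120B params are read for every token.
dense = tokens_per_sec(273, 120e9)
# MoE: only ~5B active params are read per token.
moe = tokens_per_sec(273, 5e9)

print(f"dense ceiling: {dense:.1f} tok/s")  # ~2 tok/s
print(f"MoE ceiling:   {moe:.1f} tok/s")    # ~55 tok/s
```

The ratio of ceilings is just total-to-active parameter ratio, which is why MoEs are the only practical way to get high aggregate tokens/sec out of a bandwidth-limited unified-memory box.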

💾🚀 Run Llama-3.1-405B FP8 (410GB) on a single 180GB GPU #NVIDIA

Introducing FlexTensor — NVIDIA's new library that makes host RAM a transparent extension of your GPU memory. One call: flextensor.offload(model). No model rewrites, no framework changes. Works with vLLM, HuggingFace, and any PyTorch model.

Traditional offloading is reactive — move data when you run out of memory, stall the GPU while you wait. FlexTensor instead profiles your model's layer access patterns, then solves a knapsack optimization to schedule prefetches that overlap with compute. By the time a layer needs its weights, they're already there.

The freed VRAM gives vLLM more room for KV cache — enabling 4x longer contexts (8K→32K) or 4x larger batches. For video generation (Wan2.2-T2V-A14B on GB200): +0.1% overhead.

Handles FP8, custom Triton kernels, and multi-GPU. Profiles saved to disk — no warmup on repeated runs.

Check it out: github.com/ai-dynamo/flex…
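The "knapsack optimization" step above can be sketched in miniature: given profiled layer sizes and per-layer transfer costs, a 0/1 knapsack picks which layers to keep resident in VRAM so the most expensive transfers are avoided, and everything else gets scheduled for prefetch. This is a toy illustration of the idea only; the function name, layer names, and numbers are hypothetical, not FlexTensor's actual API:

```python
def pin_layers(layers, vram_budget_gb):
    """Toy 0/1 knapsack over profiled layers.

    layers: list of (name, size_gb, transfer_cost) tuples, where
    transfer_cost is the profiled cost of streaming that layer from
    host RAM. Maximizes total transfer cost avoided within the VRAM
    budget, using a DP at 0.1 GB granularity.
    """
    scale = 10
    cap = int(vram_budget_gb * scale)
    # best[c] = (saved_cost, names_pinned) using at most c/scale GB
    best = [(0.0, ())] * (cap + 1)
    for name, size_gb, cost in layers:
        w = int(size_gb * scale)
        for c in range(cap, w - 1, -1):  # reverse order: each layer used once
            cand = (best[c - w][0] + cost, best[c - w][1] + (name,))
            if cand[0] > best[c][0]:
                best[c] = cand
    return best[cap]

# Hypothetical profile: (layer, weight size in GB, transfer cost)
profile = [
    ("embed", 2.0, 5.0),
    ("block0", 6.0, 9.0),
    ("block1", 6.0, 9.0),
    ("lm_head", 2.0, 6.0),
]
saved, pinned = pin_layers(profile, vram_budget_gb=10.0)
print(pinned, saved)  # pins embed, one block, and lm_head; saves 20.0
```

Layers left unpinned would then be prefetched on a side stream timed to land just before their forward pass, which is how the stall-free overlap the post describes falls out of the schedule.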




As much as I love using Claude Max and ChatGPT Pro, I don't think these all-you-can-use AI subscriptions will last forever.

Here's my new deep dive that covers:
→ Why Anthropic cut off OpenClaw access
→ How to run local models on your Mac
→ What I'm seeing on the ground in China

📌 Read now: creatoreconomy.so/p/the-all-you-…

Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.

We do not plan to make Mythos Preview generally available. Our goal is to deploy Mythos-class models safely at scale, but first we need safeguards that reliably block their most dangerous outputs. We’ll begin testing those safeguards with an upcoming Claude Opus model.




Holy smokes... someone just turned Claude Code into a full job-search system. It evaluated 740+ job offers, generated 100+ tailored CVs, and actually landed him a Head of Applied AI role. And it's 100% OPEN SOURCE.





🦞 Ollama's cloud is one of the best places to run OpenClaw. The $20 plan is enough for most day-to-day OpenClaw usage with open models!

To make the switch, all you need is to open the terminal and type:

ollama launch openclaw

Then choose a model:
kimi-k2.5:cloud
glm-5:cloud
minimax-m2.7:cloud

If you are affected, Ollama welcomes you!! ❤️








