Pinned Tweet
Andrei 𒅴𒂠
982 posts


using sglang
python -m sglang.launch_server \
  --host 0.0.0.0 \
  --port 8000 \
  --model-path deepseek-ai/DeepSeek-V4-Flash \
  --tp-size 4 \
  --dp-size 2 \
  --config /root/config.yaml \
  --enable-mixed-chunk
yaml config (config_deepseek_v4.yaml):
host: 0.0.0.0
log-level: info
tool-call-parser: deepseekv4
reasoning-parser: deepseek-v4
trust-remote-code: true
# Memory
mem-fraction-static: 0.92
context-length: 1048576 # 1M
chunked-prefill-size: 8192
# MoE
moe-runner-backend: flashinfer_mxfp4
# Observability
enable-metrics: true
enable-cache-report: true
# Batching — 8x B200
max-running-requests: 64
cuda-graph-max-bs: 128
# EAGLE 3/1/4 speculative decoding
speculative-algorithm: EAGLE
speculative-num-steps: 3
speculative-eagle-topk: 1
speculative-num-draft-tokens: 4
# Scheduling
schedule-policy: lpm
num-continuous-decode-steps: 4
# Kernel fusions
enable-mixed-chunk: true
enable-fused-qk-norm-rope: true
disable-flashinfer-autotune: true
env vars (in modal container):
SGLANG_ENABLE_SPEC_V2=1
SGLANG_ENABLE_THINKING=1
SGLANG_JIT_DEEPGEMM_PRECOMPILE=0
HF_HOME=/cache/huggingface
HF_XET_HIGH_PERFORMANCE=1
HF_HUB_ENABLE_HF_TRANSFER=1
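
a quick arithmetic check on the numbers above, as a minimal sketch rather than anything sglang ships. The helper names (gpus_needed, chain_draft_tokens) are hypothetical, and the "steps + 1" rule is an assumption about how a top-k = 1 EAGLE draft chain is counted (the drafted tokens plus the token they extend), not something stated in the tweet:

```python
# Values copied from the launch command and YAML config in the tweet.
TP_SIZE = 4            # --tp-size
DP_SIZE = 2            # --dp-size
NUM_STEPS = 3          # speculative-num-steps
EAGLE_TOPK = 1         # speculative-eagle-topk
NUM_DRAFT_TOKENS = 4   # speculative-num-draft-tokens


def gpus_needed(tp_size: int, dp_size: int) -> int:
    """Each data-parallel replica spans tp_size GPUs."""
    return tp_size * dp_size


def chain_draft_tokens(num_steps: int) -> int:
    """Assumption: with top-k = 1 the draft is a single chain, so the
    verified sequence holds num_steps drafted tokens plus the token
    they extend."""
    return num_steps + 1


if __name__ == "__main__":
    # 4 x 2 = 8, which lines up with the "8x B200" comment in the config.
    print("GPUs:", gpus_needed(TP_SIZE, DP_SIZE))
    if EAGLE_TOPK == 1:
        # 3 + 1 = 4, which lines up with the "EAGLE 3/1/4" comment.
        print("draft tokens match:",
              chain_draft_tokens(NUM_STEPS) == NUM_DRAFT_TOKENS)
```

if either number is changed (say tp-size 8 on the same box), the same two checks show quickly whether the config still fits the hardware and the speculative settings still agree.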

3.1 pro is utterly horrible for any agentic use case and everyone knows it
News from Google @NewsFromGoogle
New stat from @vercel's AI Gateway in @BusinessInsider: Gemini 3 Flash is leading across AI models in token usage as of April. 🚀 See more stats on how developers are using our models → goo.gle/4dlBiol 📊via @BusinessInsider

Andrei 𒅴𒂠 retweeted

I made a game where Sam Altman calls your pitch "pre-seed energy" and Peter Thiel offers you a term sheet
AI rewrites the whole story every play, pulled from real SF news and stories
built in a couple of days for #ElevenHacks @elevenlabs and @zeddotdev hack #6

🇷🇴 Romania enters the big league of AI.
We’ve launched the EOI to select the consortium leader for the Black Sea AI Gigafactory, starting with 20,000 GPUs, scalable to 100,000+.
We’re seeking a strong, experienced partner.
Apply by June 14, 2026, here: bit.ly/4cug5IH
Andrei 𒅴𒂠 retweeted

I really liked wispr flow but didn’t feel like paying $12/month for it, so i built a local-only open source version
it’s a native macos dictation app that runs whisper fully on-device and types directly into any app
if you use dictation, this is a simple alternative that just works, fully local
you can download the dmg and use it right away

@octavicristea i like the colour in the last one, but it's hard to read; i'd try a mesh gradient over the image on the text side
Andrei 𒅴𒂠 retweeted

why does the most popular open-source project in GitHub history require a CS degree to set up?
that's the question I couldn't stop thinking about. so I built the answer.
Clapp AI is OpenClaw for your phone. download, sign in, agent ready. no terminal, no Docker, no Mac Mini.
tell it "send me a morning email digest at 8am." tell it "post to my socials at noon." it does it. Gmail, Slack, GitHub, Calendar, Notion, Instagram, 10,000+ integrations, not demos.
iOS now. Android coming soon.






