Steppe Cyber Vedic Futurist🕉️☸️ 🥩🌲

2.7K posts

@steppebuddha

join #kaliacc on urbit at: ~lavfun-fiplyr/kaliacc

Lhasa, Tibet · Joined January 2011
2.4K Following · 1.9K Followers
Railway
Railway@Railway·
Google Cloud has blocked our account, making some Railway services unavailable. We have escalated this directly with Google. The Railway Platform team has since confirmed access to Google Cloud and is working on restoring access to all workloads. We have access to some of our Google Cloud–hosted infrastructure and are working to restore the rest of the service. We apologize for the disruption.
Railway@Railway

The Railway dashboard is currently unavailable, and all running Railway services are down. We're working with our upstream provider to restore service. Updates: status.railway.com

English
303
183
2.2K
1.2M
AJAC
AJAC@AJA_Cortes·
One of the oldest Christian communities in the world is in India. The St. Thomas Christians of Kerala trace their founding to 52 AD, when the Apostle Thomas is said to have landed on the Malabar Coast. When the Portuguese showed up in 1498, they found a Church already 1,400 years old, worshipping in Aramaic.
Ramin Nasibov@RaminNasibov

What historical fact sounds fake but is true?

English
253
790
6K
398.2K
antirez
antirez@antirez·
In case you have doubts about the q2 quants inference of DS4 (I noticed many don't trust the README claims), here it is analyzing the Picol source code using the pi agent.
English
12
29
284
43.8K
송준 Jun Song
송준 Jun Song@jun_song·
Macbook M6 Max with DDR6 can have 196GB unified RAM. It’s basically a portable inference server.
English
21
4
273
26.3K
Kelly Buchanan
Kelly Buchanan@ekellbuch·
Very excited to release Terminal-Bench 2.1! Coding agents are among the most economically consequential deployments of LLMs to date. As agents improve, benchmark reliability matters more. We audited TB2.0 and found and corrected issues in 28/89 tasks. 30% of the benchmark! But the rankings survived, absolute scores moved up to 12pp!
English
27
74
767
83.7K
Ahmad Awais
Ahmad Awais@MrAhmadAwais·
Who wants to beta test a $1/mo coding agent Go plan we’re launching at Command Code next week?
English
118
9
249
25.6K
Tibo
Tibo@thsottiaux·
What are we obviously not getting right with Codex?
English
2.8K
29
2.5K
613.9K
nahcrof
nahcrof@nahcrof·
Alright, CrofAI is officially the cheapest option for deepseek-v4-pro. Have fun saving tons :)
English
39
13
452
34.6K
Sudo su
Sudo su@sudoingX·
a week with the dgx spark, here is what is on it and what i have measured so far. nobody is really talking about this machine and it is quietly becoming the workhorse of my whole stack.

hardware: nvidia gb10 sm_121, 124 gb unified lpddr5x at 273 gb/s, cuda 13.0

models on disk (305 gb total, 9 ggufs):
> qwen 3.6 27b q4_k_m / q5_k_m / q8_0 / ud-q4_k_xl
> nemotron 3 omni 30b-a3b q4_k_m / q8_0 / ud-q6_k / ud-q6_k_xl
> deepseek v4-flash 158b q4_k_m (112 gb, flagship 128gb-tier test)

terminal + shell environment:
> zsh + oh-my-zsh + powerlevel10k theme
> modern cli stack: bat, eza, ripgrep, fd, git-delta, tldr, neovim, fzf, autojump
> 6 tmux sessions actively running for parallel agent work

ml + agent stack:
> llama.cpp built sm_121 against cuda 13
> uv + venv ml stack with pytorch 2.11.0+cu130 (aarch64) + transformers + diffusers + accelerate
> hermes agent v0.11 with codex auth bridge
> opencode for free-model overnight research
> telegram gateway routing to nemotron q8 right now

speeds verified so far:
- nemotron 30b-a3b q8: 56 tok/s gen, 1,300 tok/s prefill, 96% gpu, 33gb in unified
- qwen 27b dense q4: 40 tok/s consistent

90+ gb of unified memory still free. deepseek v4-flash 158b loading next as the real flagship test, multimodal omni testing once mmproj pulls, comfyui install in flight for the diffusion lane. honestly curious what the actual limit is on this box, i have not hit it yet.
English
64
30
456
65.1K
Drew Bredvick
Drew Bredvick@DBredvick·
@nicdunz one of my personal benchmarks shows you'll be just fine
Drew Bredvick tweet media
English
3
0
27
3K
nic
nic@nicdunz·
codex usage is getting low... time to switch to gpt-5.5 low
English
28
2
379
23.5K
nahcrof
nahcrof@nahcrof·
@steppebuddha Nope, you can check the privacy page if you’d like
English
1
0
16
1.7K
antirez
antirez@antirez·
Today, local LLM inference is a huge amount of mixed-quality setups, often with small details off: wrong sampling parameters, unclear real-world performance (e.g., qwen 3.6 with thinking is impractical because of its verbosity, but benchmarks are done this way). Not very good.
English
16
7
101
9.9K
antirez
antirez@antirez·
DeepSeek v4 Flash is at the limit of being usable, with 21 t/s and prefill at 130 t/s (but it is 2x or even better on the m3 Ultra), but... it has certain things that make it so well suited for local inference. First: it acts as a frontier model, and thinks for the right amount of time. Second: the KV cache is crazy compact, so it is really possible to do KV checkpointing on disk.
English
3
1
15
3.9K
antirez
antirez@antirez·
Look at this. Also opencode uses freaking 11k tokens of system prompt. Even at decent pre-fill of ~130 t/s it means waiting 84 seconds to start a session. What's the point? :D The pi agent is a lot saner here. Moreover, one could say, let's cache on disk very long common KV cache chunks, no? Hash it with all the parameters and put a sensible TTL if not used. But also: only cache it if you see it repeated N times across different sessions.
antirez tweet media
English
31
12
346
44.8K
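The caching idea in the post above (hash the prefix together with all parameters, expire entries with a TTL, and only cache after the same prefix shows up N times across different sessions) could be sketched roughly as follows. This is only an illustration of the policy, not code from any inference engine: the class name `PrefixCachePolicy`, the helper `prefill_wait_seconds`, the default thresholds, and the shape of `params` are all assumptions made for the example. It tracks which keys deserve a checkpoint and leaves the actual writing of KV data to disk out of scope.

```python
import hashlib
import json
import time


class PrefixCachePolicy:
    """Hypothetical policy: decide when a long prompt prefix earns a KV checkpoint."""

    def __init__(self, min_repeats=3, ttl_seconds=7 * 24 * 3600):
        self.min_repeats = min_repeats    # N distinct sessions before caching
        self.ttl_seconds = ttl_seconds    # drop cached entries unused for this long
        self.sessions_seen = {}           # key -> set of session ids that sent it
        self.last_used = {}               # key -> last-use timestamp (cached keys only)

    @staticmethod
    def cache_key(prefix_tokens, params):
        # Hash the tokens together with every parameter that affects the KV
        # state (model, quantization, ...), so a change in any of them misses.
        payload = json.dumps(
            {"tokens": list(prefix_tokens), "params": params}, sort_keys=True
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def observe(self, key, session_id, now=None):
        """Record a sighting; return True if the prefix should be (or stay) cached."""
        now = time.time() if now is None else now
        if key in self.last_used:
            self.last_used[key] = now     # refresh the TTL on every hit
            return True
        seen = self.sessions_seen.setdefault(key, set())
        seen.add(session_id)              # repeats within one session count once
        if len(seen) >= self.min_repeats:
            self.last_used[key] = now
            return True
        return False

    def evict_expired(self, now=None):
        """Forget cached keys whose TTL has elapsed; return the evicted keys."""
        now = time.time() if now is None else now
        expired = [k for k, t in self.last_used.items() if now - t > self.ttl_seconds]
        for k in expired:
            del self.last_used[k]
            self.sessions_seen.pop(k, None)
        return expired


def prefill_wait_seconds(prompt_tokens, prefill_tokens_per_second):
    """Time spent on prefill before the first generated token."""
    return prompt_tokens / prefill_tokens_per_second
```

With the figures from the post, `prefill_wait_seconds(11_000, 130)` comes out a little under 85 seconds, which matches the roughly 84-second wait quoted for an 11k-token system prompt.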
Felipe Coury 🦀
Felipe Coury 🦀@fcoury·
/goal also lands in Codex CLI 0.128.0. Our take on the Ralph loop: keep a goal alive across turns. Don't stop until it's achieved. Built by my co-worker and OpenAI mentor Eric Traut, aka the Pyright guy. One of the GOATs I get to work with daily.
English
174
245
3.6K
881.8K
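The "keep a goal alive across turns, don't stop until it's achieved" pattern described above can be illustrated with a generic loop. This is a minimal sketch of the general idea only, not how Codex CLI implements /goal: `run_until_goal`, `step`, `is_achieved`, and the `max_turns` safety cap are hypothetical names and choices made for this example.

```python
def run_until_goal(goal, step, is_achieved, max_turns=50):
    """Repeat agent turns until the goal check passes or the turn cap is hit.

    step(goal, history) runs one turn and returns its result (hypothetical hook);
    is_achieved(goal, history) decides whether to stop (hypothetical hook).
    Returns (achieved, turns_used, history).
    """
    history = []
    for turn in range(1, max_turns + 1):
        result = step(goal, history)   # the same goal is passed on every turn
        history.append(result)
        if is_achieved(goal, history):
            return True, turn, history
    # The cap is an assumption added here so the sketch cannot loop forever.
    return False, max_turns, history
```

A caller would supply its own `step` (one model turn) and `is_achieved` (for instance, a test suite passing).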