


xTUBOL ./









[block-letter emoji art] ./ training mode on… @Gradient_HQ

fairytale is imaginary. intelligence is forever. ./ asyncing echo rl… @Gradient_HQ


VeriLLM - Bringing Integrity and Verification to Distributed Intelligence.

for less than 1% of the inference cost, you can verify whether the output is truly what you requested. engineering distributed inference with fully verifiable transparency.

current solutions fall short:
- cross-checking outputs introduces redundancy, multiplying cost with every extra comparison.
- zkps carry heavy computational complexity, introducing latency that makes them impractical for on-demand inference.

both significantly impact scalability and financial cost.

@Gradient_HQ addresses model swapping, output tampering and high cost with the introduction of VeriLLM. both inference & verification are served from the same worker pool, reducing cost and maximizing utilization.

here are the evaluations of VeriLLM serving inference on heterogeneous machines.

table 3 compares the output of the Qwen2.5-7B-Instruct model running on a Mac M4 vs an RTX 5090. this establishes how much "natural" numerical variation exists between honest machines: the mean delta is near zero (ranging from -0.003 to 0.009) and small differences dominate (most deltas < 0.2).

table 4 compares a compressed model (AWQ-quantized) running on an RTX 5090 vs the standard model running on a Mac M4. this tests whether the verification protocol still works when the worker uses a faster, lower-precision version of the model: exact matches are near zero, large deltas (> 0.2 and even > 5) dominate and scale with sequence length, and the mean is consistently non-zero (up to 0.021) with alternating signs.

table 4 is exactly the dishonest work VeriLLM aims to catch: a worker quietly quantizing, swapping or substituting the model to rig the output. VeriLLM is able to tell honest full-precision runs apart from quantized ones, even across different machines.
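a minimal sketch of that delta-statistics idea in Python. this is not Gradient's actual VeriLLM protocol: it assumes the verifier can re-run the same prompt and collect per-token logprobs from both sides, and the threshold names and values (mean_tol, large_delta, large_frac_tol) are illustrative, not calibrated ones.

# sketch only: assumes per-token logprobs from the worker's run and a
# verifier re-run; thresholds are illustrative, not VeriLLM's real values.
from statistics import mean

def looks_honest(worker_logprobs, verifier_logprobs,
                 mean_tol=0.01,       # honest cross-device mean delta sits near zero
                 large_delta=0.2,     # honest runs rarely exceed this per-token gap
                 large_frac_tol=0.05):
    """Flag workers whose outputs drift beyond honest cross-hardware
    variation, e.g. a silently quantized or swapped model."""
    deltas = [w - v for w, v in zip(worker_logprobs, verifier_logprobs)]
    frac_large = sum(abs(d) > large_delta for d in deltas) / len(deltas)
    return abs(mean(deltas)) <= mean_tol and frac_large <= large_frac_tol

on the numbers above, an honest M4-vs-5090 run (mean delta within -0.003 to 0.009, few deltas over 0.2) would pass, while the AWQ-quantized run (mean up to 0.021, large deltas dominating) would get flagged.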



A previous showcase of an Echo-trained 30B model on Sokoban, leading performance against much larger models like DeepSeek R1 and GPT-OSS-120B ./ Echo by @Gradient_HQ scales reinforcement learning across consumer machines, drastically reducing the cost of building better intelligence









Gradient Cloud, the new go-to powerhouse for developing with AI, fully powered by the Gradient Distributed AI Stack. Intelligence should be fast, accessible & collectively owned. Operate leading models at production speed for a fraction of the cost.