RayBytes

21 posts

@raybytez

Joined June 2023
13 Following · 0 Followers
RayBytes@raybytez·
@thegenioo Provider “Weights and Biases”. I don’t think it’s at 500 tk/s always since I only got one result like that, but it’s plenty fast. 15 seconds for a fully functioning app.
Hamza@thegenioo·
@raybytez bro which provider on OpenRouter is giving you 500t/s lol
Hamza@thegenioo·
Honestly I think Moonshot fumbled big time with Kimi K2.6. Their previous model K2.5 was just so good, and it just needed that perfect polish and a few bits of upgrades to get a really strong, cheap model. And you know what? That exists! Yes, it is Composer 2.5. This is what Moonshot should have done: K2.5 should have been polished like Composer 2.5 and released as K2.6.

Now don't get me wrong, K2.6 is a very powerful and strong model, but it has some issues:
- It just overthinks the hell out of things and gets stuck in long, endless thinking loops
- It is unbelievably slow, like really slow
- It is good, but I found DeepSeek and Qwen models more efficient, workable, and faster

So what Cursor has pulled off here with Composer 2.5 should have been done by Moonshot with the release of Kimi K2.6, and I hope they fix these issues with K3.
RayBytes@raybytez·
@gtbot2007 @Pikaclicks @OsuKanade @ibxtoycat He filed the trademark too late, once he realised there were material gains to be made. If the trademark went through, skyblock as we know it could be incredibly different. With it not being trademarked, anyone can use skyblock as they want to. The alternative makes no sense.
Andrew (Toycat)@ibxtoycat·
An update on the "skyblock" legal case: it is now legally a generic term within Minecraft, and not one that can be trademarked.
RayBytes@raybytez·
@christofsalis @theo I mean, their #1 example, and something they advertised massively, was using thousands of agents with Gemini 3.5 Flash to make an OS, with a full new agentic system push with v2 of Antigravity.
Christof Salis@christofsalis·
@theo This is an agentic benchmark, I am guessing, right? I noticed that Google really doesn't seem to be too interested in agents (just my hearsay). Very knowledgeable models, but not interested in chaining tool calls.
RayBytes@raybytez·
@scaling01 It’s more expensive than 3.1 Pro in real terms. Check Artificial Analysis’s cost to run the benchmark. It cost 74% more than 3.1 Pro, while losing to 5.5 Medium, which is cheaper than it ($1199 for 5.5 Medium vs $1552 for 3.5 Flash). 5.5 Medium is also smarter (57 score vs 55).
salt@saltjsx·
@zxnelli t3 chat isn't really for normies
salt@saltjsx·
ChatGPT sucks! So I'm building something way better.
RayBytes@raybytez·
@Getlucky12341 @Kirito1262 It was about money, not about claiming anything. He filed his trademark claim too late, once everyone in the community was already branching off the idea.
Getlucky@Getlucky12341·
@Kirito1262 Not wanting other people to claim they have the "og skyblock" is pretty fair tbh
Getlucky@Getlucky12341·
The original Skyblock map in Minecraft is lost media?
RayBytes@raybytez·
@rbranson @gajesh GPT-5.5 is currently at 38 tps… And Anthropic is running Opus 4.7 at 39 tps. It’s perfectly usable tbh
StoicYield@StoicYield·
@GrantSlatton If it doesn't cost you anything, why did you have to charge $10 besides greed?
Grant Slatton@GrantSlatton·
What do the "no such thing as ethical billionaires" people say about an individual who codes a great app that sells 100 million copies for $10 each? Who was exploited? Zero-marginal-cost goods are weird.
Mihir@mihirmodi·
@AppleBytesPhD @MaxRovensky @Mrwhosetheboss No, they said that for a user upgrading from M1, the choice would be between an M4 and M5 and (apparently, as per them) this info is not available.
Arun Maini@Mrwhosetheboss·
Tech companies are basically lying to you:
- They compare their new products to ones released MANY years ago (to make it as confusing as possible to figure out how much has actually changed this time)
- They invent new specs that mean absolutely nothing (like how much zoom their phone cameras can do)
- They write “up to” just before telling you how much their new product has improved by (so you can’t sue them when you don’t get those numbers)

And a LOT more.

So, I decided it was time to break it down with @MKBHD on YouTube - video live now
RayBytes@raybytez·
@Samarmendiratta @GregoryMcFadden @borntoFLY11135 @ajisharul @Mrwhosetheboss His video, and what he shows in the title card, is the “AI performance” part. Here they compare directly, for example on “time to first token” (TTFT), which is a legitimate way to benchmark performance. And you can see how they got said perf, and with what model, in a footnote at the bottom.
RayBytes@raybytez·
@nahcrof It seems pretty slow? Do you have plans for a lightning version?
nahcrof@nahcrof·
kimi-k2.6 is now available on CrofAI!
$0.55/m input
$0.11/m cached
$2.70/m output
Let me know if you have any issues :)
(The model is currently set as precision since this was the community choice, but if the cost is low enough or I can get it low enough I'll change it.)
Ankit@anKit0017_·
@Kimi_Moonshot Input prices are still high, is K2.6 really worth the cost, or is this just another incremental update?
Kimi.ai@Kimi_Moonshot·
📢 Kimi K2.6 API is live
• Input Price (Cache Hit): $0.16 / M tokens
• Input Price (Cache Miss): $0.95 / M tokens
• Output: $4.00 / M tokens

Kimi K2.6 is our latest + most intelligent model - stronger long-horizon coding, better instruction following & self-correction. Native multimodal (text/image/video), thinking + non-thinking, 256K context. Supports tool calls, JSON / Partial mode, and web search.

Try it → platform.kimi.ai
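A rough sketch of how the listed prices translate into a per-request cost. Only the three per-million-token prices come from the post above; the helper name `kimi_k26_cost` and the token counts in the usage line are hypothetical and purely illustrative.

```python
# Prices in USD per million tokens, as listed in the post.
PRICE_INPUT_CACHE_HIT = 0.16
PRICE_INPUT_CACHE_MISS = 0.95
PRICE_OUTPUT = 4.00


def kimi_k26_cost(cached_in: int, uncached_in: int, out: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = (
        cached_in * PRICE_INPUT_CACHE_HIT
        + uncached_in * PRICE_INPUT_CACHE_MISS
        + out * PRICE_OUTPUT
    )
    return total / 1_000_000


# Illustrative token counts, not measured data.
print(f"${kimi_k26_cost(100_000, 20_000, 5_000):.4f}")
```

With these made-up counts, output tokens dominate the bill even though they are the smallest share of the traffic, which is why the output price matters most for long generations.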
RayBytes reposted
Kimi.ai@Kimi_Moonshot·
Meet Kimi K2.6: Advancing Open-Source Coding
🔹Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)

What's new:
🔹Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops.
🔹Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.

- K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
- 🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…
RayBytes@raybytez·
@bstnxbt Hey, I have not been able to replicate your findings on my M1 Pro 16 GB. I opened an issue on your repo where other people also reported the same problem, but you haven’t responded yet. Mind taking a look?
bstn 👁️@bstnxbt·
Just tested Qwen3.6-35B-A3B-4bit on DFlash. M5 Max, 40-core GPU, stock mlx_lm baseline:
► @ 1024 · 138.98 → 232.64 tok/s (1.67x) · 88.4% acceptance
► @ 2048 · 136.74 → 224.88 tok/s (1.64x) · 88.4% acceptance
► @ 4096 · 128.39 → 170.95 tok/s (1.33x) · 86.3% acceptance

Working on custom Metal kernels to improve long-context decode and optimize the quantized model path.
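The speedup figures in the post can be re-derived from the quoted throughputs. This is only a small arithmetic check on the numbers as posted, not a reproduction of the benchmark; the `speedup` helper is a hypothetical name introduced here.

```python
# (context length, baseline tok/s, DFlash tok/s, speedup reported in the post)
results = [
    (1024, 138.98, 232.64, 1.67),
    (2048, 136.74, 224.88, 1.64),
    (4096, 128.39, 170.95, 1.33),
]


def speedup(baseline: float, accelerated: float) -> float:
    """Ratio of accelerated to baseline throughput (hypothetical helper)."""
    return accelerated / baseline


for ctx, base, fast, reported in results:
    computed = speedup(base, fast)
    print(f"@ {ctx}: {computed:.2f}x (reported {reported:.2f}x)")
```

The computed ratios agree with the reported ones to two decimal places, and they show the gain shrinking as the context grows from 1024 to 4096 tokens.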
RayBytes@raybytez·
@techdroider How is making everything CPU-bound a fair setup, especially for an LLM benchmark?
TechDroider@techdroider·
Ran a fresh AI benchmark using Gemma 4 with Temperature set to 0 and Accelerator forced to CPU for a fair comparison, especially since Pixel doesn’t support GPU acceleration in this setup. Devices tested: iPhone 17 Pro Max, Samsung S26 Ultra, Pixel 10 Pro XL, OnePlus 15, Xiaomi 17 Pro Max.
RayBytes@raybytez·
@theo I think you’ve just discovered Twitter, man. It's not really representative of the majority, just a vocal minority.
RayBytes@raybytez·
@ZoellaBolkiah @theo Okay, sure, but then that’s $30 per million input tokens, and $150 per million output. Could’ve even used a better example, Gemini 3 Flash for instance. Either way, the fastest model on OpenRouter is currently OpenAI’s OSS model, so idk if throughput is the main issue.
Zoe@zosyrai·
@theo their LLMs output 60 t/s so they have more time to complain and wait until they have to reprompt it again 50 times
RayBytes@raybytez·
@jimmyjames_tech Is the Metal score comparable to NVIDIA’s OpenCL score on Geekbench? Do they normalise the score somehow?