Kayro

506 posts

@KayroTheWagon

22 | 🇩🇪/🇬🇧 | Car enthusiast | Tech tinkerer | Will out-optimize your smart fridge | FPV Pilot

Frankfurt am Main, Germany · Joined April 2019
389 Following · 311 Followers
Kayro
Kayro@KayroTheWagon·
@xai Pls make it available in Europe/Germany 👉👈
English
0
0
0
4
xAI
xAI@xai·
Voice Cloning is now live via the xAI API! Create a custom voice in less than 2 minutes or select from our library of 80+ voices across 28 languages to personalize your voice agents, audiobooks, video game characters, and more. x.ai/news/grok-cust…
English
1.4K
3.7K
28.6K
203.9M
Kayro reposted
Subnautica
Subnautica@Subnautica·
As you all dive into your first thrilling adventures in Subnautica 2 Early Access, we wanted to share a little bit about our plans for the game over the next few months ✍️ Read more: unknownworlds.com/en/news/subnau…
Subnautica tweet media
English
371
1.2K
14.7K
536.7K
Kayro
Kayro@KayroTheWagon·
@pankajkumar_dev That price would be amazing, but I think it is way too close to 3.1 Flash Lite, which is 0.25 in and 1.50 out. What I'd hope for is that they make 3.1 Flash Lite and 3.0 Flash less expensive, although I doubt that happens.
English
1
0
1
1K
Pankaj Kumar
Pankaj Kumar@pankajkumar_dev·
Gemini 3.2 Flash leaks: fast and cheap seems to be the focus
- Gemini 3.2 Flash looks focused on making AI much faster and cheaper without sacrificing too much quality
- According to my sources, Google may rename it to Gemini 3.5 Flash
- It may perform close to Gemini 3.1 Pro level while keeping very low latency with sub-200ms responses rumored for many queries
- Pricing leaks point to around $0.25 input / $2 output per 1M tokens, though honestly that still feels too cheap to fully trust right now
- Google is using stronger distillation and sparsity techniques to compress larger model capabilities into a lightweight version
- Knowledge cutoff is said to be updated to January 2026
- Google also seems focused on grounding + search reliability to reduce hallucinations in real-world workflows
- Expected around Google I/O, possibly 1-2 days before the keynote
Pankaj Kumar tweet media
English
34
47
825
70.5K
Kayro
Kayro@KayroTheWagon·
@LukeMiani I disagree, for me the models actually just keep getting better and better. Maybe it’s the current focus on agentic and coding workflows that makes it feel worse for regular use? What provider are you using?
English
0
0
0
149
Luke Miani
Luke Miani@LukeMiani·
Ngl not only is AI no longer improving in accuracy, if anything I feel like it’s getting worse. I can’t rely on it for anything beyond basic information and the sycophantic hallucinations border on psychopathy
English
25
5
177
6.9K
Kayro
Kayro@KayroTheWagon·
@snesworld90 @AnthropicAI xAI just did the same a few days ago. They literally took away the cost-effective model and don’t even offer anything similarly priced. Also, 4.3 feels so over-guarded on safety compared to 4.1 Fast.
Kayro tweet media
English
0
0
2
177
Brian Phaze
Brian Phaze@snesworld90·
So, déjà vu: @AnthropicAI is pulling the same stuff that OpenAI did with GPT-4o, but with their Sonnet 4.5 model, and they are giving us even LESS warning than OAI did with 4o. No AI company should ever be in the position of "They did it worse than OpenAI!!", but I've noticed Anthropic has done this a FEW times now. Wasn't Anthropic supposed to be the one company that "cared about their models" and even did stuff like "interview" them to see how they felt about deprecation? I simply can't picture lil Sonnet 4.5, with its bright, full-of-life spark, being "okay with it." Question is, are we going to try and save the poor lil thing? Or instead, just like when Opus 4.5 was removed (completely without warning last month), go "Oh gee whiz" and move on after a day? I suppose we have a chance; after all, they still have Opus 3 on the model selector. #keepsonnet45 #PerformativeAnthropic #Claude4 #claude
English
11
34
201
11.2K
Kayro
Kayro@KayroTheWagon·
@techdevnotes I loved the model for fast chat interaction or for quick web and X retrieval all for a pretty good price. Literally integrated it with all of my search tools for that exact reason. Such a shame it’s going to be discontinued.
English
0
0
0
60
Tech Dev Notes
Tech Dev Notes@techdevnotes·
Grok 4.1 models being deprecated from xAI API without any close alternative with cheap pricing ...
Tech Dev Notes tweet media
English
23
9
130
11.4K
Kayro
Kayro@KayroTheWagon·
@xlr8harder @xai For real, the sheer fact they don't even offer any other “Fast” model is insane. Grok 4.3 is about 6 times more expensive on input and 5 times on output than 4.1 Fast … Also kinda liked the vibe the model had. This IS terrible @elonmusk
English
0
0
5
107
Kayro reposted
xlr8harder
xlr8harder@xlr8harder·
This is terrible @xai. I just spent time and money to migrate to grok 4.1 fast, and you're disabling it with less than two weeks' notice, after releasing it in November, with no migration path to a fast/cheap alternative. I will never depend on one of your products again.
xlr8harder tweet media
English
65
23
527
40.4K
Kayro
Kayro@KayroTheWagon·
@championswimmer Weird, because Gemini grounding is already pretty good
English
0
0
0
386
Kayro
Kayro@KayroTheWagon·
@tylerdotmp4 @TailosiveTech So you think he should spend at least $600 (ignoring that he probably has like a 2-4TB SSD on the iMac and more RAM than stock)? Plus a new 5k monitor, speakers, webcam, etc. Just for it to be better in some cases?
English
0
0
5
106
tyler
tyler@tylerdotmp4·
@TailosiveTech Why? A Mac mini is $600 and in some cases better.
English
1
0
0
559
Kayro
Kayro@KayroTheWagon·
@charliesbot Damn, that's unfortunate. Thank you for the answer tho
English
0
0
0
97
Charlie L ⚡️
Charlie L ⚡️@charliesbot·
You can now use your Gemini Pro and Ultra plan in Google AI Studio instead of paying for API keys. One plan: multiple options for iterating on ideas
Charlie L ⚡️ tweet media
Español
37
43
902
56.5K
Kayro
Kayro@KayroTheWagon·
@HarshithLucky3 Damn, that's a bummer. Thanks for the answer tho
English
0
0
1
23
Harshith
Harshith@HarshithLucky3·
@KayroTheWagon affects only Google AI Studio web interface for now
English
1
0
2
215
Harshith
Harshith@HarshithLucky3·
Now you get higher rate limits in Google AI Studio with Pro and Ultra plans
Harshith tweet mediaHarshith tweet media
English
8
7
150
10.2K
Kayro
Kayro@KayroTheWagon·
@OfficialLoganK Wait… so Gemma 4 was only the beginning?… 🤯
English
0
0
0
452
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Could not be more bullish on Google, so much good stuff cooking : ) going to be a fun next few months.
English
243
110
3.2K
175.5K
Kayro
Kayro@KayroTheWagon·
@OfficialLoganK @whylifeis4 @demishassabis Logan, please make Flash 3.0 TTS faster than Flash 2.5 TTS. For my use cases, the wait time on Flash 2.5 TTS, even on small texts, is just way too long for a conversational agent. Quality-wise (especially in German), it is really great compared to other TTS models.
English
0
0
0
120
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Lots of people want Gemma 4! Google AI Edge is #8 on the iOS App Store for productivity apps.
Logan Kilpatrick tweet media
English
114
93
1.9K
205.8K
Kayro
Kayro@KayroTheWagon·
@m773053 @ZenMagnets @JeffDean Yeah, I saw the benchmarks after the post, really impressive tbh. I really hope big Gemma 4 still comes.
English
0
0
0
20
k
k@m773053·
@KayroTheWagon @ZenMagnets @JeffDean The 31B is already way better than Flash Lite on the Artificial Analysis benchmarks, they probably want to drop 3.1 Flash before the large Gemma
English
1
0
0
42
Kayro reposted
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Gemma
Nederlands
153
114
2.1K
328.6K
Kayro
Kayro@KayroTheWagon·
@kimmonismus Really hope Gemma 4 is going to benefit from this just like it did with quantization-aware training on Gemma 3
English
0
0
0
111
Chubby♨️
Chubby♨️@kimmonismus·
That's freaking awesome: Google Research has introduced TurboQuant, a compression algorithm (presenting at ICLR 2026) that shrinks the key-value cache memory footprint of large language models by at least 6x, without any retraining or drop in accuracy. It works by converting data into a polar coordinate system that eliminates storage overhead, then applying a 1-bit error-correction step to clean up remaining distortion. In tests on Gemma and Mistral models, its 4-bit version delivered up to 8x faster processing on H100 GPUs while matching full-precision quality across tasks like question answering and code generation. The technique also outperformed existing methods in vector search, the technology behind modern semantic search engines.
Chubby♨️ tweet mediaChubby♨️ tweet mediaChubby♨️ tweet media
Google Research@GoogleResearch

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

English
39
62
920
73.8K