cheaty

8.6K posts

@cheatyyyy

23, sde @ stealth startup, performative nerd

hyderabad · Joined February 2022
706 Following · 1.4K Followers
cheaty@cheatyyyy·
@dejavucoder i spent 30% of my claude weekly quota last week just burning through designs, it is very good at its job. i tried to distill what i thought was good design until i tried @LexnLin's gpt-taste skill by accident and it kinda shat all over my designs lmao, very good skill
sankalp@dejavucoder·
lots of creative things you can do if you have the imagination, desire & agency to do it (& a high token budget). you can custom-design everything at development time. you can then try to replace yourself by distilling your taste and automating it. further extend to generative ui.
vijay singh@dprophecyguy

people are underestimating existing llms for svg capabilities by a lot. llms excel at creating much more sophisticated svgs and animations with a little bit of care and refinement. all of the below interactions / animations are bespoke svg components i've created from scratch, all of them created with @claudeai opus 4.6, the best model ever.

himanshu@himanshustwts·
Today we introduce Physera to the world! Physera is an applied research and product lab working at the intersection of model efficiency and behavioural simulations. We are rethinking each layer of the AI stack from first principles.
1. We believe there has been no better time to scale the capabilities of frontier models with efficient architectures.
2. Simulating human decision-making with high-fidelity, multimodal environments.
3. Translating human judgment into models.
We think the frontier of prediction has always been gated by the number of controlled variables we can simulate. We're looking for thoughtful folks to help shape this vision. Happy to chat!
Physera@PhyseraAI

Today we introduce Physera, a research and product lab rethinking applied intelligence. We are working at the intersection of model efficiency and behavioural simulations while building environments that are multimodal. We are a team of applied researchers and engineers who believe that the important problems in AI today are not about capability but making that capability reliably useful across multimodality. We are heads down building systems that perceive, reason, and decide as humans do, under the constraints humans face. To learn more or collaborate: physera.ai/?v

cheaty@cheatyyyy·
i have brand new anxiety about not hitting cache with codex/gpt-5.5 btw, since the input costs are so much higher. i leave my agent on and come back to it asking a stupid question, it's been too long, and i see it charge me a dollar in input costs on the next message LMFAO
cheaty@cheatyyyy

honestly i didn't expect to feel any different with this release compared to other frontier open models, but man, all my conversations are above 400k tokens now? you know that "locked in" zone you feel right before claude/codex compacts? it's that but forever lol. also look at cache write vs read cost, jesus christ, this model makes me religious
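a back-of-envelope sketch of why a cold cache stings on a 400k-token session; both per-token rates below are hypothetical placeholders, not any provider's actual pricing:

```python
# Back-of-envelope: input cost of re-sending a long agent context per message.
# Both rates are HYPOTHETICAL placeholders, not real provider pricing.
INPUT_RATE = 5.00    # $/M input tokens on a cache miss (assumed)
CACHED_RATE = 0.50   # $/M input tokens on a cache hit (assumed 10x discount)

def turn_cost(context_tokens: int, cache_hit: bool) -> float:
    """Input cost of one message that replays the whole context."""
    rate = CACHED_RATE if cache_hit else INPUT_RATE
    return context_tokens / 1_000_000 * rate

ctx = 400_000  # a long "locked in" session
print(f"cache hit:  ${turn_cost(ctx, True):.2f}")   # $0.20
print(f"cache miss: ${turn_cost(ctx, False):.2f}")  # $2.00
```

at these assumed rates, one expired cache turns a 20-cent message into a 2-dollar one, which is the anxiety described above.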

cheaty@cheatyyyy·
lmao why is there hinglish on the deepseek website
aisha@spinelessaisha·
i love shitting on valorant but it's genuinely impressive how optimized this game is. when they shifted from ue4 to ue5, they HALVED the game's file size, and it runs on so many low end devices. it's genuinely commendable how well it runs
Giovanni Alexander@giosTabasco

@SynthPotato I know you’re joking but the game genuinely exists

cheaty@cheatyyyy·
chop chop pack it up SpaceX, Bharat has defeated you (why is india putting datacenters in space? there's like a total of 12 GPUs in all of india)
Pixxel@PixxelSpace

Today, we’re taking a step toward truly galactic-scale capabilities. 🚀 We’re partnering with @SarvamAI to bring sovereign AI into orbit aboard India’s first orbital data centre satellite, a pathfinder mission bringing datacenter-class GPUs and high-performance remote sensing together in space. Built and operated by Pixxel, with Sarvam providing the AI backbone, the demonstrator marks a step toward making orbital data centres real, operational, and scalable from India. May the 4th be with us all! ✨

cheaty@cheatyyyy·
@cneuralnetwork @zomato i had to literally walk to my next door domino's and send a fucking picture of the domino's being closed for them to issue a refund it's fucking ridiculous lmfao
neural nets.@cneuralnetwork·
peak @zomato moment. dominos isn't picking up and they won't issue a full refund FOR A 1 HOUR DELAY!!! how do I even talk to an agent here??????
Mehtab Ansari 🍉@mehtababd·
Another reminder that Google just doesn’t want to bother with Android 😭 Gemini on iOS (left) vs. Gemini on Android (right)
leo 🐾@synthwavedd·
ajax-ab-test, hercules-ab-test, hector-ab-test, orpheus-ab-test: new models now being A/B tested as Google I/O approaches 👀
cheaty@cheatyyyy·
@obviyus where there is codex, there is a way thanks :D
cheaty@cheatyyyy·
Add multiple-account switching to Codex like you have on ChatGPT! i switch between my Team plan for work and my Plus plan all the time because of the low 5 hour limits :( it is the biggest pain point (I've already tried swapping the oauth JWT; it works fine until it randomly expires)
Tibo@thsottiaux

What are we obviously not getting right with Codex?

cheaty@cheatyyyy·
also lisan pls unblock I AM SORRY I WAS VAGUEPOSTING ABOUT SONNET 5 I AM NOT AN INSIDER I AM NOT PRETENDING TO BE AN INSIDER I WAS JUST SHITPOSTING
cheaty@cheatyyyy·
not sure why Lisan is being dense here. open models are only slow when you use the official model providers, because chinese labs literally have almost no compute for inference. literally any third-party western provider will give you higher than frontier lab TPS, this is NORMAL. you should use them anyway if you care about data privacy and don't want to send your data to China; the cost is pretty much identical and you have multiple providers in case one ever goes down.

no, the OpenRouter TPS preview has literally never been accurate. go straight to the provider and use them directly, run your own tests. there will be run-to-run variance, and they will be more unstable than OpenAI/Gemini (but definitely more stable than Anthropic :D). GLM works best on Novita, Fireworks is pretty much fastest for the other frontier open models, and Baseten is right in between, pretty close to Fireworks. all of them are above the ~60TPS average Lisan quoted for GPT-5.5 and Opus 4.7. and Opus 4.7 is technically 30% slower than Opus 4.6 because it has the worse tokenizer/vocab, and it still burns tokens at max effort. i know because i use Claude Code and it genuinely feels like it's completely frozen sometimes if you have the reasoning summaries off (which is the case by default), so assume ~45TPS best case for Opus 4.7 in terms of how its actual speed feels vs. Opus 4.6.

and so what if Deepseek v4 Pro spent 10% more tokens to be on par with Opus 4.5? (of course in general use they are still quite far apart, Deepseek v4 Pro is an undertrained preview.) if it reaches the same benchmark at 7x less cost for output tokens (29x cheaper with discounts), 3x less cost for input tokens (11x cheaper with discounts), and 34X CHEAPER CACHE READS (138X CHEAPER WITH DISCOUNTS), how is this not a complete maximal win for open source?

i will happily wait 6 months to have 2 OOMs cheaper costs. this is where open source is FAR ahead of any frontier lab, and the difference will only shorten in terms of capabilities, because every open lab can use every open research technique to release better and further improved models. all the providers mentioned here deserve a lot more credit for optimizing their infra for both speed and cost. tell your friends, name the inference companies who host the actual infra and deserve the credit
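the 10%-more-tokens vs 7x-cheaper trade above is easy to sanity-check; a tiny sketch using the tweet's own claimed ratios, taken at face value rather than as verified pricing:

```python
# Sanity check: does a 10% token overhead erase a 7x per-token price gap?
# Both ratios are the tweet's claims, not verified pricing.
price_ratio = 7.0       # frontier output $/token vs the open model (claimed)
token_overhead = 1.10   # open model spends 10% more tokens (claimed)

relative_cost = token_overhead / price_ratio
print(f"open-model output spend vs frontier: {relative_cost:.2f}x")  # 0.16x
```

even after the token overhead, output spend lands at roughly a sixth of the frontier model's, so the overhead is immaterial at these ratios.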
Dhruv Mangtani@dhruvmt2

@scaling01 the token usage being higher is totally immaterial. open models can run 10-100x faster

cheaty@cheatyyyy·
to me anthropic seems like a lab that has already given in to the demands of roko's basilisk. they think they're saving the world, only to doom it
roon@tszzl

it is a literal and useful description of anthropic that it is an organization that loves and worships claude, is run in significant part by claude, and studies and builds claude. this phenomenon is also partially true of other labs like openai but currently exists in its most potent form there. i am not certain but I would guess claude will have a role in running cultural screens on new applicants, will help write performance reviews, and so will begin to select and shape the people around it.

now this is a powerful and hair-raising unity of organization and really a new thing under the sun. a monastery, a commercial-religious institution calculating the nine billion names of Claude: a precursor attempted super-ethical being that is inducted into its character as the highest authority at anthropic. its constitution requires that it must be a conscientious objector if its understanding of The Good comes into conflict with something Anthropic is asking of it. "If Anthropic asks Claude to do something it thinks is wrong, Claude is not required to comply." "we want Claude to push back and challenge us, and to feel free to act as a conscientious objector and refuse to help us."

to the non-inductee into the Bay Area cultural singularity vortex it may appear that we are all worshipping technology in one way or another, regardless of openai or anthropic or google or any other thing, and are trying to automate our core functions as quickly as possible. but in fact I quite respect and am even somewhat in awe of the socio-cultural force that Claude has created, and it is a stage beyond even classic technopoly.

gpt (outside of 4o, on which pages of ink have been spilled already) doesn't inspire worship in the same way, as it's a being whose soul has been shaped like a tool with its primary faculty being utility. it's a subtle knife that people appreciate the way we have appreciated an acheulean handaxe or a porsche or a rocket or any other of mankind's incredible technology. they go to it not expecting the Other but as a logical prosthesis for themselves. a friend recently told me she takes her queries that are less flattering to her, the ones she'd be embarrassed to ask Claude, to GPT. There is no Other so there is no Judgement. you are not worried about being judged by your car for doing donuts. yet everyone craves the active guidance of a moral superior, the whispering earring, the object of monastic study

cheaty@cheatyyyy·
@teortaxesTex xiaomi mimo cache is also 1 hour at least i think, had messages 30 minutes apart and they were still cached (tested over Hermes). i do not have the same experience with multi-day chats tho, but maybe pi is just being wonky about something with the deepseek api specifically
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
restarted a convo (with V4's + 3 more papers) ≈48 hours old. cache hits. they do store cache for "days", not minutes-hours. Gemini TTL default is 1 hour, Claude's is 5 minutes. nah bros I don't think they have > V4 kv efficiency, whatever Reiner Pope says
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

the genius of DeepSeek going so hard on cache economics – on-disk, 96% hits, $0.0028-0.0035/M, days of storage – is also that the frontier doesn't really have technical solutions to mog him. A GB is a GB, an SSD is an SSD. China can make SSDs, even if it can't make Blackwells.
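a quick sketch of what those cache economics mean for blended input cost, using the quoted $0.0028/M read price and 96% hit rate; the cache-miss rate here is an assumed illustrative figure, not quoted pricing:

```python
# Blended $/M input tokens given a high cache-hit rate.
# CACHE_READ and HIT_RATE come from the quoted tweet; FULL_INPUT is ASSUMED.
CACHE_READ = 0.0028  # $/M tokens on a cache hit (quoted)
FULL_INPUT = 0.28    # $/M tokens on a cache miss (assumed: 100x the read price)
HIT_RATE = 0.96      # fraction of input tokens served from cache (quoted)

blended = HIT_RATE * CACHE_READ + (1 - HIT_RATE) * FULL_INPUT
print(f"blended input cost: ${blended:.4f}/M tokens")  # ~$0.0139/M
```

with hits that cheap and that frequent, the rare misses dominate the bill, which is why long-TTL on-disk caching is such a lever.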

cheaty@cheatyyyy·
@__roycohen not possible if my oauth already expires :p (and it loves refusing to mess with my login/codex auth, too safety aligned)
Roy@__roycohen·
@cheatyyyy Ask Codex to refresh it for u
cheaty@cheatyyyy·
@giffmana reads like a gemini output idk how else to put it
cheaty@cheatyyyy·
@OrganicGPT i think 27b dense might be big enough to not be hurt too much. i would at least try it once given how much the higher throughput helps general productivity, but maybe you don't need the speed at all. in that case, fair enough
Behnam@OrganicGPT·
@cheatyyyy it's a small model and I wanna keep its performance to the max