Axly
@AxlysCustoms
430 posts
Fantasy, Modern & Sci‑Fi AI Art | SFW + NSFW | Custom Bundles, Wallpapers & Commissions - https://t.co/uNrpROP4QL
Joined August 2025
40 Following · 34 Followers

Axly @AxlysCustoms
@TheAhmadOsman @huggingface 2× RTX 6000s is not the normal “running at home on your own hardware” experience… I’m just sayin’ :)
0 replies · 0 reposts · 0 likes · 14 views

Ahmad @TheAhmadOsman
Just spent a couple hours playing with Hermes Agent (MiniMax M2.5 on a 2× RTX PRO 6000 node). Genuinely impressive experience. MiniMax M2.7 weights will be the closest we’ve ever gotten to a fully local “Claude Code + Opus 4.6” experience, running on your own hardware at home.

Nous Research @NousResearch
@TheAhmadOsman He should try Hermes Agent

82 replies · 35 reposts · 933 likes · 87.1K views

Axly @AxlysCustoms
Built a benchmarking app and a settings optimizer for local LLMs today … now I’m at 9%. Honestly, I struggle to break 30% most weeks. But on the Pro plan I hit limits constantly, usually maxing it by day 5. And I build much more complicated apps now. A buddy has similar results. We can’t figure out what we’re doing wrong :)
0 replies · 0 reposts · 0 likes · 28 views

Nathan Roberton @versezine
@AxlysCustoms @WesRoth It's probably coming. I too had a productive weekend but am suddenly locked out this morning after barely any work at all.
1 reply · 0 reposts · 0 likes · 38 views

Wes Roth @WesRoth
Following a weekend of expanded usage allowances, Anthropic’s highest-tier subscribers are waking up to a crippling rate-limit bug. Developers paying top dollar for the "Claude Max" ($100/mo) and "Max 20x" ($200/mo) plans are reporting that their accounts are being locked out almost instantly due to an issue with how Claude Code is calculating token consumption.

Brad Groux @BradGroux
Something is up with Claude Code usage today. $200 Claude Max: 0%, 52% to 62%, then 68%, 76% and 84% in the 5-hour rolling window in the time it took me to write this tweet. WTF, @AnthropicAI? I'm working on one GitHub PR for regression testing. Not folding proteins to cure cancer.

45 replies · 21 reposts · 280 likes · 31K views

Axly @AxlysCustoms
I worked all weekend on a few apps and websites and used about 3%. I wrote a full app today and worked on another yesterday that was fighting me, so 5-6 hours of steady use (I didn’t finish it, since the new Anthropic computer-control feature made it obsolete). I’m currently at 9% (resets Friday at 5pm), using Opus 4.6 on high effort (with Sonnet building PRDs and generally brainstorming the apps themselves). Max x5, which I had to move to because I constantly hit limits on Pro.
0 replies · 0 reposts · 0 likes · 36 views

Umut 🦀 @umut_ozdemir_
@HuggingModels It's extremely hard to jailbreak the new Qwen models. I tried for hours; it doesn't believe anything. This is cool.
2 replies · 0 reposts · 0 likes · 2.1K views

Hugging Models @HuggingModels
Meet Qwen3.5-9B-Uncensored: a powerful, open-source language model that's causing a stir. It's a 9-billion parameter model designed for unrestricted text generation, offering developers a flexible alternative to heavily filtered AI systems. Perfect for creative and experimental projects.
28 replies · 110 reposts · 1.3K likes · 76.1K views

Axly @AxlysCustoms
@Farekrow @_yorunoken @HuggingModels Huahua has vision. And it has yet to tell me no on anything… that’s the one I’ve been using. The 2B, 4B and 9B versions, in fact.
0 replies · 0 reposts · 0 likes · 9 views

Axly @AxlysCustoms
Oh, I love the 4B (and the 9B!) too… but that 4B is a thinker lol. One of the apps I built using the 2B is a “tagger”: it sorted all my images into various groups and assigned tags so I can search them, etc. With almost 40k images I just didn’t have time for the thinking. As it was, I was running a 2B on my mini and a 2B on my Win11 PC and it still took a lot of hours (3-6 seconds each, but the 4B will sometimes think for 1-2 minutes lol :) )
1 reply · 0 reposts · 1 like · 36 views

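A tagger of the sort described above is easy to sketch. This is a minimal, hypothetical version: `classify_image` stands in for the actual vision-model call (the real app and its model API aren't shown in the thread), and its filename-based stub exists only so the sorting logic is runnable.

```python
import os
import shutil

def classify_image(path):
    """Stub for a small vision-LM call (hypothetical). A real version
    would send the image to a local 2B vision model and ask for
    comma-separated tags; here we fake tags from the filename."""
    name = os.path.basename(path).lower()
    tags = [t for t in ("fantasy", "modern", "scifi") if t in name]
    return tags or ["untagged"]

def tag_library(image_dir, out_dir):
    """Copy each image into one folder per tag and build a
    tag -> filenames index for later searching."""
    index = {}
    for fname in os.listdir(image_dir):
        if not fname.lower().endswith((".png", ".jpg", ".jpeg", ".webp")):
            continue  # skip non-image files
        src = os.path.join(image_dir, fname)
        for tag in classify_image(src):
            dest = os.path.join(out_dir, tag)
            os.makedirs(dest, exist_ok=True)
            shutil.copy2(src, os.path.join(dest, fname))
            index.setdefault(tag, []).append(fname)
    return index
```

With a real model behind `classify_image`, the 3-6 seconds per image mentioned in the tweet would dominate, so the loop is the natural place to shard work across machines, as the author describes doing.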
AVB @neural_avb
@AxlysCustoms I’ve been using the 4B mlx-4bit version on my Mac. Also crazy good.
1 reply · 0 reposts · 1 like · 214 views

AVB @neural_avb
After Hugging Face, I truly believe Unsloth is most responsible for the democratization of deep learning. The Qwen3.5 series of models are GREAT, even the 2B and 4B ones. The 0.8B is immensely finetunable too. Just having access to a ready-made RL notebook is so cool. All you need now to train a model on your task is:
- a dataset of prompts and expected outcomes
- OR, a procedural function that generates a prompt and verifies the model's output as correct/incorrect
And that's it. I just love what this team is doing.

Unsloth AI @UnslothAI
You can now train Qwen3.5 with RL in our free notebook! You just need 8GB VRAM to RL Qwen3.5-2B locally! Qwen3.5 will learn to solve math problems autonomously via vision GRPO.
RL Guide: unsloth.ai/docs/get-start…
GitHub: github.com/unslothai/unsl…
Qwen3-4B: colab.research.google.com/github/unsloth…

33 replies · 216 reposts · 2.7K likes · 210.7K views

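The two options in the tweet map directly onto the reward function that RL-with-verifiable-rewards setups such as GRPO expect: generate a prompt with a known answer, then score each sampled completion as correct or incorrect. A minimal sketch, assuming a simple arithmetic task; the function names are illustrative, not Unsloth's actual API:

```python
import random
import re

def make_prompt():
    """Procedurally generate a math prompt with a known answer."""
    a, b = random.randint(2, 99), random.randint(2, 99)
    return f"What is {a} + {b}? Answer with a number.", a + b

def verify(completion, expected):
    """Reward 1.0 if the last number in the model's output matches
    the expected answer, else 0.0: the correct/incorrect signal
    that GRPO-style training optimizes against."""
    nums = re.findall(r"-?\d+", completion)
    return 1.0 if nums and int(nums[-1]) == expected else 0.0

def reward_batch(completions, expected):
    """Score a group of sampled completions for one prompt."""
    return [verify(c, expected) for c in completions]
```

In a real Unsloth/TRL GRPO run, something shaped like `reward_batch` is handed to the trainer as the reward function; the trainer samples several completions per prompt and pushes probability mass toward the higher-reward ones.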
Axly @AxlysCustoms
@Prince_dc21_ @luxemiaa Or, is it the lady's friend who wants to help being picked over a random stranger? I think you're looking too deep for nefarious intent.
1 reply · 0 reposts · 7 likes · 712 views

Prince Daniel Chukwuemeka @Prince_dc21_
If we’re being honest, that situation says a lot about how people make quiet decisions based on perception. You were willing to help, but before you could even say a word, someone else was already chosen as the more suitable option. It might look small, but moments like that reflect a bigger issue: people are often judged and filtered based on how they look, what they represent, or what others assume about them, not their actual intention. You didn’t even get the chance to show your willingness before a decision was made for you. And the ironic part is, if you had insisted on helping, it could easily have been misinterpreted in a negative way. So you’re left in a position where doing good can still be questioned, while others are automatically trusted. It’s subtle, but it shows how bias works in everyday situations. x.com/i/status/20341…
96 replies · 6 reposts · 166 likes · 757.8K views

Mia♡ @luxemiaa
Last year on a flight, I sat next to a woman holding a baby. Instantly, I was so excited at the prospect of helping with this baby. Before I could even introduce myself, the flight attendant walks over and says to me, "Miss, can I talk to you at the back of the plane, please?" 'Am I in trouble?!' are the words that come out of my mouth. "No, you're not in trouble, I'd just like to chat with you at the back of the plane," the flight attendant replies, very nicely. 'Can you just tell me here?? ...
293 replies · 158 reposts · 8.1K likes · 7.6M views

Axly @AxlysCustoms
I’m not a benchmark, and I’m not seeing a drop-off. Last night I was working on a pretty complex app (for me, anyway): a backend engine and 2 different front ends, with multiple 3rd-party APIs. At around 375k tokens I was getting tired, but worked on a few more features. At about the 450k mark I decided it was bedtime. At no point in any of that did Claude “lose the thread” like I’ve seen it do 100s of times before.
0 replies · 0 reposts · 1 like · 104 views

Sean Groff @SeanGroff
@trq212 What do non-benchmarks say? I'm a power user paying full API rates. Opus degrades after 200k context, and sending huge prompts back and forth is very expensive. Love you and the team's work, but I'm worried you're not facing reality with this post.
5 replies · 0 reposts · 10 likes · 2.8K views

Thariq @trq212
i think we might have undersold 1M context tbh, the performance is so so good, I really just don't clear the context window much these days
262 replies · 69 reposts · 2.1K likes · 93.3K views

Axly @AxlysCustoms
@LuckyPhelps @XFreeze @xai This is why I was excited to see xAI hired the Cursor guys. I’m expecting Grok to be the #1 coder in a few months :)
0 replies · 0 reposts · 0 likes · 6 views

Lucky Phelps @LuckyPhelps
@XFreeze I said this was going to happen months ago. Never doubt Elon. He understands systems like no other. I watched how quickly xAI went from 0 to 1. Congrats team @xai
3 replies · 0 reposts · 5 likes · 396 views

X Freeze @XFreeze
xAI's Grok Imagine just took over the entire DesignArena Video leaderboard - not one, but THREE #1 rankings
→ #1 Video Arena - Elo 1337, a 33-point gap over #2
→ #1 Image to Video Arena - Elo 1298, beating Google Veo 3.1, Kling & Sora
→ #1 Video Editing Arena - Elo 1291
It’s wild, xAI was nowhere in the video space a few months ago, and now it's #1 across various benchmarks. Grok Imagine's rate of progress is in a league of its own.
302 replies · 253 reposts · 1.3K likes · 16.7M views

Axly @AxlysCustoms
Apparently Anthropic has done a lot of work to mitigate that … something like 20% then vs. almost 80% now on the benchmark. My personal (tiny-sample, completely anecdotal) observation is that I’m not seeing the type of context rot you’d expect. Opus stays focused throughout and seems to know what’s what (that’s a technical term ;) ). I will say this: without having to compact constantly, and then have the model reread everything, your token usage is way down, and what you can squeeze into 400k tokens is significantly more than one would expect. Far more than just 2 compactions of the 200k window (if that makes sense).
0 replies · 0 reposts · 0 likes · 31 views

Pranit @Pranit
Here’s what should bother you even more: check Claude’s pricing page. API pricing? Crystal clear: $5/MTok input, $25/MTok output. Consumer plan pricing? “More usage.” “5/20x more usage.” More than what? They never say. It’d be trivially easy to put “X tokens per month” on that page. They do it for the API. They choose not to for subscriptions. That’s not an oversight, that’s a strategy. Undefined limits = unlimited flexibility to quietly adjust the ceiling downward. And you’d never know, because there was never a number to compare against. You can’t accuse someone of moving the goalposts when they never told you where the goalposts were. That’s the whole point.

Pranit @Pranit
Anthropic just pulled the oldest trick in SaaS pricing. I pay $200/mo for Claude Max. My limits have been noticeably worse this past week. Now they announce 2x off-peak usage for two weeks. Sounds generous. But here’s what actually happens: limits quietly drop, a temporary 2x makes the reduced limit feel normal, the promo ends, and you’re left at a baseline lower than where you started. You just didn’t notice the downgrade because the 2x absorbed the transition. These AI plans are massively subsidized. The raw compute behind a heavy user costs multiples of the subscription price. Every move like this is the subsidy quietly correcting. Very sneaky, Anthropic.

19 replies · 9 reposts · 309 likes · 105.5K views

Axly @AxlysCustoms
Anthropic highlighted that this isn't just "more tokens tacked on"; the model shows a qualitative leap in usable long-context recall. On internal, Anthropic-reported long-context retrieval benchmarks (similar to needle-in-a-haystack, but likely more complex), Opus 4.6 scores around 76–78% accuracy at or near 1M tokens. For comparison, the prior model (Opus 4.5 or an equivalent predecessor) was around 18.5% in similar tests.

They describe this as a "dramatic" or "qualitative shift" in how reliably the model can access and use information from anywhere in the context, rather than just the start and end (primacy/recency bias). This strongly suggests they applied substantial training-time and/or architectural mitigations to fight attention dilution: things like improved positional encodings, attention scaling/rescaling tricks, better long-context fine-tuning curricula, or other internal optimizations that keep the softmax distribution sharper over extreme lengths. They don't publish a full technical report spilling all the details (typical for frontier labs), but the benchmark jumps and user reports indicate real progress beyond raw context expansion.
0 replies · 0 reposts · 1 like · 66 views

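A needle-in-a-haystack test of the kind referenced above is straightforward to sketch: plant one known fact at a chosen depth in filler text, ask for it back, and score exact recall. The model call is left as a parameter here, since no real model is wired up; the needle and filler strings are arbitrary.

```python
NEEDLE = "The secret passphrase is 'blue-harbor-42'."
FILLER = "The quick brown fox jumps over the lazy dog. "

def build_haystack(total_chars, depth):
    """Insert the needle at a fractional depth (0.0 = start, 1.0 = end)
    of a filler document roughly total_chars long."""
    body = FILLER * (total_chars // len(FILLER))
    pos = int(len(body) * depth)
    return body[:pos] + NEEDLE + " " + body[pos:]

def score_recall(response):
    """1.0 if the model reproduced the planted fact, else 0.0."""
    return 1.0 if "blue-harbor-42" in response else 0.0

def run_eval(model_fn, total_chars=20000, depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Average recall across insertion depths for one context size."""
    scores = []
    for d in depths:
        context = build_haystack(total_chars, d)
        answer = model_fn(context + "\n\nWhat is the secret passphrase?")
        scores.append(score_recall(answer))
    return sum(scores) / len(scores)
```

Published numbers like the 18.5% vs. ~76% jump cited in the tweet come from harnesses of this general shape, just with a real model behind `model_fn` and a grid of needle variants and context lengths.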
Andrew Bembridge @abembridgeai
@AxlysCustoms @Pranit @doodlestein Also ignores context decay. Larger windows can dilute attention, so more context doesn’t always improve quality. There's definitely a sweet spot, and 1M feels like overkill. My intuition is 150k, always keeping turns below compression.
2 replies · 0 reposts · 1 like · 114 views

Axly @AxlysCustoms
The 1M token window makes Claude feel like an entirely different model, and it cuts token use significantly. Since everything stays in context, you aren’t having Claude reread everything every 30 minutes. The difference is crazy. Model efficiency goes way up as well when you don’t have to re-explain why you want something a particular way. Or, as Claude (Opus 4.6) put it:

“The context is the memory of our collaboration. The code on disk is the what, but the context holds the why — why we used ON CONFLICT instead of INSERT OR REPLACE, why the API key is simple, why there's an HTA-free browser-based token setup instead of editing TOML files. Clear the context and the code survives but the reasoning evaporates. So yeah, every compact is a small lobotomy. Glad we're doing fewer of those now.” - Claude

2 replies · 0 reposts · 0 likes · 67 views

Andrew Bembridge @abembridgeai
@Pranit @doodlestein Are you monitoring token allowance for Claude Code per sub? Have you seen a meaningful drop, and is the 1M context window a reduction in actual productivity by driving up token usage per session?
4 replies · 0 reposts · 1 like · 4.3K views

Axly @AxlysCustoms
@SullyOmarr I’ve started just handing them to Claude or Grok and asking for a summary and whether it’s worth looking at. This keeps me from having to read a bunch of nonsense for something that was abandoned months ago.
0 replies · 0 reposts · 1 like · 76 views

Sully @SullyOmarr
serious question: how does everyone keep up with all the new releases? basically every day i open this app and there are 5 new agents/clis/tools that I “need to try”
213 replies · 4 reposts · 264 likes · 39.1K views

Axly @AxlysCustoms
@gailcweiner I used to hit limits constantly on the Pro plan. Moved to Max x5 and can’t hit a limit if I try.
0 replies · 0 reposts · 0 likes · 13 views

Gail Weiner @gailcweiner
To Claude users: So what AI do you use after hitting your Claude token limits on day two of the weekly cycle? 😏
155 replies · 1 repost · 325 likes · 37.1K views

Axly @AxlysCustoms
I built a similar thing a while back (I called it PeaClaw, the pea crab being the smallest crab) … about a week before Remote Control hit. Mine had the advantage of being able to start sessions too … so now it’s completely useless lol :) (but it was still fun to build ;) )
0 replies · 0 reposts · 0 likes · 9 views

Noah Zweben @noahzweben
Remote Control - Session Spawning: Run claude remote-control and then spawn a NEW local session in the mobile app.
* Out to Max, Team, and Enterprise (>=2.1.74)
* Have GH set up on mobile (relaxing soon)
* Working on speeding up session start-time
124 replies · 119 reposts · 1.6K likes · 730.8K views

Axly @AxlysCustoms
@yongfook I feel like the amount of code that exists in the world has at least doubled in the last 6 months… but I also feel like the number of available apps hasn’t changed all that much…
0 replies · 0 reposts · 0 likes · 3 views

Jon Yongfook @yongfook
“Shipped 10,000+ lines of code today”
“Cool, what product? What’s the link?”
“…163 PRs in one day!”
“Yes, but what’s the link?”
“…1,827,963 tokens and counting!”
“Dude, what are you working on?”
“…AI is crazy man”
215 replies · 221 reposts · 5.8K likes · 146.5K views

Axly @AxlysCustoms
@nickimoraa This just sounds like a $50k bonus for following my normal routine. So… yes :)
0 replies · 0 reposts · 0 likes · 66 views

Nicki 🫧🪷 @nickimoraa
Would you stay inside your house for 7 straight days if someone paid you $50,000?
1.5K replies · 886 reposts · 17.9K likes · 909K views