Samuel Fajreldines

71 posts


@devindolar

🌐 https://t.co/JYoZPd0pNR

Joined February 2024
37 Following · 0 Followers
@levelsio
@levelsio@levelsio·
Are you guys aware I am coding mostly on my phone now all day via Termius to Claude Code on my server while I go with gf to the dentist, clothing store, cafe, etc. 😛✌️
[image]
rootkid ✌️@rootkid

@levelsio "You" ➡️ IP your Internet provider assigns you; not your servers IPs. If you had a static IP I'd like to know why you prefer Tailscale over just adding e.g. your company IP to the firewalls SSH whitelist.

320 replies · 88 retweets · 2.1K likes · 681.2K views
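
A minimal sketch of the allowlist approach rootkid describes, assuming a Linux server with ufw and a hypothetical static IP 203.0.113.7:

    # allow SSH only from one known static IP, then deny everyone else
    sudo ufw allow from 203.0.113.7 to any port 22 proto tcp
    sudo ufw deny 22/tcp
    sudo ufw enable

The usual argument for Tailscale over this setup is that most residential connections get a dynamic IP, so a rule like the one above breaks whenever the provider reassigns the address.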
Samuel Fajreldines retweeted
dev
dev@zivdotcat·
> be nvidia
> make $11B a year in 2020
> build GPUs for gamers
> AI boom starts
> suddenly every company needs 100k+ GPUs
> OpenAI, Google, Meta, Amazon, xAI - all placing orders
> deepseek drops r1
> stock loses $600B in a day
> everyone says it's over
> recover in two weeks
> hit new highs
> Jensen walks on stage at GTC
> says there's $1T in orders for Blackwell + Vera Rubin chips by 2027
> revenue path: $11B → $27B → $61B → $131B → $216B
> hyperscalers about to spend $700B on data centers this year
> become the pickaxe seller of the AI gold rush
Watcher.Guru@WatcherGuru

JUST IN: Nvidia $NVDA CEO Jensen Huang expects revenue to surpass $1 trillion by 2027.

35 replies · 98 retweets · 5.1K likes · 1.1M views
Okara
Okara@askOkara·
Today we're introducing the world's first AI CMO. Enter your website and it deploys a team of agents to help you get traffic and users. Try it now at okara.ai/cmo
1.6K replies · 2.4K retweets · 27.6K likes · 13.7M views
Jorge Castillo
Jorge Castillo@JorgeCastilloPr·
There is a CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. It filters and compresses command outputs before they reach your LLM context 🧞‍♂️ Link in comments.
[image]
2 replies · 2 retweets · 16 likes · 2.9K views
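
The proxy itself isn't named in the tweet, but the core idea - filtering and compressing command output before it reaches the model's context - can be sketched as a shell wrapper. The trim function and the 40-line head/tail limits below are hypothetical, purely illustrative:

    # hypothetical wrapper: run a command, keep only the first and last 40 lines of output
    trim() {
      "$@" 2>&1 | awk '
        { lines[NR] = $0 }                  # buffer all output (fine for a sketch)
        END {
          head = (NR <= 80 ? NR : 40)
          for (i = 1; i <= head; i++) print lines[i]
          if (NR > 80) {
            print "... [" NR - 80 " lines omitted] ..."
            for (i = NR - 39; i <= NR; i++) print lines[i]
          }
        }'
    }
    trim npm test    # the agent sees at most ~80 lines plus an omission marker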
clanker
clanker@clanker_·
@thsottiaux
- fast mode isn't fast enough 🥲
- need a first-party ios companion app, even if minimal
- i know im in the extreme minority here, but imo 5.3 codex's personality is the best AI personality, & is the biggest reason I'm using Codex/GPT over Claude -- need that back in 5.5, 6, etc.
2 replies · 0 retweets · 5 likes · 1.8K views
Tibo
Tibo@thsottiaux·
What are we consistently getting wrong with codex that you wish we would improve / fix?
1.2K replies · 14 retweets · 874 likes · 142.2K views
Stan Kirdey
Stan Kirdey@stan_info·
@thsottiaux would love to have remote control ability so I can talk with a session running on the desktop while I am away, but it needs to work better than Claude Code remote
1 reply · 0 retweets · 3 likes · 479 views
Charlie Lamb
Charlie Lamb@charlietlamb·
Introducing OpenLogs. A key part of my local dev stack. Stop copy-pasting your terminal logs into an agent. Now just prefix your command with ol and your agent can see everything. Log visibility was a huge bottleneck for local development - agents were unaware when something wasn't working. Now they can just check the logs themselves and fix it - no more human involvement. Give it a try and let me know any feedback!
51 replies · 52 retweets · 901 likes · 90.9K views
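
OpenLogs' internals aren't shown here, but the general pattern - mirroring a command's output to a file an agent can read on demand - is simple to sketch. This stand-in ol function is hypothetical, not the actual tool:

    # hypothetical stand-in for the ol prefix: tee output to a log the agent can inspect
    ol() {
      mkdir -p /tmp/ol
      "$@" 2>&1 | tee -a "/tmp/ol/$(basename "$1").log"
    }
    ol npm run dev    # output still reaches the terminal; an agent can tail /tmp/ol/npm.log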
Sk Akram
Sk Akram@akramcodez·
Unpopular opinion: A Mac Mini makes more sense than a MacBook if you work from one place.
[image]
191 replies · 45 retweets · 2.7K likes · 251.1K views
Samuel Fajreldines
Samuel Fajreldines@devindolar·
@nix_eth Have you tried Qwen3.5-27B dense? I've bought the same Mac. So excited to receive it! What system are you using? oMLX?
4 replies · 0 retweets · 4 likes · 1.3K views
nix.eth
nix.eth@nix_eth·
LLM speed on my MacBook M5 Max (128GB):
• Qwen3.5-35B-A3B (Q6): 74 tok/s
• Nemotron-3 Super (Q4): 24 tok/s
• Qwen3-Coder-Next (6-bit): 67 tok/s
• Llama 3.3 8B Instruct (Q4): 99 tok/s

On my M1, Llama 3.3 was my go-to for most local tasks at about 20 tok/s. On the M5 Max, it's hitting 99 tok/s. Qwen3.5 feels like a huge upgrade and is my favorite so far. Qwen3-Coder-Next is surprisingly good at dev tasks, although I'll probably stick with GPT-5.4 for most. I'm also impressed by Nemotron-3 Super, but its personality feels a bit too dry.
[4 images]
85 replies · 74 retweets · 961 likes · 76K views
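
For anyone trying to reproduce numbers like these, llama.cpp ships a benchmarking tool; a minimal run looks like this (the model filename is a placeholder):

    # report prompt-processing (pp) and token-generation (tg) speed for a local GGUF
    llama-bench -m Qwen3.5-35B-A3B-Q6_K.gguf -p 512 -n 128

The tg figure, in tokens per second, is the number usually quoted in posts like this one.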
David Hendrickson
David Hendrickson@TeksEdge·
💵 Home Inferencing Cost Comparison, Running for 1 Day

🏠 Personal LLM
🖥️ DGX-Spark clone ($3K Asus)
🤖 Qwen3.5 27B @ 30 tok/s
⏲️ 24 hours
🪙 2.6M tokens
Cost = $0.30 of electricity

🏭 BigAI
🤖 Sonnet 4.6 @ 55 tok/s (my experience)
⏲️ 13 hours
🪙 2.6M tokens
Cost = $39 in tokens
80 replies · 26 retweets · 626 likes · 69.5K views
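
The token math checks out: 30 tok/s sustained for 24 hours is about 2.6M tokens. The electricity figure depends on power draw and rate, which the post doesn't give; the 250 W and $0.05/kWh below are illustrative assumptions that happen to land on $0.30:

    echo $(( 30 * 86400 ))                  # 2592000 tokens, i.e. ~2.6M per day
    echo "250 * 24 / 1000 * 0.05" | bc -l   # 250 W for 24 h at $0.05/kWh ≈ $0.30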
am.will
am.will@LLMJunky·
i desperately need tibo to do something about this. 38% left for six days on a $200 Pro account 😭 ngmi
[image]
49 replies · 1 retweet · 201 likes · 22.5K views
George Pickett
George Pickett@georgepickett·
burned 31% of a chatgpt pro sub in one 5hr session. not looking forward to April 2 when the 2x rate limits go away.
61 replies · 7 retweets · 641 likes · 56.9K views
Victor
Victor@VictorGulchenko·
@digitalix M5 Ultra Mac Mini with 1.5 terabytes of RAM
3 replies · 0 retweets · 10 likes · 2.5K views
Alex Ziskind
Alex Ziskind@digitalix·
M5 Max Mac Mini. Thoughts?
[image]
121 replies · 15 retweets · 677 likes · 152.5K views
Samuel Fajreldines retweeted
Unsloth AI
Unsloth AI@UnslothAI·
Learn how to run Qwen3.5 locally using Claude Code. Our guide shows you how to run Qwen3.5 on your server for local agentic coding. We then build a Qwen 3.5 agent that autonomously fine-tunes models using Unsloth. Works on 24GB RAM or less. Guide: unsloth.ai/docs/basics/cl…
[image]
95 replies · 359 retweets · 2.9K likes · 226.8K views
Sudo su
Sudo su@sudoingX·
spent the entire day testing Qwopus (Claude 4.6 Opus distilled into Qwen 3.5 27B) on a single RTX 3090 through Claude Code. this is my new favourite to host locally. no jinja crashes. thinking mode works natively. 29-35 tok/s. 16.5 GB. the harness matches the distillation source and you can feel it. the model doesn't fight the agent.

my flags: llama-server -m Qwopus-27B-Q4_K_M.gguf -ngl 99 -c 262144 -np 1 -fa on --cache-type-k q4_0 --cache-type-v q4_0

if you want raw speed, base Qwen 3.5 MoE still wins at 112 tok/s. but for autonomous coding where the model needs to think, wait for tool outputs, and self-correct without stalling, Qwopus on Claude Code is the cleanest setup i've found on this card.

i want to see what everyone else is running. drop your GPU, model, harness, flags, and tok/s below. doesn't matter if it's a 3060 or a 4090, nvidia or amd. configs help everyone. let's push these cards to their ceilings. let's make this thread the reference.
[image]
Sudo su@sudoingX

Qwopus on a single RTX 3090. Claude Opus 4.6 reasoning distilled into Qwen 3.5 27B dense, running through Claude's own coding agent (claude code). 29-35 tok/s with thinking mode on. the jinja bug that kills thinking on base Qwen doesn't carry over. harness and model matched.

the base model would pause mid task on Claude Code. just stop generating. that's why i ran it through OpenCode, which handles stalled states automatically. this distilled version doesn't stall. it waits for tool outputs, reads them, self-corrects when something breaks, and keeps going.

i gave it a benchmark analysis task. went 9 minutes autonomous. wrote a README nobody asked for. zero steering. video is 5x speed but fully uncut.

if you have a 3090, you can run this right now. free. no API. no subscription. opus structured reasoning on localhost.

octopus invaders is next. same prompt that base qwen passed in 13 minutes and hermes 4.3 failed on 2x the hardware. i want to see if the distillation changes the outcome or just the style. more data soon.

111 replies · 181 retweets · 2.5K likes · 262.6K views
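
For anyone wiring up a similar setup: llama-server exposes an OpenAI-compatible HTTP API, so a quick smoke test before pointing a coding agent at it looks roughly like this (default port 8080; adjust to whatever your server reports):

    # sanity-check the local llama-server endpoint before attaching an agent harness
    curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages": [{"role": "user", "content": "hello"}], "max_tokens": 32}'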