fcan
91 posts
@0xfcan
running llms on 3090
İstanbul · Joined August 2011
259 Following · 252 Followers
fcan @0xfcan ·
System prompts I came across while reading the gstack repo
[image]
fcan @0xfcan ·
I'm seeing most fine-tuned models come out weaker than their base models. The cause may be overfitting to benchmarks.
fcan @0xfcan ·
You can use certain models on the Nvidia endpoint for free with a school or work email address, but every model I tried was very slow. Aside from Step 3.5, I can't say I got much out of it. It looks like you can hook any harness you like up to it via the API key and endpoint URL. build.nvidia.com/models
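A minimal sketch of what "API key plus endpoint URL" wiring looks like, assuming the endpoint speaks the OpenAI-compatible chat-completions wire format; the base URL and model id below are illustrative placeholders, not confirmed values — substitute whatever build.nvidia.com shows for the model you picked.

```python
# Sketch: building an OpenAI-style chat request for a hosted endpoint.
# Assumptions: the endpoint is OpenAI-compatible; BASE_URL and the model
# id are illustrative placeholders.
import json
import urllib.request

BASE_URL = "https://integrate.api.nvidia.com/v1"  # assumed endpoint URL
API_KEY = "nvapi-..."                             # key from build.nvidia.com


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Assemble (but do not send) a chat-completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("stepfun-ai/step-3.5", "Say hello in one word.")
# urllib.request.urlopen(req) would perform the actual call.
```

Any harness that lets you set a base URL and bearer token can point at the same two values.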
fcan @0xfcan ·
@TheAhmadOsman waiting for more people to get AhmadOsman-pilled
fcan @0xfcan ·
Hermes + Qwen 3.5 27b on a single 3090, with Opencode as the harness. If you don't want to have to keep tabs on Dario's or Sam's internal company emails, this is one of the best price/performance setups right now :)
Lotto @LottoLabs ·
@JordanStev Install llama.cpp right now and download qwen 27b
Jordan Stevenson @JordanStev ·
Using Sonnet 4.6 for a Hermes agent: just spent $1.40 in API costs to ask it what the weather is like in Marbella today. Not sure it's going to be cost-saving to route 10,000 monthly customer support conversations through this then 😄
fcan @0xfcan ·
@0xSero I'd like to try it!
0xSero @0xSero ·
Do you want to try Droid? I'm doing a giveaway: 3 people will win 100M Factory credits each. That's 5 months of their $20-a-month subscription. Winners selected randomly from comments in 48 hours.
[image]
fcan @0xfcan ·
@loktar00 llama.cpp is the king
Loktar 🇺🇸 @loktar00 ·
llama.cpp hit 100k stars... honestly one of the most important projects in the local AI space. This is what made running models on your own hardware actually viable. If you're still routing everything through Ollama you're leaving performance on the table.
fcan @0xfcan ·
The faint hum from the GPU after submitting a prompt puts the hardware cost of inference into context
fcan @0xfcan ·
@LottoLabs thank god somebody quantized it :)
Lotto @LottoLabs ·
Just need like 6x6000pro for this
[image]
fcan @0xfcan ·
@LottoLabs 3090 and a dream is a great phrase 👍
Lotto @LottoLabs ·
I want to reiterate my tests are for brokies w/ a 3090 and a dream
fcan @0xfcan ·
Models I've started testing on the 3090 Ti:
1) Qwen 3.5 & derivatives (Opus distilled, uncensored, etc.)
2) nvidia/Nemotron-Cascade-2-30B-A3B
3) vngrs-ai/Kumru-2B
4) Kara-Kumru-v1.0-2B from dear @AlicanKiraz0
5) zed-industries/zeta-2 for autocomplete (on @0xSero's recommendation)
fcan @0xfcan ·
@0xSero brilliant list! having fun with qwen 3.5 27b and nemotron cascade 2 30b!
0xSero @0xSero ·
Best models to run on your hardware level. I'll be doing this every week, I hope you guys enjoy.
---- 8 GB ----
Autocomplete for coding (like Cursor Tab)
- huggingface.co/NexVeridian/ze…
- huggingface.co/bartowski/zed-…
Tool calling, assistant style
- huggingface.co/nvidia/NVIDIA-…
---- 16 GB ----
Here things get better:
Multimodal
- huggingface.co/Qwen/Qwen3.5-9B
- huggingface.co/Tesslate/OmniC…
- huggingface.co/unsloth/Qwen3.…
---- 24 GB ----
- The best model you can get (thanks Qwen) huggingface.co/Qwen/Qwen3.5-2…
- Great model (strong agents) huggingface.co/nvidia/Nemotro…
- Mine hehe huggingface.co/0xSero/Qwen-3.…
I'm doing a weekly series
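The VRAM tiers above follow a rule of thumb that is easy to sketch: quantized weights take roughly params × bits/8 bytes, plus headroom for activations and KV cache. The 2 GB overhead figure below is an illustrative assumption, not a measured value.

```python
def fits_in_vram(params_b: float, quant_bits: float, vram_gb: float,
                 overhead_gb: float = 2.0) -> bool:
    """Rough fit check: quantized weight size plus a fixed headroom
    (assumed 2 GB here) for activations and KV cache."""
    weights_gb = params_b * quant_bits / 8
    return weights_gb + overhead_gb <= vram_gb


# A 27B model at 4-bit is ~13.5 GB of weights -> fits on a 24 GB 3090,
# while the same model at 8-bit (~27 GB of weights) does not.
print(fits_in_vram(27, 4, 24))  # True
print(fits_in_vram(27, 8, 24))  # False
```

This is why the 24 GB tier tops out around 27-30B at 4-bit quants: the weights alone eat most of the card.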
fcan @0xfcan ·
On @sudoingX's recommendation, I'm running Qwen3.5-27B q4 on a single 3090 using the args below.
llama-server -ngl 99 -c 262144 -fa on --cache-type-k q4_0 --cache-type-v q4_0
I hooked the model up to the @NousResearch Hermes agent and ran a few debugging tasks. My first impression is very positive
[image]
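The `-c 262144` and q4_0 cache flags trade context length against VRAM, and the saving is easy to estimate: the KV cache stores K and V for every layer and position. The layer/head dimensions below are illustrative assumptions, not the real model's metadata, and q4_0 is approximated at ~4.5 bits per element including scale overhead.

```python
def kv_cache_bytes(ctx: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bits_per_elem: float) -> float:
    """K and V each hold ctx * n_kv_heads * head_dim elements per layer."""
    return 2 * n_layers * ctx * n_kv_heads * head_dim * bits_per_elem / 8


CTX = 262_144  # -c 262144
# Illustrative dims (check the GGUF metadata for the real values):
LAYERS, KV_HEADS, HEAD_DIM = 48, 4, 128

f16_cache = kv_cache_bytes(CTX, LAYERS, KV_HEADS, HEAD_DIM, 16)
q4_cache = kv_cache_bytes(CTX, LAYERS, KV_HEADS, HEAD_DIM, 4.5)

print(f"f16 cache:  {f16_cache / 2**30:.1f} GiB")   # 24.0 GiB
print(f"q4_0 cache: {q4_cache / 2**30:.2f} GiB")    # 6.75 GiB
```

With the cache roughly 3.6× smaller than the f16 default, a 256k context becomes plausible next to the quantized weights on a 24 GB card.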
0xSero @0xSero ·
Please list all the apps you know of in this category that aren't already on the list:
1. Cursor Glass
2. Factory Desktop
3. Codex App
4. OpenCode App
5. Claude Desktop
6. CMUX
I am looking for this mythic ADE:
1. Has browser
2. Has filesystem
3. Has BYOK
4. Has good UI
fcan @0xfcan ·
With token/inference demand rising steadily, I don't believe the people claiming AI investment is a bubble. And of course, running a local LLM and freeing yourself from this whole situation is a great comfort..

Quoting Thariq @trq212
To manage growing demand for Claude we're adjusting our 5 hour session limits for free/Pro/Max subs during peak hours. Your weekly limits remain unchanged. During weekdays between 5am–11am PT / 1pm–7pm GMT, you'll move through your 5-hour session limits faster than before.
Alican Kiraz @AlicanKiraz0 ·
Guys, presenting the World's First Heterogeneous @NVIDIAAIDev CUDA + @Apple Metal Distributed Inference Cluster 🎉 I managed to get @nvidia CUDA + Apple Metal running distributed LLM inference together — same pipeline, same ring. I owe this to exo's amazing infrastructure. ❤️ The BF16 model splits across architectures — layers 0-10 on NVIDIA, layers 10-32 on Apple Silicon. I achieved this by adding just ~200 lines across 5 files to exo's brilliant open-source codebase. PR to @exolabs coming soon.
[image]
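The split described above (layers 0-10 on CUDA, 10-32 on Metal) is a contiguous layer partition. A minimal sketch of the idea, with the scheduling and transport details of exo's real pipeline left out; the function and device labels are illustrative, not exo's API:

```python
def partition_layers(n_layers: int, boundaries: list[int]) -> list[range]:
    """Split layer indices [0, n_layers) into contiguous half-open
    shards, cutting at each boundary."""
    edges = [0, *boundaries, n_layers]
    return [range(a, b) for a, b in zip(edges, edges[1:])]


# One shard per device: layers 0-9 on the CUDA node, 10-31 on Metal.
shards = partition_layers(32, [10])
for device, shard in zip(["NVIDIA/CUDA", "Apple/Metal"], shards):
    print(device, f"layers {shard.start}-{shard.stop - 1}")
```

Each node runs only its own shard's forward pass and hands its hidden states to the next node in the ring, which is what lets a single BF16 model span both architectures.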