KKY
@evilpsycho42

202 posts

Data scientist | Kaggle Global Rank 35th | Sharing hands-on practice and thoughts on multimodal models, LLMs, and generative vision models

Joined April 2017
147 Following · 195 Followers
KKY
KKY@evilpsycho42·
@VictorTaelin 1. Construct a preference (DPO) dataset; you could build it by talking with an agent. Let the agent interview you with two answers, and you choose your answer. 2. Train a model with DPO + LoRA on that dataset. Cheap but useful.
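Step 1 of the recipe above can be sketched in a few lines. This is a minimal, hypothetical shape for turning the agent's "interview" rounds (a question, two candidate answers, and the human's pick) into the `{"prompt", "chosen", "rejected"}` records that common DPO trainers such as TRL's `DPOTrainer` expect; the function and field names are illustrative, not from any specific tool:

```python
# Turn "interview" rounds into DPO-format preference records.
# Each round: a question, two candidate answers, and which one you picked.

def to_dpo_records(interviews):
    """interviews: list of dicts with keys
    'question', 'answer_a', 'answer_b', 'picked' ('a' or 'b')."""
    records = []
    for it in interviews:
        chosen = it["answer_a"] if it["picked"] == "a" else it["answer_b"]
        rejected = it["answer_b"] if it["picked"] == "a" else it["answer_a"]
        records.append({
            "prompt": it["question"],
            "chosen": chosen,    # the answer the human preferred
            "rejected": rejected,
        })
    return records

# Hypothetical example round:
interviews = [
    {"question": "What is a proper case tree?",
     "answer_a": "A nested pattern-match ...",
     "answer_b": "A kind of binary tree ...",
     "picked": "a"},
]
records = to_dpo_records(interviews)
```

A dataset of such records can then be fed straight into step 2 (a LoRA fine-tune under a DPO loss).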
Taelin
Taelin@VictorTaelin·
seriously, working with AI is MISERABLE for one and only one reason: having to re-explain the same thing. "oh yeah this new session obviously doesn't know what proper case trees are, so let me explain it for the 5000th time in my life" I'm tired.

AGENTS.md doesn't solve this because it is impossible to fit the entire domain knowledge without nuking the context - it would be 1M+ tokens' worth.

RAGs don't solve this; the agent won't search unknown unknowns.

SKILLs don't solve this unless I keep a collection of 1750 skills, with specific cuts of domain knowledge for each possible subset of my domain that I might need in a given chat, but that's a lot of manual work.

Recursive LLMs or whatever don't solve this for the same reason: you can't dump in a domain book and expect the agent will magically guess that it is supposed to search for a specific bit of knowledge. Unknown unknowns.

Fine-tuning doesn't solve this (OSS models suck, and OpenAI / Anthropic gave up on user fine-tuning).

I honestly think a good product around fine-tuning on your domain would be a major hit, and an underdog lab should take this opportunity.
KKY
KKY@evilpsycho42·
@xwang_lk DeepSeek with Pi Agent works great for me.
KKY
KKY@evilpsycho42·
@thsottiaux Cybersecurity warnings happen even when participating in a Kaggle competition ... At least leave an option to summarize the current session, so users can continue their work with other models?
0xSero
0xSero@0xSero·
117.4M tokens for $2.24 for a genius.
[image]
KKY
KKY@evilpsycho42·
5.5 + 4.6 > 5.4 + 4.7 Agree?
关木
关木@ZeroZ_JQ·
Is DeepSeek better paired with Claude Code or with opencode?
KKY
KKY@evilpsycho42·
Leverage a ChatGPT subscription (even the free tier) to fill in Pi Agent's missing web_search and image-generation features. Use it with DeepSeek or Qwen, and buckle up, bro.
[2 images]
Norbert Schmidt
Norbert Schmidt@nopmobiel·
DeepSeek-V4-Flash-2bit-DQ works on my M3 Max 128GB Mac! And at 39 tok/sec it's quite speedy. Peak memory consumption is 97GB. Thanks a ton for the mlx port and quant @Prince_Canuma #mlx #deepseek
[image]
KKY
KKY@evilpsycho42·
@NuCode Really impressive! How is the experience with long context (200k)?
KKY
KKY@evilpsycho42·
@runsonai I've recently switched to Pi Agent as well. It's an agent amazingly well suited to customization.
Thanh Pham
Thanh Pham@runsonai·
Very impressed with Pi (coding agent). It's like claude code, barebones & customizable. Added a custom plan mode: it asks opus for a plan, then there's the option for codex to review it. A workflow I do a lot, now seamlessly integrated. Works w/ Qwen 3.6 as orchestrator.
[image]
KKY
KKY@evilpsycho42·
@bigeagle_xd Humans are not good enough to judge LLM capabilities based on just a few prompts : )
熊师傅 weight decay 了吗
熊师傅 weight decay 了吗@bigeagle_xd·
glad to see oai proved that arena has problems
Arena.ai@arena

GPT-5.5 by @OpenAI is now live in the Arena, landing across multiple leaderboards. Here's how it ranks by modality:
- Code Arena (agentic web dev): #9, a strong +50pt jump over GPT-5.4
- Document Arena (analysis & long-content reasoning): #6, on par with Sonnet 4.6
- Text Arena: #7, Math: #3, Instruction Following: #8
- Expert Arena: #5
- Search Arena: #2
- Vision Arena: #5
Strong, well-rounded performance, especially in Code (+50 pts vs GPT-5.4). Congrats to @OpenAI on the release. Full category breakdowns by modality in the thread.

KKY
KKY@evilpsycho42·
@0xSero Could you please share your autoresearch setup for Pi Agent? A skill or an extension?
0xSero
0xSero@0xSero·
I have heard your advice and will personally be optimising and using Qwen3.6-27B on my Framework. I think these two could go great together: one for vision and UI, the other for logic.
[image]
KKY
KKY@evilpsycho42·
@DavidOndrej1 @OpenRouter Most random providers seem to excel at only one thing: messing up the models.
- wrong chat template
- bad inference params
- aggressive quantization
- bad prompt-cache strategy
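The "wrong chat template" failure mode above is easy to see in a toy sketch. Both templates below are simplified stand-ins, not any real provider's template: the same conversation rendered two ways yields different prompt strings, so a model trained on one framing receives malformed input when served through the other.

```python
# Toy illustration: two different (made-up, simplified) chat templates
# render the same conversation into different prompt strings.

messages = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi!"},
    {"role": "user", "content": "Summarize DPO."},
]

def render_chatml(msgs):
    # ChatML-style framing (simplified): special tokens around each turn.
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in msgs
    ) + "<|im_start|>assistant\n"

def render_naive(msgs):
    # Naive "ROLE: text" framing a misconfigured provider might apply.
    return "\n".join(
        f"{m['role'].upper()}: {m['content']}" for m in msgs
    ) + "\nASSISTANT:"

# Same messages, different byte streams hitting the model:
assert render_chatml(messages) != render_naive(messages)
```

A model fine-tuned on the ChatML framing never saw `USER:`-style prompts in training, which is why a provider-side template mismatch degrades output quality without throwing any error.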
KKY
KKY@evilpsycho42·
The elegant Pi Agent + DeepSeek, plus OpenAI-OAuth-based web_search and image_gen = bliss.
[2 images]
KKY
KKY@evilpsycho42·
@runsonai Nice job! For local LLM inference, which device would you recommend: Mac or DGX Spark?
Thanh Pham
Thanh Pham@runsonai·
Got Qwen 3.6 35B-A3B MoE running at ~65 tok/s (c=1) and ~121 tok/s (c=4) aggregate on my Asus GX10 (DGX Spark). Model stack:
• Target: Qwen/Qwen3.6-35B-A3B-FP8
• Drafter: z-lab/Qwen3.6-35B-A3B-DFlash
• Spec decode: DFlash, 10 speculative tokens
• Context: 200k
• KV cache: bf16/auto, not fp8
Used vllm for this (see flags below)
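As a back-of-envelope on the drafter + target setup above: under the usual i.i.d.-acceptance approximation from the speculative-decoding literature, k draft tokens with per-token acceptance probability a yield (1 - a^(k+1)) / (1 - a) expected tokens per target-model pass. The 80% acceptance rate below is a made-up illustration; real throughput also depends on draft-model overhead and batch effects.

```python
# Expected tokens produced per target-model forward pass with k speculative
# tokens and an assumed i.i.d. per-token acceptance probability a.
# Geometric-series result: (1 - a**(k + 1)) / (1 - a).

def expected_tokens_per_step(a: float, k: int) -> float:
    if a >= 1.0:
        return float(k + 1)  # every draft token accepted, plus the bonus token
    return (1 - a ** (k + 1)) / (1 - a)

# e.g. 10 speculative tokens (as in the setup above) at a hypothetical
# 80% acceptance rate:
speedup_tokens = expected_tokens_per_step(0.8, 10)
```

About 4.6 tokens per target pass at those assumed numbers, which is the kind of multiplier that turns a ~25 tok/s baseline into the ~65 tok/s range before accounting for drafter cost.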
KKY
KKY@evilpsycho42·
@jun_song How could DeepSeek V4 Flash run on a 96GB Mac? A tiny context length and Q3 quantization?
송준 Jun Song
송준 Jun Song@jun_song·
If you are using a Mac with 96GB RAM or more, do not use Qwen3.6 27b.
• Minimax M2.7
• Deepseek V4 Flash
These two are much faster and smarter. Take advantage of the unified memory.
KKY
KKY@evilpsycho42·
The DeepSeek infra team are simply gods. Tried DeepSeek V4 Flash on Pi Agent today: TTFT mostly around 1 second, ~100 tps. 6M input tokens (96.8% cache hit), 48.2K output, ¥0.36 in total. Incredible. And the cache hits are rock solid: I ran two sessions, and across all requests the only messages that missed the cache were the first one and the one sent right after I /reload-ed an extension (a hit there would have been strange). At the request level, the cache hit rate was effectively 100%.
[3 images]
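A sketch of how cache-aware billing like this works out. The per-million-token prices below are placeholders, not DeepSeek's actual rates; the point is that at a ~97% hit rate, nearly the whole input bill lands at the much cheaper cache-hit price.

```python
# Cache-aware API cost accounting (prices are PLACEHOLDERS, per 1M tokens).
# Input tokens split into cache hits (cheap) and misses (full price).

def api_cost(input_tokens, hit_rate, output_tokens,
             price_hit, price_miss, price_out):
    """Returns total cost; prices are per 1M tokens."""
    hit = input_tokens * hit_rate
    miss = input_tokens - hit
    return (hit * price_hit + miss * price_miss) / 1e6 \
         + output_tokens * price_out / 1e6

# Hypothetical rates, with the cache-hit price at 1/10th of the miss price,
# applied to the usage figures from the tweet above (6M in, 96.8% hit, 48.2K out):
cost = api_cost(6_000_000, 0.968, 48_200,
                price_hit=0.02, price_miss=0.2, price_out=0.3)
```

With these made-up rates the ~5.8M cached input tokens cost roughly three times as much as the 192K uncached ones, despite being 30x the volume; that ratio is why the cache-hit price cut matters so much for agent workloads that resend the same long context.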
KKY
KKY@evilpsycho42·
@deepseek_ai You've always been setting the trend!
DeepSeek
DeepSeek@deepseek_ai·
🔥DeepSeek Input Cache Price Drop! Effective immediately, the price for input cache hits across the ENTIRE DeepSeek API series is reduced to just 1/10th of the original price! Build more efficiently for less. 📌Reminder: The DeepSeek-V4-Pro 75% OFF promotion is still active until May 5th, 2026, 15:59 (UTC Time).
[image]