Christian Merrill

1.5K posts

@M_Chimiste

United States · Joined March 2024

98 Following · 70 Followers

Christian Merrill@M_Chimiste·3h

@kalomaze But it’s for safety. Imagine how unsafe it would be to allow you to map text to tokens, we couldn’t have that.

kalomaze@kalomaze·17h

hey isn't it kind of messed up that API customers of Opus 4.7 and Opus 4.8 are paying ~1.41x as much for general english output (when measured against a consistent tokenizer baseline) vs Opus 4.6? from THE ONLY lab that's cagey about releasing the tokenizer... huh. that's... odd

kalomaze@kalomaze

i am trying to work on the closest thing possible to a true "big model smell" eval, which is to say: something that measures something that clever post training can't trivially gap, and is cheap + topically diverse. i can't test mythos for obvious reasons, but... hmm...

Christian Merrill@M_Chimiste·4h

@keennay I’m in the process….

Yannick Nick@keennay·22h

welp I think I'm good to go with open-weight models if the plug ever gets pulled

Citrini@citrini

The risk of the government deciding that a model is too dangerous should only add to the reasons why open source models running on local hardware can be a reasonable alternative.

Christian Merrill@M_Chimiste·5h

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai I’m not sure I’ve benchmarked it well enough to tell you, but the average latency to a Hermes agent is about 35 sec per my dashboard. (I keep a proxy sidecar that counts all tokens and aggregates them on my Mac mini.) I should configure it to track more, though.

Mia@MiaAI_lab·5h

@M_Chimiste @QuixiAI @deepseek_ai @StepFun_ai Insane. How much tok/s when in deep context?

Mia@MiaAI_lab·1d

DeepSeek-v4-Flash beats Step-3.7-Flash in head-to-head tool calling benchmark. Full results in: github.com/MiaAI-Lab/Deep…

Christian Merrill@M_Chimiste·5h

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai This is how it’s currently configured, with the weights stored on dedicated M.2 drives on the side. I should probably change the configuration, since I believe it’s slower with them stacked like this, but it’s more convenient space-wise.

Christian Merrill@M_Chimiste·5h

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai Or, for smaller models like M2.7, I run two instances.

Christian Merrill@M_Chimiste·6h

@QuixiAI @MiaAI_lab @deepseek_ai @StepFun_ai If I had a better way to run K2.6, I probably would. Even though it’s slower than an Nvidia farm, it’s the best I’ve got, so I kinda need it still 😅

Eric Hartford@QuixiAI·8h

@M_Chimiste @MiaAI_lab @deepseek_ai @StepFun_ai Wanna sell me one? 😁

Christian Merrill@M_Chimiste·6h

@xlr8harder Thank you for using the correct abbreviation.

xlr8harder@xlr8harder·1d

making fable NOFORN is just part of trump's universal basic honeytrap program

Christian Merrill@M_Chimiste·6h

@Sentdex Once they are, it'll obviously be added to my list

Christian Merrill@M_Chimiste·6h

@Sentdex I don't think GLM 5.2 is available as weights, right?

Harrison Kinsley@Sentdex·7h

While closed source AI is in shambles, open source is having one of the best weeks of all time: Z.ai GLM 5.2, MiniMax M3, Kimi 2.7 code

Z.ai@Zai_org

Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere. GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans. docs.z.ai/devpack/latest… As our new flagship model, GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks. API and Chatbot services will launch next week. The model will also be officially open-sourced next week under the MIT License. The future of AI is open, and it belongs to the people.

Christian Merrill@M_Chimiste·7h

@KeyTryer They absolutely could challenge it, but it would also risk undermining their regulatory push. The most similar case I can think of is Bernstein v. United States, on designating cryptography a munition. Though it doesn’t help that they’ve been comparing their [model] to a munition…

Key 🗝 🦊@KeyTryer·17h

There's shockingly little discussion about Anthropic's legal recourse about whether the order was illegal or unconstitutional. I don't know much about this, but my guess is that they'll challenge it as soon as possible, just like they challenged the supply chain risk designation.

Christian Merrill@M_Chimiste·9h

@LLMSherpa Based on that shirt alone I felt compelled to ensure the aura of the shirt matched the aura of the background.

Sherpa@LLMSherpa·9h

I want to start a charity called "Beards for Billionaires", because the beard really does smooth over a lot of these problem areas. This is what I look like when I shave my beard.

Sherpa@LLMSherpa·9h

I feel bad for Dario. Not because of the Mythos drama, but, like... Just fucking look at him. If I asked a caricature artist to draw the biggest, most over-the-top dorky looking guy they could possibly imagine, it wouldn't come close to the reality of Dario. Bless him. 🫶

Christian Merrill@M_Chimiste·9h

@ivanfioravanti As I think about it, maybe, if miniaturized, it might work well for local models on cell phones, since they deprecate pretty fast in comparison to, say, a server.

Christian Merrill@M_Chimiste·9h

@ivanfioravanti This is really cool, but I have no idea how well this will scale.

Ivan Fioravanti ᯅ@ivanfioravanti·13h

Amazing experiment by Fabio! 🔥

Fabio Guzman@FGuzmanAI

56,000+ tokens/sec at just 80 MHz. 🤯 I burned a full Transformer with KV cache into a custom chip. Designed gate by gate as a 100% digital integrated circuit. Prototyped on a FPGA. (No GPU. No CPU) Just pure digital silicon running @karpathy microGPT, spelling out names on a tiny LCD. This is GateGPT 👇

Christian Merrill@M_Chimiste·9h

@TheAhmadOsman I bought 48 TB of storage yesterday….

Ahmad@TheAhmadOsman·18h

Before AGI arrives: Acquire GPUs. Go into debt if you must. But whatever you do, secure the GPUs.

Ahmad@TheAhmadOsman

My house has 33 GPUs.
> 21x RTX 3090s
> 4x RTX 4090s
> 4x RTX 5090s
> 4x Tenstorrent Blackhole p150a
Before AGI arrives: Acquire GPUs. Go into debt if you must. But whatever you do, secure the GPUs.

Christian Merrill@M_Chimiste·9h

@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai I somehow had the foresight to buy two 512 GB Mac Studios, not realizing the world was about to enter a crazy hardware phase.

Mia@MiaAI_lab·9h

@M_Chimiste @QuixiAI @deepseek_ai @StepFun_ai I wish I had the compute to run MiniMax M3. For now, DeepSeek-v4-Flash is unbeatable for 2x DGX Spark setup.

Christian Merrill@M_Chimiste·9h

@QuixiAI @MiaAI_lab @deepseek_ai @StepFun_ai I had a lot of tool call issues with Step 3.7. I think I was using Q8 at the time in Hermes Agent. I ended up reverting to MiniMax M2.7 and am working on moving to M3 for the multimodal input.

Eric Hartford@QuixiAI·22h

@MiaAI_lab @deepseek_ai Deepseek v4 Flash is text-only, 284B.
@StepFun_ai Step 3.7 Flash is a Text + Vision model, 198B.
The vision and the smaller size are more appealing. I choose Step 3.7 Flash.

Christian Merrill@M_Chimiste·18h

@Sentdex @UnslothAI @MiniMax_AI My friend, I have tokens and time and I want to play with M3 as well. 🤣 I’m working on a bunch of things wrt harnesses and coding and science and this just seems to align.

Harrison Kinsley@Sentdex·19h

@M_Chimiste @UnslothAI @MiniMax_AI I dont know if it'll work ootb with anything but minimax m3 but shouldnt be too hard to allow for others. M3 has some special tokens stuff goin on that im deliberately handling for. Eventually ill make it function with others too maybe. Rly just wanted to play with m3 lol

Harrison Kinsley@Sentdex·20h

After spending too many hours trying to implement fixes for MiniMax M3 native tool calls serving via llama.cpp to work in existing agents, I simply had M3 write its own mini coding agent I'm calling: Minion Now my minion just edits itself to give me what I want as a coding agent and it works surprisingly well. Lots of changes I plan to make, feel free to use it if you like...but mostly it has me questioning if we all should just make our own agents at this point. Maybe MiniMax is exceptionally good at tool calls out the gate to make this super simple, but I am enjoying making my minion exactly what I want and nothing more! and it doesn't take 50K context to say "hi" (yet) We'll see how long I can keep my own bloat at bay. Also MiniMax M3 overall so far has me very impressed. This is a VERY cool model!

Christian Merrill reposted

0xSero@0xSero·1d

If you have the NVMe, go download as many models as you think you might ever want. Now, go on Huggingface. They’re coming for open models next.

Christian Merrill@M_Chimiste·19h

@Sentdex @UnslothAI @MiniMax_AI I’m going to fork and also add code2lora. Will report back.

Harrison Kinsley@Sentdex·20h

Github: github.com/Sentdex/minion
MiniMax M3 GGUF models (im using MiniMax-M3-UD-Q4_K_XL): huggingface.co/unsloth/MiniMa…
Thanks to @UnslothAI for making these avail so fast and of course to @MiniMax_AI for this epic model and making it open weights!
