Christian Merrill

1.5K posts

@M_Chimiste

United States · Joined March 2024
98 Following · 70 Followers
Christian Merrill
Christian Merrill@M_Chimiste·
@kalomaze But it’s for safety. Imagine how unsafe it would be to allow you to map text to tokens, we couldn’t have that.
0
0
0
14
kalomaze
kalomaze@kalomaze·
hey isn't it kind of messed up that API customers of Opus 4.7 and Opus 4.8 are paying ~1.41x as much for general english output (when measured against a consistent tokenizer baseline) vs Opus 4.6? from THE ONLY lab that's cagey about releasing the tokenizer... huh. that's... odd
kalomaze tweet media
kalomaze@kalomaze

i am trying to work on the closest thing possible to a true "big model smell" eval, which is to say: something that measures something that clever post training can't trivially gap, and is cheap + topically diverse. i can't test mythos for obvious reasons, but... hmm...

12
5
216
19.8K
Christian Merrill
Christian Merrill@M_Chimiste·
@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai I’m not sure I’ve benchmarked it well enough to tell you, but the average latency to a Hermes agent is about 35 sec per my dashboard. (I keep a proxy sidecar that counts all tokens and aggregates on my Mac mini.) I should configure it to track more, though.
0
0
0
42
Mia
Mia@MiaAI_lab·
DeepSeek-v4-Flash beats Step-3.7-Flash in head-to-head tool calling benchmark. Full results in: github.com/MiaAI-Lab/Deep…
Mia tweet media
10
0
35
2.8K
Christian Merrill
Christian Merrill@M_Chimiste·
@MiaAI_lab @QuixiAI @deepseek_ai @StepFun_ai This is how it’s currently configured, with the weights stored on dedicated M.2 drives on the side. I should probably change the configuration, since I believe it’s slower with them stacked like this, but it’s more convenient space-wise.
Christian Merrill tweet media
0
0
2
179
xlr8harder
xlr8harder@xlr8harder·
making fable NOFORN is just part of trump's universal basic honeytrap program
2
1
16
2.9K
Harrison Kinsley
Harrison Kinsley@Sentdex·
While closed source AI is in shambles, open source is having one of the best weeks of all time: Z ai GLM 5.2, Minimax M3, Kimi 2.7 code.
Z.ai@Zai_org

Intelligence should be open, accessible, and ready to build with, empowering every developer, everywhere. GLM-5.2 is now available to all GLM Coding Plan users, including Lite, Pro, Max, and Team plans. docs.z.ai/devpack/latest… As our new flagship model, GLM-5.2 delivers powerful coding capabilities, usable 1M-context support, and continued strengths in long-horizon tasks. API and Chatbot services will launch next week. The model will also be officially open-sourced next week under the MIT License. The future of AI is open, and it belongs to the people.

42
70
1.2K
70K
Christian Merrill
Christian Merrill@M_Chimiste·
@KeyTryer They absolutely could challenge it, but it would also risk undermining their regulatory push. The most similar case I can think of is Bernstein v. United States, on designating cryptography a munition. Though it doesn’t help that they’ve been comparing theirs to a munition…
0
0
1
75
Key 🗝 🦊
Key 🗝 🦊@KeyTryer·
There's shockingly little discussion about Anthropic's legal recourse about whether the order was illegal or unconstitutional. I don't know much about this, but my guess is that they'll challenge it as soon as possible, just like they challenged the supply chain risk designation.
13
2
85
5.4K
Christian Merrill
Christian Merrill@M_Chimiste·
@LLMSherpa Based on that shirt alone I felt compelled to ensure the aura of the shirt matched the aura of the background.
Christian Merrill tweet media
1
0
1
10
Sherpa
Sherpa@LLMSherpa·
I want to start a charity called "Beards for Billionaires", because the beard really does smooth over a lot of these problem areas. This is what I look like when I shave my beard.
Sherpa tweet media
2
0
3
185
Sherpa
Sherpa@LLMSherpa·
I feel bad for Dario. Not because of the Mythos drama, but, like... Just fucking look at him. If I asked a caricature artist to draw the biggest, most over-the-top dorky looking guy they could possibly imagine, it wouldn't come close to the reality of Dario. Bless him. 🫶
1
0
5
237
Christian Merrill
Christian Merrill@M_Chimiste·
@ivanfioravanti As I think about it, maybe if miniaturized it might work well for local models on cell phones, since they depreciate pretty fast in comparison to, say, a server.
0
0
1
16
Christian Merrill
Christian Merrill@M_Chimiste·
@QuixiAI @MiaAI_lab @deepseek_ai @StepFun_ai I had a lot of tool call issues with Step 3.7. I think I was using Q8 at the time in Hermes Agent. I ended up reverting to Minimax M2.7 and working on moving to M3 for the multimodal input.
1
0
1
37
Eric Hartford
Eric Hartford@QuixiAI·
@MiaAI_lab @deepseek_ai Deepseek v4 Flash is text-only, 284B. @StepFun_ai Step 3.7 Flash is a Text + Vision model, 198B. The vision and the smaller size are more appealing. I choose Step 3.7 Flash.
2
2
10
1.1K
Christian Merrill
Christian Merrill@M_Chimiste·
@Sentdex @UnslothAI @MiniMax_AI My friend, I have tokens and time and I want to play with M3 as well. 🤣 I’m working on a bunch of things wrt harnesses and coding and science and this just seems to align.
1
0
1
18
Harrison Kinsley
Harrison Kinsley@Sentdex·
@M_Chimiste @UnslothAI @MiniMax_AI I don't know if it'll work ootb with anything but minimax m3 but shouldn't be too hard to allow for others. M3 has some special tokens stuff goin on that I'm deliberately handling for. Eventually I'll make it function with others too maybe. Rly just wanted to play with m3 lol
1
0
1
37
Harrison Kinsley
Harrison Kinsley@Sentdex·
After spending too many hours trying to implement fixes for MiniMax M3 native tool calls serving via llama.cpp to work in existing agents, I simply had M3 write its own mini coding agent I'm calling: Minion Now my minion just edits itself to give me what I want as a coding agent and it works surprisingly well. Lots of changes I plan to make, feel free to use it if you like...but mostly it has me questioning if we all should just make our own agents at this point. Maybe MiniMax is exceptionally good at tool calls out the gate to make this super simple, but I am enjoying making my minion exactly what I want and nothing more! and it doesn't take 50K context to say "hi" (yet) We'll see how long I can keep my own bloat at bay. Also MiniMax M3 overall so far has me very impressed. This is a VERY cool model!
Harrison Kinsley tweet media
10
7
116
7.5K
Christian Merrill retweeted
0xSero
0xSero@0xSero·
If you have the NVMe, go download as many models as you think you might ever want. Now, go on Hugging Face. They’re coming for open models next.
0xSero tweet media
123
85
1.3K
79.2K