Say

491 posts

@SayqinR

Just exploring

Joined April 2022
250 Following · 14 Followers
Say retweeted
Dee @dee_hw
On-Premise Business AI Center

After my posts on the 2-GPU and 4-GPU builds, people reached out asking how to build an 8-GPU box for their businesses. Why?
- Protect their IP
- Protect customer data
- Save on inference costs
- Train their own models
Here's how to build one: 🧵
Say retweeted
OpenAI @OpenAI
Today we’re launching the OpenAI Deployment Company to help businesses build and deploy AI. It's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production for business impact. openai.com/index/openai-l…
Say retweeted
青龍聖者 @bdsqlsz
New open-source image model: HiDream-O1-Image (8B), including both the undistilled and distilled Dev variants, together with the Reasoning-Driven Prompt Agent. As I said, removing the VAE is a trend. Two versions: the Dev variant uses 28 inference steps, and the standard version uses 50.
Say retweeted
ModelScope @ModelScope2022
Tencent HY just released Hy3 preview 👉 open source. 295B total, 21B active, 256K context. Hybrid fast-slow thinking MoE.
🚀 First model after a full rebuild of the pretraining and RL infrastructure. Biggest gains in coding and agentic tasks.
🛠️ Agent: drives up to 495-step complex workflows in production (docs, data analysis, MCP tool chains)
⚡ Inference: TTFT -54%, end-to-end latency -47%, success rate 99.99%+ on CodeBuddy/WorkBuddy
🎯 Strong on SWE-bench Verified, Terminal-Bench 2.0, BrowseComp, WideSearch; competitive across coding and search-agent benchmarks
✅ OpenClaw / OpenCode / KiloCode compatible. vLLM + SGLang supported.
🤖 modelscope.cn/models/Tencent…
💻 github.com/Tencent-Hunyua…
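The sparsity numbers quoted above can be sanity-checked with simple arithmetic. The 295B/21B figures come from the post; everything else below is plain back-of-the-envelope math, not a claim about the model's actual implementation:

```python
# Rough MoE sparsity math for the figures quoted in the post above.
total_params = 295e9   # total parameters (295B, from the post)
active_params = 21e9   # parameters active per token (21B, from the post)

# Fraction of the network that fires on each forward pass.
active_ratio = active_params / total_params
print(f"active fraction: {active_ratio:.1%}")

# Per-token compute scales roughly with active params, so relative to a
# dense model of the same total size, FLOPs per token drop by about:
flops_saving = total_params / active_params
print(f"dense-equivalent FLOPs saving: ~{flops_saving:.0f}x")
```

This is why a 295B-total MoE can be served at roughly the per-token cost of a ~21B dense model, memory footprint aside.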
Say retweeted
ERNIE for Developers @ErnieforDevs
ERNIE 5.1 is here 🚀
ERNIE 5.1 significantly reduces pretraining cost while compressing total parameters to ~1/3 and activated parameters to ~1/2, using only ~6% of the pretraining cost of models at a similar scale, while achieving leading performance in its class.
💡 Key highlights:
1/ Strong agentic performance approaching leading frontier models. ERNIE 5.1 surpasses DeepSeek-V4-Pro on both τ3-bench and SpreadsheetBench-Verified.
2/ Strong world knowledge and creative writing capabilities, with GPQA and MMLU-Pro performance approaching leading closed-source models, and creative writing ability nearing Gemini 3.1 Pro.
3/ Frontier-level reasoning performance. ERNIE 5.1 scores 99.6 on the challenging AIME26 benchmark with tools, second only to Gemini 3.1 Pro.
4/ Deep-search capability. On May 9, ERNIE 5.1 ranked #4 globally and #1 among Chinese models on the Arena Search leaderboard with a score of 1223.
ERNIE 5.1 is now available on ERNIE and the Baidu AI Studio Model Playground:
👉 ernie.baidu.com
👉 aistudio.baidu.com
👉 ernie.baidu.com/blog
Say retweeted
Zyphra @ZyphraAI
Today we're releasing ZAYA1-8B, a reasoning MoE trained on @AMD and optimized for intelligence density. With <1B active params, it outperforms open-weight models many times its size on math and reasoning, closing in on DeepSeek-V3.2 and GPT-5-High with test-time compute. 🧵
Say retweeted
Sebastian Raschka @rasbt
Here is a 2nd batch of April architecture drops. What a month!
- Ant Ling 2.6 1T
- Minimax M2.7
- Xiaomi MiMo V2.5
- Poolside Laguna XS.2
- Tencent Hy3-preview
- IBM Granite 4.1
Say retweeted
Mistral Vibe @mistralvibe
Mistral Medium 3.5 is a new flagship model in public preview from @MistralAI that merges instruction following, reasoning, and coding into a single 128B dense model with a 256k context window and configurable reasoning effort. It's the new default model for Mistral Vibe and Le Chat. Released as open weights, under a modified MIT license.
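A sketch of what calling a model with configurable reasoning effort might look like through a chat-completions-style endpoint. The model identifier `mistral-medium-3.5` and the `reasoning_effort` field are assumptions inferred from the post above, not confirmed API names; only the request payload is constructed here, and nothing is sent over the network:

```python
import json

# Hypothetical request body for a chat-completions-style endpoint.
# "mistral-medium-3.5" and "reasoning_effort" are guesses based on the
# announcement above, not confirmed API identifiers.
payload = {
    "model": "mistral-medium-3.5",
    "reasoning_effort": "medium",  # announced as configurable, e.g. low/medium/high
    "max_tokens": 1024,
    "messages": [
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Refactor this loop into a list comprehension."},
    ],
}

print(json.dumps(payload, indent=2))
```

Exposing effort as a request parameter (rather than a separate model variant) lets the same deployment serve cheap low-effort calls and expensive deliberate ones.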
Say retweeted
Nous Research @NousResearch
The most powerful real-time visual tool in creative coding also has the steepest learning curve. Now your Hermes agent can just run TouchDesigner for you.
Video credit: made by @macbethAI, a talented AI artist and avid Hermes user, with the TouchDesigner skill.
Say retweeted
Mushtaq Bilal, PhD @MushtaqBilalPhD
> be Alexandra Elbakyan
> be born in Kazakhstan in 1988
> start coding at 12
> hack your internet provider at 14
> hack MIT Press at 16 to download neuroscience books you can't afford
> get a CS degree from Satbayev University
> intern in neuroscience at Georgia Tech
> speak at Harvard on brain-computer interfaces
> notice researchers can't read the papers they need
> notice academic publishers charging $30 a paper
> notice peer reviewers worked for free
> notice editors worked for free
> notice universities funded the research with billions of dollars of public money
> build Sci-Hub in 2011
> upload nearly every paywalled research paper ever published
> give it away for free
> get sued by Elsevier
> get hit with a $15 million judgment
> don't give a flying f*ck
> keep Sci-Hub up
> get domain after domain seized
> register a new one
> keep Sci-Hub up
> get investigated by the US Department of Justice
> don't give a flying f*ck
> get accused of working for Russian intelligence
> don't give a flying f*ck
> have the FBI subpoena your iCloud
> get named one of Nature's ten people who mattered in science
> get a parasitoid wasp named after you
> get a deep-sea snail named after you
> get the Electronic Frontier Foundation Award for Access to Scientific Knowledge
> become a legend
Say retweeted
left curve dev @leftcurvedev_
Holy shit, 2.9% precision lost on UD-IQ3_XXS, the quant I'm using on all my benchmarks! This is insanely good, lads. It makes the quant suitable for daily use and explains all the strong results I shared here over the last couple of days. 16GB VRAM BROS, WE ARE WINNING TODAY! 🥹
Benjamin Marie @bnjmn_marie

Qwen3.6 GGUF Evaluations

For the 27B: Q2_K_XL is surprisingly recommendable. IQ3_XXS performs very similarly, uses only +0.2 GB, and generates significantly fewer tokens. If you are memory-tight, pick this one. Otherwise, if you can spare +2.5 GB, use Q3_K_XL: (almost) the same accuracy and token efficiency as the original. All the results, also for the 35B, here: kaitchup.substack.com/p/summary-of-q… More results are coming, probably Monday, covering other GGUF providers and some abliterated models.

Say retweeted
Pamphlets @PamphletsY
🚨🇨🇳 BREAKING: DeepSeek V4 Drops NVIDIA
Huawei Ascend Chips Cut AI Costs 100x
Open Source Alternative Scores 3206 Rating, Near Global Frontier