Cossale — oss/acc

320 posts

@XCossale

Working with LLMs and Diffusion models. prev - @revancedapp. Available for contracts. @KeplerSystems

India · Joined April 2018

537 Following · 123 Followers

Pinned Tweet

Cossale — oss/acc@XCossale·9 Feb

I cooked. Go check it out!

Kepler Systems@KeplerSystems

1/5 Launching Kepler Systems—an AI research initiative where I build open models and share datasets. Starting with datasets I’ve curated to preserve Indo-Pak Urdu poetry’s richness. No corporate labs here—just a solo researcher and a lot of love for language and art.

Cossale — oss/acc@XCossale·4 Apr

Pi agent core by @badlogicgames makes it so much easier to integrate LLMs into apps without having to create custom harnesses

Cossale — oss/acc@XCossale·3 Apr

@mweinbach Opus helps me create patches for @revancedapp. Frontier LLMs are really good at reversing now. Need to test with Kimi now

Max Weinbach@mweinbach·2 Apr

What do we think? Will this be successful?

Cossale — oss/acc@XCossale·10 Mar

@norpadon It's also very very good at audio understanding. Competes fairly with Gemini 2.5 Pro (beating flash!). Unfortunate that it's limited to only 30 seconds.

Artur Chakhvadze@norpadon·10 Mar

Gemma3n is still by far the most galaxy-brained edge architecture, and it’s not even close. It’s a shame that almost nobody gives a damn about it because they never released a paper and it’s a pain in the ass to implement properly.

Cossale — oss/acc@XCossale·10 Mar

@TheZachMueller llama 3.2 3b (now resurrected by granite models)

Zach Mueller@TheZachMueller·10 Mar

Every AI researcher has that one model they love deeply, even if it’s outdated

Cossale — oss/acc@XCossale·24 Feb

@maximelabonne At least you got an excuse to not compare with latest models haha

Maxime Labonne@maximelabonne·24 Feb

Releasing a 24B-A2B model on the same day as Qwen3.5-35B-A3B is NOT great timing 🥲

Qwen@Alibaba_Qwen

🚀 Introducing the Qwen 3.5 Medium Model Series
Qwen3.5-Flash · Qwen3.5-35B-A3B · Qwen3.5-122B-A10B · Qwen3.5-27B
✨ More intelligence, less compute.
• Qwen3.5-35B-A3B now surpasses Qwen3-235B-A22B-2507 and Qwen3-VL-235B-A22B — a reminder that better architecture, data quality, and RL can move intelligence forward, not just bigger parameter counts.
• Qwen3.5-122B-A10B and 27B continue narrowing the gap between medium-sized and frontier models — especially in more complex agent scenarios.
• Qwen3.5-Flash is the hosted production version aligned with 35B-A3B, featuring:
– 1M context length by default
– Official built-in tools
🔗 Hugging Face: huggingface.co/collections/Qw…
🔗 ModelScope: modelscope.cn/collections/Qw…
🔗 Qwen3.5-Flash API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Try in Qwen Chat 👇
Flash: chat.qwen.ai/?models=qwen3.…
27B: chat.qwen.ai/?models=qwen3.…
35B-A3B: chat.qwen.ai/?models=qwen3.…
122B-A10B: chat.qwen.ai/?models=qwen3.…
Would love to hear what you build with it.

Cossale — oss/acc@XCossale·24 Feb

@vega_holdings @max_paperclips Google literally disabled my 10 year old google account because nano banana incorrectly flagged an image. There is no recovery process btw

ｖｅｇａ@vega_holdings·24 Feb

@max_paperclips this fucking terrifies me for having a google account that's decades old that if i tell gemini to stop being a retard on day they will delete my account

Shannon Sands@max_paperclips·24 Feb

all these services hate people using them with agents. it's kinda amazing to watch happen. guess Google wants to offer their own competing agent soon

Luke The Dev@iamlukethedev

Was running OpenClaw connected to my Gmail. Out of nowhere, Google disabled the account. Trying to understand what triggered it. Was this a configuration issue on my side or something others have seen when automating Gmail with agents?

Cossale — oss/acc@XCossale·23 Feb

@Teknium @teortaxesTex Because there is no serious alternative at the moment. Gemini is basically your only **reliable** option for multi-modality and it gets even harder for non-English tasks.

Teknium 🪽@Teknium·23 Feb

@teortaxesTex What does the multimodality get them long term - products mostly or do you see an RSI angle to multimodality focus

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·22 Feb

Contra Lisan I don't think this is benchmaxing, btw. Indeed I suspect that Gemini's performance on procedural generation of graphics (it goes beyond SVG) is a product of its deep advantages in multimodality, and it's GDM's long-term bet. Like OAI's reasoning, and Ant's coding.

Design Arena@Designarena

BREAKING: Gemini 3.1 Pro Preview has landed at #1 on SVG Arena by Design Arena with an ELO of 1421. This 87-point lead is the largest winning margin that we've seen a model have on SVG Arena since the arena launch. Huge congratulations to the @GoogleDeepMind team!

Cossale — oss/acc@XCossale·19 Feb

@AlpinDale @yacineMTB From where?

Alpin@AlpinDale·19 Feb

@yacineMTB IDK I rent 5090s for about 20 cents an hour.

kache@yacineMTB·19 Feb

bruh fr?

stricture@bog_beef

5090s are now 4 grand, double the msrp. This is my final warning to the AI companies. Submit now and gamers may be merciful

Cossale — oss/acc@XCossale·19 Feb

@anametolast @dhtikna Qwen 3 235B hosted by W & B is probably a better deal than both of those at $0.10 for both input and output. I've been using it a lot recently for synth data. Around the same performance as gpt oss 120b but half the price and bf16. Also hi ktibow, long time no see after revanced!

KTibow@anametolast·18 Feb

@dhtikna this is untrue.

Ankith 🐋/acc@dhtikna·18 Feb

Complete MOGGING by Deepseek, 673B whale costs the same as 120B gpt-oss, no competition by a mile in architecture efficiency. Remember they make $0.80 in profit per $1.00

Cossale — oss/acc@XCossale·18 Feb

@Dorialexander @AmpCode Opencode Zen with free models

Alexander Doria@Dorialexander·18 Feb

So I went with @AmpCode and, unfortunately… (and no, most of my students did not create accounts in time). What is the best recommended free(mium) alternative now?

Alexander Doria@Dorialexander

What is the recommended free/freemium alternative to claude code? GLM on OpenCode? Codex works with free gpt? (for my students, so that they can start without subscription).

Cossale — oss/acc@XCossale·12 Feb

@YouJiacheng @Zai_org I think the model is supposed to move the file to another folder so it can be downloaded, but it doesn't do it half the time. Tell the model that you can't see the file and it should move it correctly. Had the same issue with the Claude site a few months ago.

You Jiacheng@YouJiacheng·12 Feb

holyshit, the agent powered by GLM-5 generate a PDF for me but there is no way to download it??? who is the PM??? @Zai_org chat.z.ai/s/5ca0fef9-fb4…

You Jiacheng@YouJiacheng

who designed this UI/UX??? WHY zai use more space to show a meaningless animation than the terminal??? AND I can't adjust the height of terminal??? AND I can't scroll up the terminal when the agent is running??? (it will be immediately scrolled down) @Zai_org @jietang

Cossale — oss/acc@XCossale·11 Feb

@Dorialexander what's the most broken part of the harness side right now?

Alexander Doria@Dorialexander·11 Feb

Data annotation is probably one of the few areas where the harness could be the product right now. Models have the capacity, even the meta-skill to orchestrate subagents for full pipelines, just badly controlled and implemented.

Cossale — oss/acc@XCossale·11 Feb

@dhtikna @eliebakouch It's crazy fast even on cpu

Ankith 🐋/acc@dhtikna·11 Feb

@XCossale @eliebakouch What do you like about oss arch?

elie@eliebakouch·11 Feb

i think we don't realize the impact that deepseek had on the open ecosystem, there is so much from them that you can find in almost every frontier open llm today
> most of the open frontier models follow the "finegrain + sparse + shared expert" deepseek moe recipe
> a lot of them use MLA
> first (with minicpm) to use sparse attention in prod (DSA)
> first to do reasoning in the open with R1
> GRPO which is the foundation for most of the newer RL algorithms
> they also innovated on the training recipe at scale, first to do fp8? MTP? load balancing schemes that other labs are now using
> advanced training/inference infra with oss releases like DeepEP that pretraining libs like megatron use
i'm so grateful deepseek exists

Cossale — oss/acc@XCossale·22 Jan

@0xSero @BinxNet @fatihozgen85 You can use your ai-data-extraction repo and then filter by Edit tool to get FIM dataset. Here is my test on `Qwen3-0.6B` with just ~4k rows and 5 minutes of training.
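
A minimal sketch of the "filter by Edit tool" idea, under stated assumptions: the record fields (`tool`, `file_after`, `new_string`) are hypothetical and not the actual output format of the ai-data-extraction repo, and the FIM marker strings are assumed to match the tokenizer being trained, so check them before use. Each edit becomes one row where the inserted text is the "middle" to predict.

```python
# Assumed FIM marker strings; verify them against your tokenizer before training.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def edit_to_fim(file_after: str, new_string: str):
    """Turn one applied edit into a prefix/suffix/middle training row.

    Returns None when the inserted text is empty or cannot be located.
    """
    if not new_string:
        return None
    start = file_after.find(new_string)
    if start < 0:
        return None
    prefix = file_after[:start]
    suffix = file_after[start + len(new_string):]
    return {
        "prompt": f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}",
        "completion": new_string,
    }


def build_fim_rows(records):
    """Keep only Edit tool calls and convert each one to a FIM row."""
    rows = []
    for rec in records:
        if rec.get("tool") != "Edit":
            continue
        row = edit_to_fim(rec["file_after"], rec["new_string"])
        if row is not None:
            rows.append(row)
    return rows


# Hypothetical session records; the field names are illustrative only.
records = [
    {"tool": "Read", "path": "calc.py"},
    {
        "tool": "Edit",
        "file_after": "def add(a, b):\n    return a + b\n",
        "new_string": "return a + b",
    },
]

rows = build_fim_rows(records)
print(rows[0]["prompt"])
```

Locating the edit with `str.find` takes the first match, which is a simplification; a real pipeline would likely want the exact edit offset if the session log records it.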

0xSero@0xSero·22 Jan

@BinxNet @fatihozgen85 I thought they trained them to do FIM

0xSero@0xSero·21 Jan

I just had to make a new video of GLM-4.7-Flash
- Helping me refactor VLLM studio
- Did a data analytics report for work
- Managed to search my tweets
- Made me a fully playable Pacman in 1 shot
- Great at browser use
This model is too good to be this small, the full thing will fit on a macbook, it's fast, precise and can do pretty much anything Sonnet can. It somehow benches higher than Sonnet 4, 3.7, Opus 4.5 no thinking, all the GPT models from last year and more. youtu.be/_SDyaPYmIxU

Cossale — oss/acc@XCossale·20 Jan

@kalomaze @celestepoasts My main issue with GLM models is that the instruction following is kind of bad. I get to test all models at work, but I would prefer to use Minimax 2.1 due to better instruction following and consistency, despite it being less smart than GLM models. Same with Gemini models.

kalomaze@kalomaze·19 Jan

@celestepoasts glm models are usually quaint, if not exactly super polished on the post training side, i never really get "its super burnt" feelings about them

kalomaze@kalomaze·19 Jan

this is so weird to me because sonnet3.6 was ~50 on swebench before they really started going all in on agents. there really is a lot of low hanging fruit left, even for the mid range ~30b total params models with more post training iteration

elie@eliebakouch

GLM team is now using MLA!! this is a pretty insane model with 30B total params and about 4B active. very nice release in terms of structure: it's approximately the same depth as glm4.5 air and qwen3 30B A3B, 64 total experts instead of 128, but they only activate 5 instead of 9 if you count the shared expert

Cossale — oss/acc@XCossale·11 Jan

@0xSero Extracted all my sessions. Time to fine-tune an auto-complete model

0xSero@0xSero·11 Jan

Here's the dumbest engineering idea to get GLM-4.7 at 32GB VRAM. Make skills out of all your most successful AI chats. Store the sessions under that skill, link every chat that uses that skill. You can also do this to refine and personalize small models. If you're using this tech enough you will easily rack up 100s of billions of tokens (training data.)
- Take a big open model, any one of them is fine.
- Take all your skills you've accumulated along with all the session chats.
- Organize very very well, or ask GPT/Claude to lol.
- Now you have a calibration dataset.
- Rent an 8x H100/H200 pod for $200
- Prune your desired model around your calibration set.
- It'll be dumb as a brick in most places, but you retain full fidelity of activations.
- Use the big models to build a control board
- Click button to get your macbook to do your workflows.
No reason this shouldn't work.

Cossale — oss/acc@XCossale·11 Jan

@eliebakouch @huggingface I found llama 4 maverick to be very competitive with gemma 27b in translations (I mainly work with Indic languages).

elie@eliebakouch·10 Jan

Most web data in (very) low resource languages is Bible and Wikipedia. The rest? @huggingface data team ran Gemma3 27B for 3 months to translate it into english, to improve translation models and to bring cultural context from 500+ language communities into english training data. Here is the full pipeline huggingface.co/datasets/Huggi…

Guilherme Penedo@gui_penedo

We are releasing a large-scale synthetic dataset: 💬FineTranslations. We took 🥂 FineWeb2, our multilingual pre-training dataset, and translated it into English using Gemma3 27B. The result is a massive parallel corpus, with more than 1 trillion tokens!

Cossale — oss/acc@XCossale·7 Jan

@MaziyarPanahi OpenMed models are amazing! Created a frontend for NER.

Cossale — oss/acc@XCossale·16 Dec

What if spec-first API development is better for LLM workflows than code-first? LLMs degrade with large codebases. Spec-first keeps context small at every step:

User requirements → [LLM] → OpenAPI spec (300 lines)
Spec → [oapi-codegen] → Type-safe interfaces
Interface + logic → [LLM] → Handlers (50 lines each)

The spec compresses requirements into something reviewable. Tools like oapi-codegen handle boilerplate. Each handler gets focused context.

LLMs removed what made spec-first tedious (implementing boilerplate), revealing what was always valuable: bounded context, mechanical consistency, independent regeneration. The spec becomes the compression layer that keeps the whole process tractable. Could generalize beyond APIs.
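
The "each handler gets focused context" step can be sketched roughly as below. This is an illustration under assumptions, not the author's tooling: the spec, its paths, and the `handler_contexts` helper are all made up for the example, and the sketch does not call oapi-codegen or any LLM. It only shows how a spec can be sliced so that the prompt for one handler contains one operation and nothing else.

```python
import json

# Hypothetical, tiny OpenAPI-style spec; paths and operationIds are made up.
SPEC = {
    "openapi": "3.0.3",
    "paths": {
        "/poems": {
            "get": {"operationId": "listPoems", "summary": "List poems"},
            "post": {"operationId": "createPoem", "summary": "Create a poem"},
        },
        "/poems/{id}": {
            "get": {"operationId": "getPoem", "summary": "Fetch one poem"},
        },
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def handler_contexts(spec):
    """Return {operationId: prompt context} with exactly one operation each."""
    contexts = {}
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            fragment = {"path": path, "method": method.upper(), **op}
            contexts[op["operationId"]] = json.dumps(fragment, indent=2, sort_keys=True)
    return contexts


contexts = handler_contexts(SPEC)
for name, ctx in contexts.items():
    print(name, len(ctx.splitlines()), "lines of context")
```

Because each context is derived mechanically from the spec, a single handler can be regenerated without showing the model any other handler, which is the "independent regeneration" property the post points to.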
