tensor
6.5K posts

Pinned Tweet

Excited to announce I'm joining @LumaLabsAI!
Will be joining the pretraining team

@SwayStar123 Oh, makes sense.
Thanks. Adversarial methods are usually said to be expensive, so I was wondering.

@tensor_kelechi I don't know how much compute was used for this model's pretraining, but I assume on the order of 1%. You can probably use a LoRA for this too.

Qwen @Alibaba_Qwen
🦥 Qwen3-Coder-Flash: Qwen3-Coder-30B-A3B-Instruct
💚 Just lightning-fast, accurate code generation.
✅ Native 256K context (supports up to 1M tokens with YaRN)
✅ Optimized for platforms like Qwen Code, Cline, Roo Code, Kilo Code, etc.
✅ Seamless function calling & agent workflows
💬 Chat: chat.qwen.ai
🤗 Hugging Face: hf.co/Qwen/Qwen3-Cod…
🤖 ModelScope: modelscope.cn/models/Qwen/Qw…
🔧 Qwen Code: github.com/QwenLM/qwen-co…

*cracked interns at Apple
Vaibhav (VB) Srivastav @reach_vb
Apple dropping diffusion based Coding LLMs on Hugging Face was not on my bingo

@test_tm7873 @kalomaze last year, in their foundation model paper, they reported using TPU-v4s (arxiv.org/abs/2407.21075…)
but for this project (Diffucoder) they used H100s



@tensor_kelechi I’m almost a year late lol, but it’s from this video youtu.be/N2bXEUSAiTI?si…


@tensor_kelechi Oh cool, how's your ML journey? I'm slowing down for now, doing more backend stuff.

The last presentable JAX thing I did was making a checkpointing library for Flax (NNX API) models.
It still needs more testing and improvement. I created it because I wanted to make something simpler than the rest.
github.com/kelechi-c/nnx_…
tensor retweeted

excited to share that I have joined @audiogenai (as a research intern) to build the next generation of audio/music generation models 🚀.
It's been extremely cool working with the team so far 🫡









