erogol (@erogol)

8.6K posts

Doing ML | Web - https://t.co/yxKAwSSkgR | Substack - https://t.co/W9Qg4M3AZg

Joined October 2008
581 Following · 1.5K Followers

Pinned Tweet
erogol (@erogol):
XTTS is still being downloaded almost 5M times every month, 2.1M on HF alone. That's more than many recent hyped models. I hope people use it well, enough to make the burnout I'm still recovering from worth it. Coqui has been one of the most successful broke startups.
[image]
erogol (@erogol):
@jeremyphoward I found Gemini tamer and more like an assistant. However, it agrees with everything.
Jeremy Howard (@jeremyphoward):
Opus & Sonnet 4.6 haven't been a great hit for most of my work, or our customers, since (as warned in their tech report) they're over-enthusiastic about agentically taking over rather than letting the human lead. Any suggestions for competent models that are patient followers?
erogol (@erogol):
I was vibing on a few LLM projects with zero visibility into what was actually happening: silent retry bugs burning tokens, wrong API keys, traffic to dead endpoints. I couldn't debug any of it, so I built this: github.com/erogol/toklog
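The tweet names exactly the failure mode such a tool targets: retries happening silently inside a wrapper, with no record of how many tokens or attempts they burned. toklog's actual API isn't shown here, so the sketch below is not its interface; it is a minimal stdlib illustration of the idea, a decorator that retries a flaky call but logs every attempt (the `flaky_llm_call` function and its error are made up for the demo):

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("llm-calls")

def logged_retries(max_attempts=3, delay=0.0):
    """Decorator: retry a flaky call, but log every attempt instead of failing silently."""
    def wrap(fn):
        def inner(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    result = fn(*args, **kwargs)
                    log.info("call=%s attempt=%d status=ok", fn.__name__, attempt)
                    return result
                except Exception as exc:
                    log.warning("call=%s attempt=%d error=%s", fn.__name__, attempt, exc)
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)
        return inner
    return wrap

calls = {"n": 0}

@logged_retries(max_attempts=3)
def flaky_llm_call(prompt):
    """Stand-in for an LLM API call that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("dead endpoint")
    return "ok: " + prompt

print(flaky_llm_call("hello"))  # succeeds on the third attempt, with two warnings logged
```

With visibility like this, the "silent retry burning tokens" case shows up as a run of warning lines instead of an unexplained bill.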
erogol retweeted
Hugging Models (@HuggingModels):
Meet XTTS-v2: a text-to-speech model that's changing how we create voice content. It generates natural, expressive speech from text, supporting multiple languages and voices. With over 6.7M downloads, it's clearly a community favorite!
[image]
erogol (@erogol):
While working on my personal AI, I shared a screenshot of a workout and it automatically added it to my Garmin as a workout plan. Garmin people will know what that means. That's AGI to me :)
erogol (@erogol):
Machine Learns 66 is out. This issue is heavy on model architecture + training tricks:
• Nemotron v3
• Mamba-3
• Attention Residuals
• LM head as a gradient bottleneck
• Fish Audio S2
• speculative decoding models
Full issue 👇 erogol.substack.com/p/machine-lear…
erogol (@erogol):
Machine Learns #65 🤖📬 Steerling-8B (causal diffusion + interp), LK Losses for speculative decoding, MLRA + Untied Ulysses for KV/memory, SD-MoE on "fake experts", plus DashengTokenizer + FlexiCodec/MSR-Codec + MeanVoiceFlow. erogol.substack.com/p/machine-lear…
erogol retweeted
Sakana AI (@SakanaAILabs):
We're excited to introduce Doc-to-LoRA and Text-to-LoRA, two related research projects exploring how to make LLM customization faster and more accessible. pub.sakana.ai/doc-to-lora/

By training a Hypernetwork to generate LoRA adapters on the fly, these methods allow models to instantly internalize new information or adapt to new tasks.

Biological systems naturally rely on two key cognitive abilities: durable long-term memory to store facts, and rapid adaptation to handle new tasks given limited sensory cues. While modern LLMs are highly capable, they still lack this flexibility. Traditionally, adding long-term memory or adapting an LLM to a specific downstream task requires an expensive and time-consuming model update, such as fine-tuning or context distillation, or relies on memory-intensive long prompts.

To bypass these limitations, our work focuses on the concept of cost amortization. We pay the meta-training cost once to train a hypernetwork capable of producing task- or document-specific LoRAs on demand. This turns what used to be a heavy engineering pipeline into a single, inexpensive forward pass. Instead of performing per-task optimization, the hypernetwork meta-learns update rules to instantly modify an LLM given a new task description or a long document.

In our experiments, Text-to-LoRA successfully specializes models to unseen tasks using just a natural language description. Building on this, Doc-to-LoRA is able to internalize factual documents. On a needle-in-a-haystack task, Doc-to-LoRA achieves near-perfect accuracy on instances five times longer than the base model's context window. It can even generalize to transfer visual information from a vision-language model into a text-only LLM, allowing it to classify images purely through internalized weights. Importantly, both methods run with sub-second latency, enabling rapid experimentation while avoiding the overhead of traditional model updates.

This approach is a step towards lowering the technical barriers of model customization, allowing end-users to specialize foundation models via simple text inputs. We have released our code and papers for the community to explore.

Doc-to-LoRA Paper: arxiv.org/abs/2602.15902 Code: github.com/SakanaAI/Doc-t…
Text-to-LoRA Paper: arxiv.org/abs/2506.06105 Code: github.com/SakanaAI/Text-…

[GIF]
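The core mechanism described above, a hypernetwork that maps a task embedding to LoRA factors in a single forward pass, can be sketched at toy scale. This is not Sakana's architecture: the dimensions, the random untrained "hypernetwork" weights, and the `hyper_lora` helper are all placeholders for illustration, so the produced adapter is shape-correct but not meaningful. The point is only the data flow: task embedding → flattened (A, B) → low-rank update added to a frozen weight, with no per-task optimization loop.

```python
import random

random.seed(0)

d_in, d_out, r, d_task = 4, 4, 2, 3  # toy dims: LoRA rank r, task-embedding size d_task

def matmul(A, B):
    """Plain-Python matrix multiply for the toy sizes used here."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Frozen base weight W (d_out x d_in) of some layer in the LLM.
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

# "Hypernetwork": a linear map from the task embedding to the flattened LoRA
# factors. In Text-to-LoRA this map is meta-trained; here it is random, so only
# the shapes and the single-forward-pass flow are illustrative.
n_lora = r * d_in + d_out * r
H = [[random.gauss(0, 0.1) for _ in range(d_task)] for _ in range(n_lora)]

def hyper_lora(task_emb):
    """One forward pass: task embedding -> (A, B) LoRA factors."""
    flat = [sum(h * t for h, t in zip(row, task_emb)) for row in H]
    A = [flat[i * d_in:(i + 1) * d_in] for i in range(r)]                        # r x d_in
    B = [flat[r * d_in + i * r: r * d_in + (i + 1) * r] for i in range(d_out)]   # d_out x r
    return A, B

task_emb = [1.0, -0.5, 0.2]  # stands in for an encoded task description or document
A, B = hyper_lora(task_emb)
delta = matmul(B, A)          # low-rank update, d_out x d_in
W_adapted = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]
```

Amortization is visible even in the toy: once `H` exists, adapting to a new task is one cheap forward pass through `hyper_lora`, not a fine-tuning run.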
erogol (@erogol):
For AI thinking tokens, Chinese and Japanese could be ~50% more token-efficient than English, since they have higher information density. Same information, fewer tokens, less money and energy. It might even give those companies a big edge in the long term.
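The density claim is easy to eyeball at the character level. The sketch below (sentence pairs are my own rough translations, not from the tweet) compares character and UTF-8 byte counts for parallel English/Chinese sentences. Note the hedge: characters are only a proxy; actual token counts depend entirely on the tokenizer's vocabulary and CJK coverage, and the byte counts show Chinese is not automatically cheaper in bytes, since each character costs 3 bytes in UTF-8.

```python
pairs = [
    # (English, Chinese) rough translations of the same sentence
    ("Artificial intelligence models process text as tokens.",
     "人工智能模型将文本处理为词元。"),
    ("Fewer tokens mean less compute and lower cost.",
     "更少的词元意味着更少的计算和更低的成本。"),
]

for en, zh in pairs:
    print(f"en: {len(en):3d} chars  {len(en.encode('utf-8')):3d} bytes   {en}")
    print(f"zh: {len(zh):3d} chars  {len(zh.encode('utf-8')):3d} bytes   {zh}")
```

A tokenizer that maps roughly one CJK character to one token would turn this character-count gap directly into a token-count gap; one that splits characters into bytes would erase it.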
erogol retweeted
Larry Dial (@classiclarryd):
New NanoGPT Speedrun WR at 89.1 (-0.7s) from @sisovicm, with a technique called partitioned hyperconnections. The learned weights reveal that the final attn modules prefer to ignore the prediction vectors generated by the final MLPs, and instead query representations from slightly earlier layers. github.com/KellerJordan/m…
[image]
erogol (@erogol):
isn't "developing an AI agent framework" an oxymoron?
erogol (@erogol):
No bad intentions, but Gemini 3.1 Pro is quite dumb as an agent. It confuses things in the context a lot.
erogol (@erogol):
Here is a great tip: keep your openclaw memories as emails. Works like magic 🪄
erogol (@erogol):
@michalwols nice! I'm also vibe coding something atm. Please ping me when it's out. I don't use WB, but I'm happy to check it.
Michal Wolski (@michalwols):
@erogol I hacked together a multimodal logging tool that writes to ducklake or lance/parquet files and supports custom monitors for things like outlier or drift detection. Will probably open source it soon with a wandb/datadog-like UI on top.
erogol (@erogol):
There is space for a TensorBoard-like tool with AI that monitors training runs, detects irregularities and possible improvements, and informs you about the overall health of the run. Let me know if you build it; I'm your first customer.
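The "detects irregularities" part of that wish has a simple non-AI baseline worth naming: outlier detection on the loss curve. The sketch below is my own minimal illustration, not any existing tool's method; it flags a training step whose loss is a z-score outlier against a trailing window (window size, threshold, and the synthetic loss curve are all arbitrary choices for the demo).

```python
from collections import deque
from statistics import mean, stdev

def spike_alerts(losses, window=20, z_thresh=4.0):
    """Flag steps whose loss is a z-score outlier vs. a trailing window."""
    recent = deque(maxlen=window)
    alerts = []
    for step, loss in enumerate(losses):
        if len(recent) >= 5:  # need a few points before stats are meaningful
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and (loss - mu) / sigma > z_thresh:
                alerts.append((step, loss))
        recent.append(loss)
    return alerts

# Smoothly decaying synthetic loss with one injected spike at step 60.
losses = [2.0 * (0.99 ** i) for i in range(100)]
losses[60] = 8.0
print(spike_alerts(losses))  # → [(60, 8.0)]
```

An AI layer on top would explain *why* a flagged step spiked (bad batch, LR schedule, data issue); the statistical trigger itself is the cheap part.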