The Mint

1.6K posts

@themintsv

machine intelligence, learning, technology, silicon valley

Silicon Valley, CA, USA · Joined November 2019

48 Following · 69 Followers

The Mint@themintsv·4h

@evelovesolive You should also invest a little bit in your website and graphics; they look like they are from 20 years ago.

Eve Bodnia@evelovesolive·13h

We are hiring in the US and Europe! Join our team with world class AI researchers, Fields medalist, Turing Award Winner and ICPC champs job-boards.greenhouse.io/logicalintelli…

4.2K

The Mint@themintsv·1d

@andrewgwils It is actually a crabapple. Cherry trees are significantly different.

Andrew Gordon Wilson@andrewgwils·1d

@themintsv If you zoom in there actually is a cherry in the centre!

Andrew Gordon Wilson@andrewgwils·1d

Stunning cherry blossoms in Montreal.

1.4K

The Mint@themintsv·15 May

@fchollet Was curious about PyTorch, asked Grok: 85.4 million monthly downloads. Posting, in case someone is interested.

François Chollet@fchollet·15 May

The Keras package recently crossed 21M monthly downloads on PyPI, an all-time high (the daily ATH is around 900k). I still remember when it first crossed 10M monthly downloads about 5 years ago and I thought it couldn't possibly go any higher...

192

30.7K

The Mint@themintsv·14 May

@logic_int I doubt the big, closed industry labs are doing nothing in this space. They are probably doing a lot but not disclosing it.

159

Logical Intelligence@logic_int·14 May

Aleph, our fully autonomous AI agent system for formal verification, aced all major theorem proving benchmarks including PutnamBench, VeriSoftBench, and Verina

132

25.3K

The Mint@themintsv·14 May

@rasbt I would not care about this ratio as a user. I would care about the computational efficiency (latency, memory usage) and intelligence.

110

Sebastian Raschka@rasbt·14 May

Meta observation: DeepSeek is still king of the active-parameter ratio

330

57.4K

The Mint@themintsv·4 May

@andrewgwils Human connection (both peers and teachers/profs/TAs) is definitely valuable, but LLMs are also extremely valuable for learning anything. We can get the best of both worlds.

318

Andrew Gordon Wilson@andrewgwils·4 May

The most value I got from college was through my peer group: brainstorming, learning from them, and just human connection. That's why I knew MOOCs would never threaten traditional education. But now what happens when students just sit alone, asking LLMs to do their homework?

118

10K

The Mint@themintsv·27 Apr

Energy-Based Transformers explained | How EBTs and EBMs work youtube.com/watch?v=18Fn2m…

113

The Mint@themintsv·27 Apr

"Energy-Based Transformers are Scalable Learners and Thinkers". EBTs: a new class of EBMs that assign an energy value to every input and candidate-prediction pair, enabling predictions through gradient descent-based energy minimization until convergence. arxiv.org/abs/2507.02092
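
The prediction-by-minimization idea in the post above can be illustrated with a toy sketch. This is not the paper's model: the quadratic `energy` function below is a hypothetical stand-in for a learned network, and `predict`, `step_size`, and `tol` are names chosen here for illustration. It only shows the loop of scoring an (input, candidate) pair and moving the candidate downhill until the energy stops changing.

```python
def energy(x, y):
    """Toy energy: low when candidate y is close to 2*x + 1.

    Hypothetical stand-in for a learned energy model.
    """
    return (y - (2.0 * x + 1.0)) ** 2


def energy_grad_y(x, y):
    """Analytic gradient of the toy energy with respect to the candidate y."""
    return 2.0 * (y - (2.0 * x + 1.0))


def predict(x, y_init=0.0, step_size=0.1, tol=1e-8, max_steps=1000):
    """Refine a candidate prediction by gradient descent on the energy.

    Stops when one step changes the energy by less than tol.
    Returns the final candidate and the number of steps taken.
    """
    y = y_init
    for step in range(max_steps):
        y_next = y - step_size * energy_grad_y(x, y)
        if abs(energy(x, y) - energy(x, y_next)) < tol:
            return y_next, step + 1
        y = y_next
    return y, max_steps


if __name__ == "__main__":
    y, steps = predict(3.0)
    print(f"prediction={y:.4f} after {steps} steps")
```

In a real energy-based model the gradient would come from automatic differentiation through the network rather than a hand-written formula.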

The Mint@themintsv·25 Apr

@akseljoonas With all these accomplishments, promote ml-intern to HF CEO.

160

Aksel@akseljoonas·24 Apr

For the last 72 hours since ml-intern launched, we have had over 500 autonomous AI research projects running on the Space at all times. Some insane ones I saw:

1. A new AI paradigm from scratch: trying to replace transformers with a reasoning architecture based on energy minimization, binary sparse address tables and circular convolution binding. No GPU, no gradients, no training data, just pure bitwise operations. Years of research done in 2 days. huggingface.co/Harry00/MLE-Mo…
2. Someone took LoopLM (ByteDance's recurrent depth transformer with shared layers and infinite depth via looping) and crossed it with BitNet b1.58 (ternary 1.58-bit weights). The result: a model that's both infinitely deep AND uses almost no memory per parameter.
3. Designing a new attention mechanism modeled on the thalamo-cortical circuit in the human brain. Pulling from 2025/2026 research out of MIT, Harvard, and UF. The thalamus gates what information reaches the cortex. They're building a learnable gate that mimics this for transformer attention heads, combined with EEG datasets and a reinforcement learning loop. huggingface.co/spaces/daniel8…

The use cases people bring are cooler and more impressive than anything we imagined when we built this.

Aksel@akseljoonas

Introducing ml-intern, the agent that just automated the post-training team @huggingface. It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.

It can pull off crazy things:

We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.

In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.

For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded. All fully backed by papers, autonomously.

How does it work? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains

ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like. Releasing it today as a CLI and a web app you can use from your phone/desktop.

CLI: github.com/huggingface/ml…
Web + mobile: huggingface.co/spaces/smolage…

And the best part? We also provisioned $1k in GPU resources and Anthropic credits for the quickest among you to use.

762

100.3K

The Mint@themintsv·20 Apr

"Towards Generalizable and Efficient Large-Scale Generative Recommenders". An approach to scaling generative recommendation models from O(1M) to O(1B) parameters, achieving substantial improvements on Netflix recommendation tasks. netflixtechblog.medium.com/towards-genera…

The Mint@themintsv·14 Apr

@ClementDelangue Why not use the paper sources in LaTeX or PDF, instead of running costly OCR?

134

clem 🤗@ClementDelangue·13 Apr

We just OCR'd 27,000 arxiv papers into Markdown using an open 5B model, 16 parallel HF Jobs on L40S GPUs, and a mounted bucket.
Total cost: $850
Total time: ~29 hours
Jobs that crashed: 0
This now powers "Chat with your paper" on hf.co/papers

248

2.3K

174.9K

The Mint@themintsv·11 Apr

@andrewgwils +100

111

Andrew Gordon Wilson@andrewgwils·11 Apr

Not sure about the Machiavelli part but I definitely empathize!

Cree@creebeauvoir

about every 2 weeks i fantasize about quitting my job, escaping to nature, and starting a maybe 5 year long writing project that is similar to Machiavelli's History of Florence

4.5K

The Mint@themintsv·10 Apr

@AndrewDai @yinfeiy @ElorianAI @JeffDean Congrats! It looks like, after the centralization of AI efforts in big tech, there is no longer enough space for all the talent, and even top people are being pushed out to start their own companies. So many start-ups recently. I wonder if this is explicitly encouraged.

Andrew M. Dai@AndrewDai·9 Apr

After almost 12 years in Brain/DeepMind, I’ve finally decided to take the leap. My cofounders @yinfeiy and Seth and I have kicked off @ElorianAI, the first multimodal reasoning lab founded and led by former LLM pretraining, data and multimodal leads. youtu.be/YlvfNpOMeOY?si… (1/n)

837

373.4K

The Mint@themintsv·10 Apr

@karpathy @kepano It is 2026, people are freaked out about AGI, yet we still do not even have proper portable document formats...

Andrej Karpathy@karpathy·9 Apr

@kepano I just tried it this morning on the 245-page Mythos pdf and it failed badly and the outputs were all mangled. Converting pdfs is really hard; I think it probably has to be a Skill for a SOTA LLM, not a program, for it to work properly.

170

1.7K

276.9K

kepano@kepano·9 Apr

I wrote about Microsoft's Markitdown back in 2024, but it's grown into a big messy project now :/ It would be more valuable if Microsoft provided high-quality official libraries for converting their proprietary formats to Markdown (.docx, .xlsx, .pptx, OneNote, etc).

For now Obsidian's Markdown conversion options are:
1. Obsidian Web Clipper for converting URLs
2. Obsidian Importer for converting from apps like Notion, Apple Notes, Google Keep, Microsoft OneNote, Evernote, etc

Vaishnavi@_vmlops

MICROSOFT BUILT A TOOL THAT CONVERTS LITERALLY ANYTHING INTO CLEAN MARKDOWN FOR YOUR LLM pdfs. word docs. excel. powerpoint. audio. youtube urls one pip install and your AI pipeline stops choking on raw files forever no custom parsers. no broken layouts. no garbled text. just clean, structured markdown your LLM can actually read github.com/microsoft/mark…

1.2K

348.9K

The Mint@themintsv·9 Apr

@andrewgwils Explained by Grok: x.com/i/grok/share/f…

1.2K

Andrew Gordon Wilson@andrewgwils·9 Apr

Without Einstein's general relativity from 1915, your GPS would drift about 10 km per day, and you'd have no idea why.
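
A back-of-the-envelope sketch of where a figure of this size can come from. It assumes a circular GPS orbit, a ground clock at rest at Earth's mean radius, and that the ranging error is simply the accumulated clock offset times the speed of light. The constant values and function names are illustrative choices, not taken from the post, and the velocity (special-relativistic) term is included alongside the gravitational one for comparison.

```python
C = 299_792_458.0        # speed of light, m/s
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean Earth radius, m (assumed ground clock position)
R_ORBIT = 2.6561e7       # approximate GPS orbital radius, m (assumed circular)
SECONDS_PER_DAY = 86_400.0


def gravitational_offset_per_day():
    """Seconds per day the satellite clock gains from weaker gravity (GR)."""
    return GM / C**2 * (1.0 / R_EARTH - 1.0 / R_ORBIT) * SECONDS_PER_DAY


def velocity_offset_per_day():
    """Seconds per day the satellite clock loses from orbital speed (SR)."""
    v_squared = GM / R_ORBIT  # circular-orbit speed squared
    return -v_squared / (2.0 * C**2) * SECONDS_PER_DAY


def daily_range_error_km():
    """Net clock offset per day converted to a distance, in kilometres."""
    net = gravitational_offset_per_day() + velocity_offset_per_day()
    return net * C / 1000.0


if __name__ == "__main__":
    print(f"gravitational: {gravitational_offset_per_day() * 1e6:.1f} us/day")
    print(f"velocity:      {velocity_offset_per_day() * 1e6:.1f} us/day")
    print(f"range error:   {daily_range_error_km():.1f} km/day")
```

Under these assumptions the net offset is a few tens of microseconds per day, which corresponds to a range error on the order of 10 km per day.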

847

156.7K

The Mint@themintsv·5 Apr

@andrewgwils Happy Easter, to all who celebrate!

Andrew Gordon Wilson@andrewgwils·5 Apr

Happy Easter! youtu.be/-Tbv6ANMIrc?si…

2.1K

The Mint@themintsv·4 Apr

@andrewgwils @samuel_stanton_ @CoefficientBio @BigHatBio You should demand some donations from your graduates to fund your lab! 400M is huge!

456

Andrew Gordon Wilson@andrewgwils·4 Apr

Proud of my PhD alum @samuel_stanton_ & colleagues for building @CoefficientBio! I remember first diving into compbio 6 years ago with @BigHatBio. Sam and I had been working on BayesOpt and active learning, a perfect foundation for sequence design. It was an incredible journey!

TechCrunch@TechCrunch

Anthropic buys biotech startup Coefficient Bio in $400M deal: reports techcrunch.com/2026/04/03/ant…

111

12.5K

The Mint@themintsv·29 Mar

Interesting blog posts about AI by Prof. Ryoma Sato: data-processing.club

The Mint@themintsv·27 Mar

@ClementDelangue @nickfrosst @huggingface @cohere Sadly, those drinks will ruin your health...

306

clem 🤗@ClementDelangue·27 Mar

Got to meet @nickfrosst in Miami today to celebrate their awesome release of an open-source Apache 2.0 Transcribe model that could be a Whisper killer and is already trending on @huggingface! @cohere deserves much more visibility in the community as one of the leaders of North American open-source!

184

37K
