The Mint

1.6K posts

@themintsv

machine intelligence, learning, technology, silicon valley

Silicon Valley, CA, USA Joined November 2019
48 Following 69 Followers
The Mint
The Mint@themintsv·
@evelovesolive You should also invest a little bit in your website and graphics; they look like they are from 20 years ago.
English
0
0
0
20
The Mint
The Mint@themintsv·
@andrewgwils It is actually a crabapple. Cherry trees are significantly different.
The Mint tweet media
English
0
0
0
38
The Mint
The Mint@themintsv·
@fchollet Was curious about PyTorch, asked Grok: 85.4 million monthly downloads. Posting, in case someone is interested.
English
0
0
1
74
François Chollet
François Chollet@fchollet·
The Keras package recently crossed 21M monthly downloads on PyPI, an all-time high (the daily ATH is around 900k). I still remember when it first crossed 10M monthly downloads about 5 years ago and I thought it couldn't possibly go any higher...
English
17
8
192
30.7K
The Mint
The Mint@themintsv·
@logic_int I doubt that the big, closed industry labs are doing nothing in this space. They are probably doing a lot, but not disclosing it.
English
0
0
1
159
Logical Intelligence
Logical Intelligence@logic_int·
Aleph, our fully autonomous AI agent system for formal verification, aced all major theorem proving benchmarks including PutnamBench, VeriSoftBench, and Verina
Logical Intelligence tweet media
English
13
32
132
25.3K
The Mint
The Mint@themintsv·
@rasbt I would not care about this ratio as a user. I would care about the computational efficiency (latency, memory usage) and intelligence.
English
1
0
0
110
Sebastian Raschka
Sebastian Raschka@rasbt·
Meta observation: DeepSeek is still king of the active-parameter ratio
Sebastian Raschka tweet media
English
20
37
330
57.4K
The Mint
The Mint@themintsv·
@andrewgwils Human connection (with both peers and teachers/profs/TAs) is definitely valuable, but LLMs are also extremely valuable for learning anything. We can get the best of both worlds.
English
0
0
1
318
Andrew Gordon Wilson
Andrew Gordon Wilson@andrewgwils·
The most value I got from college was through my peer group: brainstorming, learning from them, and just human connection. That's why I knew MOOCs would never threaten traditional education. But now what happens when students just sit alone, asking LLMs to do their homework?
English
10
1
118
10K
The Mint
The Mint@themintsv·
Energy-Based Transformers are Scalable Learners and Thinkers
EBTs: a new class of EBMs to assign an energy value to every input and candidate-prediction pair, enabling predictions through gradient descent-based energy minimization until convergence. arxiv.org/abs/2507.02092
English
1
0
0
61
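The prediction procedure described in the EBT post above (score every input and candidate-prediction pair with an energy, then refine the candidate by gradient descent until convergence) can be sketched on a toy problem. Everything here is an assumption made for illustration: the quadratic energy, the fixed weight `w`, and the names `energy`, `energy_grad_y`, and `predict` are hypothetical and are not taken from the paper, which learns the energy function with a transformer.

```python
# Toy sketch of prediction by energy minimization.
# The quadratic energy below is a stand-in (assumption), not the EBT model.

def energy(x: float, y: float, w: float = 2.0) -> float:
    """Energy of an (input, candidate prediction) pair; low means compatible."""
    return 0.5 * (y - w * x) ** 2


def energy_grad_y(x: float, y: float, w: float = 2.0) -> float:
    """Gradient of the energy with respect to the candidate prediction y."""
    return y - w * x


def predict(x: float, y0: float = 0.0, step: float = 0.5,
            tol: float = 1e-8, max_iters: int = 1000) -> float:
    """Refine a candidate prediction by gradient descent on the energy."""
    y = y0
    for _ in range(max_iters):
        grad = energy_grad_y(x, y)
        if abs(grad) < tol:  # converged: the energy is (near) minimal
            break
        y -= step * grad
    return y


if __name__ == "__main__":
    y_hat = predict(3.0)
    print(y_hat, energy(3.0, y_hat))
```

In this toy setting the minimizer is simply `w * x`. The point is only that the prediction comes out of an iterative optimization over the candidate rather than a single forward pass.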
The Mint
The Mint@themintsv·
@akseljoonas With all these accomplishments, promote ml-intern to HF CEO.
English
0
0
0
160
Aksel
Aksel@akseljoonas·
For the last 72 hours since ml-intern launched, we have had over 500 autonomous AI research projects running on the Space at all times. Some insane ones I saw:

1. A new AI paradigm from scratch: trying to replace transformers with a reasoning architecture based on energy minimization, binary sparse address tables and circular convolution binding. No GPU, no gradients, no training data, just pure bitwise operations. Years of research done in 2 days. huggingface.co/Harry00/MLE-Mo…
2. Someone took LoopLM (ByteDance's recurrent depth transformer with shared layers and infinite depth via looping) and crossed it with BitNet b1.58 (ternary 1.58-bit weights). The result: a model that's both infinitely deep AND uses almost no memory per parameter.
3. Designing a new attention mechanism modeled on the thalamo-cortical circuit in the human brain. Pulling from 2025/2026 research out of MIT, Harvard, and UF. The thalamus gates what information reaches the cortex. They're building a learnable gate that mimics this for transformer attention heads, combined with EEG datasets and a reinforcement learning loop. huggingface.co/spaces/daniel8…

The use cases people bring are cooler and more impressive than anything we imagined when we built this.
Aksel@akseljoonas

Introducing ml-intern, the agent that just automated the post-training team @huggingface

It's an open-source implementation of the real research loop that our ML researchers do every day. You give it a prompt, it researches papers, goes through citations, implements ideas in GPU sandboxes, iterates and builds deeply research-backed models for any use case. All built on the Hugging Face ecosystem.

It can pull off crazy things:

We made it train the best model for scientific reasoning. It went through citations from the official benchmark paper. Found OpenScience and NemoTron-CrossThink, added 7 difficulty-filtered dataset variants from ARC/SciQ/MMLU, and ran 12 SFT runs on Qwen3-1.7B. This pushed the score 10% → 32% on GPQA in under 10h. Claude Code's best: 22.99%.

In healthcare settings it inspected available datasets, concluded they were too low quality, and wrote a script to generate 1100 synthetic data points from scratch for emergencies, hedging, multilingual etc. Then upsampled 50x for training. Beat Codex on HealthBench by 60%.

For competitive mathematics, it wrote a full GRPO script, launched training with A100 GPUs on hf.co/spaces, watched rewards climb and then collapse, and ran ablations until it succeeded.

All fully backed by papers, autonomously.

How it works? ml-intern makes full use of the HF ecosystem:
- finds papers on arxiv and hf.co/papers, reads them fully, walks citation graphs, pulls datasets referenced in methodology sections and on hf.co/datasets
- browses the Hub, reads recent docs, inspects datasets and reformats them before training so it doesn't waste GPU hours on bad data
- launches training jobs on HF Jobs if no local GPUs are available, monitors runs, reads its own eval outputs, diagnoses failures, retrains

ml-intern deeply embodies how researchers work and think. It knows what data should look like and what good models feel like.

Releasing it today as a CLI and a web app you can use from your phone/desktop.

CLI: github.com/huggingface/ml…
Web + mobile: huggingface.co/spaces/smolage…

And the best part? We also provisioned $1k of GPU resources and Anthropic credits for the quickest among you to use.

English
26
88
762
100.3K
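For readers wondering what "ternary 1.58-bit weights" means in the post above: each weight is restricted to one of three values, -1, 0, or +1, which carries log2(3) ≈ 1.58 bits of information. Below is a minimal sketch of absmean-style ternary quantization in the spirit of BitNet b1.58. The function name `ternarize` and the sample weights are hypothetical, and this is not the code of any project mentioned in the thread.

```python
import math

# Sketch of absmean ternary quantization (an illustration, not a reference
# implementation): scale by the mean absolute weight, round, clip to {-1, 0, 1}.

def ternarize(weights: list[float], eps: float = 1e-8) -> tuple[list[int], float]:
    """Return ternary weights plus the scale needed to approximately dequantize."""
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return quantized, scale


if __name__ == "__main__":
    q, scale = ternarize([0.9, -0.05, -1.4, 0.3])
    print(q, scale)
    # Information content of one three-valued weight, in bits.
    print(math.log2(3))
```

Dequantizing is just `q_i * scale`, so a layer stores one shared floating-point scale plus the three-valued weights themselves.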
The Mint
The Mint@themintsv·
Towards Generalizable and Efficient Large-Scale Generative Recommenders
Approach to scaling generative recommendation models from O(1M) to O(1B) parameters, achieving substantial improvements on Netflix recommendation tasks netflixtechblog.medium.com/towards-genera…
English
0
0
0
42
The Mint
The Mint@themintsv·
@ClementDelangue Why not use the paper sources in latex or pdfs, instead of running costly OCR?
English
0
0
1
134
clem 🤗
clem 🤗@ClementDelangue·
We just OCR'd 27,000 arxiv papers into Markdown using an open 5B model, 16 parallel HF Jobs on L40S GPUs, and a mounted bucket.
Total cost: $850
Total time: ~29 hours
Jobs that crashed: 0
This now powers "Chat with your paper" on hf.co/papers
clem 🤗 tweet media
English
89
248
2.3K
174.9K
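The figures quoted in the OCR post above can be turned into unit costs with a little arithmetic. This sketch rests on two assumptions that the post does not state: that each of the 16 jobs used a single GPU, and that all of them ran for the full ~29 hours. The function name `ocr_unit_costs` is hypothetical.

```python
# Unit-cost arithmetic from the quoted totals. The one-GPU-per-job and
# full-duration assumptions are ours, not the post's.

def ocr_unit_costs(total_cost_usd: float, papers: int,
                   jobs: int, hours: float) -> dict[str, float]:
    """Derive per-paper and per-GPU-hour figures from the reported totals."""
    gpu_hours = jobs * hours  # assumes one GPU per job, all busy throughout
    return {
        "usd_per_paper": total_cost_usd / papers,
        "gpu_hours": gpu_hours,
        "usd_per_gpu_hour": total_cost_usd / gpu_hours,
        "papers_per_gpu_hour": papers / gpu_hours,
    }


if __name__ == "__main__":
    for name, value in ocr_unit_costs(850, 27_000, 16, 29).items():
        print(f"{name}: {value:.3f}")
```

Under those assumptions the run comes to roughly 3 cents per paper across 464 GPU-hours.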
The Mint
The Mint@themintsv·
@AndrewDai @yinfeiy @ElorianAI @JeffDean Congrats! Looks like, after the centralization of AI efforts in big tech, there is no longer enough space for all the talent and even top people are being pushed out to start their own companies. So many start-ups recently. I wonder if this is explicitly encouraged.
English
0
0
0
38
Andrew M. Dai
Andrew M. Dai@AndrewDai·
After almost 12 years in Brain/DeepMind, I’ve finally decided to take the leap. My cofounders: @yinfeiy, Seth and I have kicked-off @ElorianAI. The first multimodal reasoning lab founded and led by former LLM pretraining, data and multimodal leads. youtu.be/YlvfNpOMeOY?si… (1/n)
YouTube video
English
83
79
837
373.4K
The Mint
The Mint@themintsv·
@karpathy @kepano It is 2026, people are freaked out about AGI, yet we still do not even have proper portable document formats...
English
0
0
0
62
Andrej Karpathy
Andrej Karpathy@karpathy·
@kepano I just tried it this morning on the 245-page Mythos pdf and it failed badly and the outputs were all mangled. Converting pdfs is really hard, I think it has to probably be a Skill not a program, for a SOTA LLM for it to work properly.
English
170
37
1.7K
276.9K
kepano
kepano@kepano·
I wrote about Microsoft's Markitdown back in 2024, but it's grown into a big messy project now :/ It would be more valuable if Microsoft provided high-quality official libraries for converting their proprietary formats to Markdown (.docx, .xlsx, .pptx, OneNote, etc). For now Obsidian's Markdown conversion options are:
1. Obsidian Web Clipper for converting URLs
2. Obsidian Importer for converting from apps like Notion, Apple Notes, Google Keep, Microsoft OneNote, Evernote, etc
Vaishnavi@_vmlops

MICROSOFT BUILT A TOOL THAT CONVERTS LITERALLY ANYTHING INTO CLEAN MARKDOWN FOR YOUR LLM pdfs. word docs. excel. powerpoint. audio. youtube urls one pip install and your AI pipeline stops choking on raw files forever no custom parsers. no broken layouts. no garbled text. just clean, structured markdown your LLM can actually read github.com/microsoft/mark…

English
42
37
1.2K
348.9K
Andrew Gordon Wilson
Andrew Gordon Wilson@andrewgwils·
Without Einstein's general relativity from 1915, your GPS would drift about 10 km per day, and you'd have no idea why.
English
33
10
847
156.7K
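The "about 10 km per day" figure in the GPS post above can be sanity-checked with a back-of-the-envelope calculation. The sketch assumes a circular orbit with a radius of about 26,560 km, a receiver at rest on a non-rotating Earth, and a naive conversion of clock offset to range error (offset times the speed of light). The constants are standard textbook values and the function names are hypothetical.

```python
# Rough estimate of relativistic clock drift for a GPS satellite.
# Assumptions: circular orbit, stationary ground clock, first-order formulas.

C = 299_792_458.0        # speed of light, m/s
GM = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
R_EARTH = 6.371e6        # mean Earth radius, m
R_GPS = 2.656e7          # approximate GPS orbital radius, m
SECONDS_PER_DAY = 86_400.0


def gr_offset_per_day() -> float:
    """Seconds per day the satellite clock gains from weaker gravity (GR)."""
    return GM / C**2 * (1 / R_EARTH - 1 / R_GPS) * SECONDS_PER_DAY


def sr_offset_per_day() -> float:
    """Seconds per day the satellite clock loses from orbital speed (SR)."""
    v_squared = GM / R_GPS  # circular-orbit speed squared
    return v_squared / (2 * C**2) * SECONDS_PER_DAY


def range_error_km(clock_offset_s: float) -> float:
    """Naive ranging error implied by an uncorrected clock offset."""
    return C * clock_offset_s / 1000.0


if __name__ == "__main__":
    gr, sr = gr_offset_per_day(), sr_offset_per_day()
    print(f"GR: +{gr * 1e6:.1f} us/day, SR: -{sr * 1e6:.1f} us/day")
    print(f"net range error: {range_error_km(gr - sr):.1f} km/day")
```

The gravitational term alone is roughly 45 microseconds per day, partly offset by about 7 microseconds per day of special-relativistic slowing. Either way the implied ranging error is on the order of 10 km per day, consistent with the post.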
clem 🤗
clem 🤗@ClementDelangue·
Got to meet @nickfrosst in Miami today to celebrate their awesome release of an open-source Apache 2.0 Transcribe model that could be a whisper killer and already trending on @huggingface! @cohere deserves much more visibility in the community as one of the leaders of North American open-source!
clem 🤗 tweet media
English
15
8
184
37K