
How do we make LLMs faster and lighter? Don’t force the GPU to adapt to sparsity. Reshape the sparsity to fit the GPU! ⚡️

Excited to share our new #ICML2026 paper in collaboration with @NVIDIA: "Sparser, Faster, Lighter Transformer Language Models". This work introduces new open-source GPU kernels and data formats for faster inference and training of sparse transformer language models.

Paper: arxiv.org/abs/2603.23198
Blog: pub.sakana.ai/sparser-faster…
Code: github.com/SakanaAI/spars…

While LLMs are undoubtedly powerful, they are increasingly expensive to train and deploy, and a large part of this cost comes from their feedforward layers. Yet an interesting phenomenon occurs inside these layers: for any given token, only a small fraction of the hidden activations actually matter. The rest are approximately zero, wasting computation. With ReLU and very mild L1 regularization, this sparsity can exceed 95% with little to no impact on downstream performance.

So, can we leverage this sparsity to make LLMs faster? The challenge is hardware. Modern GPUs are optimized for dense matrix multiplications, and traditional sparse formats introduce irregular memory access and overheads that cancel out their theoretical savings for GEMM operations.

Our contribution is twofold:
1/ We introduce TwELL (Tile-wise ELLPACK), a new sparse packing format designed to integrate directly into the same optimized tiled matmul kernels without disrupting their execution (see the sketch after this post).
2/ We develop custom CUDA kernels that fuse multiple sparse matmuls to maximize throughput, and compress TwELL into a hybrid representation that minimizes activation sizes.

We used our kernels to train and benchmark sparse LLMs at billion-parameter scale, demonstrating >20% speedups and even larger savings in peak memory and energy. This work will be presented at #ICML2026. Please check out our blog and technical paper for a deep dive!
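The tile-wise packing idea can be sketched in plain NumPy. The snippet below is a minimal illustration, not the paper's actual format: each tile of rows is padded to that tile's maximum row nnz (rather than the global maximum), which keeps the layout rectangular, and thus friendly to tiled matmul kernels, while mostly-empty tiles stay cheap. The function names, tile size, and layout details here are all assumptions for illustration; the real TwELL spec and fused CUDA kernels live in the paper and repo.

```python
# Minimal NumPy sketch of a tile-wise ELLPACK-style packing for sparse
# activations. Illustration of the general idea only; the actual TwELL
# layout, tile sizes, and kernels are defined in the paper and code repo.
import numpy as np

def pack_tilewise_ellpack(A, tile_rows=4):
    """Pack a sparse matrix A tile by tile.

    Each tile of `tile_rows` consecutive rows is padded to the *tile's*
    max row nnz (not the global max), so the layout stays rectangular
    within a tile while mostly-empty tiles stay cheap.
    """
    tiles = []
    for r0 in range(0, A.shape[0], tile_rows):
        tile = A[r0:r0 + tile_rows]
        width = int((tile != 0).sum(axis=1).max())  # uniform width inside the tile
        # Pad with column index 0 and value 0.0; padded slots contribute nothing.
        cols = np.zeros((tile.shape[0], width), dtype=np.int64)
        vals = np.zeros((tile.shape[0], width), dtype=tile.dtype)
        for i, row in enumerate(tile):
            idx = np.flatnonzero(row)
            cols[i, :idx.size] = idx
            vals[i, :idx.size] = row[idx]
        tiles.append((r0, cols, vals))
    return tiles

def ellpack_matmul(tiles, W, n_rows):
    """Compute A @ W from the packed tiles by gathering the needed rows of W."""
    out = np.zeros((n_rows, W.shape[1]), dtype=W.dtype)
    for r0, cols, vals in tiles:
        # Sum vals[i, k] * W[cols[i, k], :] over k for each row i of the tile.
        out[r0:r0 + cols.shape[0]] = np.einsum("rk,rkc->rc", vals, W[cols])
    return out

rng = np.random.default_rng(0)
# ReLU after a random projection already zeroes about half the entries;
# thresholding mimics the much higher sparsity of trained, L1-regularized models.
H = np.maximum(rng.standard_normal((16, 64)), 0.0)
H[H < 1.2] = 0.0
W = rng.standard_normal((64, 32))

tiles = pack_tilewise_ellpack(H, tile_rows=4)
assert np.allclose(ellpack_matmul(tiles, W, H.shape[0]), H @ W)
print(f"activation sparsity: {(H == 0).mean():.1%}")
```

The per-tile padding width is the key design choice: classic ELLPACK pads every row to the global maximum nnz, so one dense row inflates the whole matrix, whereas tile-local padding contains that cost to a single tile.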

1/12 Introducing MIMIC: a SOTA foundation model trained natively across DNA, RNA, and proteins. MIMIC is multimodal and generative: it can use structure, regulation, evolution, and experimental context to infer missing biology or design new sequences.

The US government is looking to Singapore for deep tech. @ChannelNewsAsia featured us as one of the companies it's working with, tapping capabilities in space & defence that exist in few other places in the world. That's why we're opening in Austin. 🇺🇸

DeepInfra × Hugging Face

DeepInfra is live on @HuggingFace Inference Providers. Run DeepSeek V4, Kimi-K2.6, GLM-5.1, and 100+ more open models straight from the Hub — same OpenAI-compatible API, same low per-token pricing, no markup. Just add :deepinfra to the model name.
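In practice that is a one-line change in any OpenAI-compatible client. The sketch below assumes Hugging Face's OpenAI-compatible router endpoint and uses an illustrative model ID; check the Inference Providers docs for the current endpoint and each model's Hub page for its exact name.

```python
# Sketch of routing a chat completion to DeepInfra via the Hub.
# ASSUMPTIONS: the router URL and model ID are illustrative; verify both
# against the Hugging Face Inference Providers documentation.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",  # HF's OpenAI-compatible router
    api_key=os.environ["HF_TOKEN"],               # a Hugging Face access token
)

completion = client.chat.completions.create(
    # The ":deepinfra" suffix pins the request to DeepInfra instead of
    # letting the router pick a provider automatically.
    model="deepseek-ai/DeepSeek-V3:deepinfra",    # illustrative model ID
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(completion.choices[0].message.content)
```

Dropping the suffix falls back to automatic provider routing, so the suffix is only needed when you want to pin the request to DeepInfra specifically.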

@MassiveBio is collaborating with @OpenAI to expand global access to clinical trials. 🌍

Through this collaboration, we're enabling real-time trial parameterization and free pre-screening services in high-population regions, ensuring that life-changing opportunities are available to patients everywhere, not just those near major research centers.

This is what AI-driven precision oncology looks like in practice: breaking down the barriers that have long kept underserved patients from accessing the trials that could change their lives.

Read the full announcement → massivebio.com/massive-bio-ex…

#ClinicalTrials #AIinHealthcare #PrecisionOncology #HealthEquity #CancerResearch #OpenAI #DigitalHealth

Our very own co-founder @Goopt was just featured on @BBC's The Interview with @zsk. 🎙️ He talks AI, objectivity, and how Otter evolved from a transcription app into a Conversational Knowledge Engine that captures your conversations and makes them usable across your tools and across the enterprise. 🎧 Listen on BBC now: bbc.com/audio/play/w3c…

ICYMI, Seer Agent is live 🚀 Debug production issues in natural language. No predefined issue needed — just describe what's wrong and it traverses your full telemetry graph to find the root cause. Check out this new article on @thenewstack by @fredericl feat. @indragie 👉 thenewstack.io/sentrys-seer-a…

AI is concentrating in a few countries, a few companies, a few chips. Open-source AI is the only real check on that. And it only works if the infrastructure to run it actually exists. Today, we're announcing our $20M Series A to make this a reality.

We’re launching the beta for our new commercial AI product: Sakana Fugu 🐡, a multi-agent orchestration system!

Blog: sakana.ai/fugu-beta

Fugu hits SOTA on SWE-Pro, GPQA-D, and ALE-Bench, and has been our internal secret weapon. It dynamically coordinates frontier models, autonomously selecting the optimal agent combinations and roles for each task. Because Fugu exposes an OpenAI-compatible API, you can integrate it into your existing workflows with minimal changes (see the sketch after this post).

🐟 Fugu Mini: high-speed orchestration optimized for latency
🐡 Fugu Ultra: full model pool utilization for deep, complex reasoning

Apply for the beta test here: forms.gle/BtKkhc2CfLKk1d…
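Since the API is OpenAI-compatible, switching an existing app over should mostly amount to changing the base URL and model name. Here is a minimal sketch, assuming a hypothetical endpoint, environment variable, and model IDs; none of these values come from the announcement, so check the beta docs for the real ones.

```python
# Minimal sketch of calling an OpenAI-compatible endpoint.
# ASSUMPTIONS: the base URL, env var name, and model IDs below are
# hypothetical placeholders, not values from the Fugu announcement.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.sakana.ai/v1",   # hypothetical endpoint
    api_key=os.environ["SAKANA_API_KEY"],  # hypothetical env var
)

resp = client.chat.completions.create(
    model="fugu-mini",  # hypothetical ID; a "fugu-ultra" tier for deep reasoning
    messages=[{"role": "user", "content": "Triage this failing CI job."}],
)
print(resp.choices[0].message.content)
```

Because only the client construction changes, the same pattern carries over to any SDK or framework that accepts a custom OpenAI base URL.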
