AssistedEvolution

46.9K posts


AssistedEvolution

@AssistedEvolve

Thinking outside of the brane that contains the box

Australia · Joined December 2015
92 Following · 297 Followers
AssistedEvolution retweeted
Mehrdad Farajtabar @MFarajtabar:
Continual learning remains one of the most challenging "holy grails" of AI. Most discussions focus on catastrophic forgetting: models lose what they previously learned. But there is another, equally important failure mode: over long continual training, neural networks can also lose their plasticity, i.e., their ability to learn new things weakens over time.

In our ICLR 2026 work with colleagues at @Apple and @ETH, we study this phenomenon, known as Loss of Plasticity (LoP), from a geometric perspective. We show that LoP can arise when gradient dynamics become trapped in invariant manifolds of parameter space. In particular, we analyze two types of traps:
🔴 Frozen units: units saturate, gradients vanish, and they become effectively silent to backpropagation.
🔵 Cloned units: units become redundant, receive matching forward and backward signals, and move together.

For these structures, the gradient is tangent to the trap. Once standard GD/SGD enters these affine subspaces, it cannot leave them on its own. This means the dynamics can remain sticky even when the data distribution or task changes.

What we find especially interesting is that these traps are not merely optimization bugs. The same feature-learning pressures that help networks learn useful representations for the current task can also push them toward states with less future adaptability. This raises a difficult open question for future work: are neural networks trained with SGD and cross-entropy loss fundamentally the right framework for continual learning?

Please read the full paper for more details: arxiv.org/pdf/2510.00304
Amir Joudaki @AmirJoudaki:

Neural nets don’t just forget. Sometimes, after long training, they lose the ability to learn at all. In our #ICLR2026 poster, we model Loss of Plasticity as gradient dynamics trapped in invariant manifolds: 🔴 frozen units, 🔵 cloned units. The video makes the traps visible.

9 replies · 41 reposts · 324 likes · 44K views
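A minimal numeric sketch of the frozen-unit trap described above (illustrative only, not code from the paper): once a tanh unit saturates, the factor 1 − tanh(z)² drives its gradient toward zero, so backpropagation can no longer move it even when the target changes.

```python
import numpy as np

# Single tanh unit with squared-error loss L = (tanh(w*x) - target)^2.
# The gradient carries a factor (1 - tanh(w*x)^2), which vanishes as the
# unit saturates -- the "frozen unit" trap.
def unit_gradient(w, x, target):
    a = np.tanh(w * x)
    return 2 * (a - target) * (1 - a**2) * x

x, target = 1.0, -1.0
for w in [0.5, 2.0, 8.0, 20.0]:
    a = np.tanh(w * x)
    g = unit_gradient(w, x, target)
    print(f"w={w:5.1f}  activation={a:+.4f}  grad={g:+.2e}")
# As w grows the activation pins at +1 while the gradient collapses toward 0,
# even though the target (-1) says the unit should move the other way.
```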
AssistedEvolution retweeted
China Science @ChinaScience:
A China-Singapore joint research team has proposed a new paradigm for the scalable fabrication of optical metamaterials, creating new avenues for multi-scale optical metamaterial research and micro-nano photonics applications and achieving synergistic optimization of optical properties and structural design. They independently developed a roll-to-roll additive nano-printing device -- a first-of-its-kind manufacturing solution that overcomes the long-standing trade-off among low cost, large-scale production, and personalized customization of optical metamaterials. This innovation enables large-scale controllable preparation and precise integration of multi-scale optical metamaterials, making production "as simple as printing newspapers." Published in the journal @Nature, the study is expected to demonstrate substantial application potential and industrial value in key areas including photonic information, anti-counterfeiting imaging, precision medical sensing and green photonic energy.
2 replies · 12 reposts · 58 likes · 2.7K views
AssistedEvolution retweeted
Japan Atomic Energy Agency (JAEA):
[Press release] "Creating a new non-equilibrium state of glass by laser irradiation." Using femtosecond-laser "optical pressurization," we have succeeded in creating a new non-equilibrium phase of silica glass that cannot be reached by conventional physical high-pressure (equilibrium) processing. jaea.go.jp/02/press2026/p…
0 replies · 19 reposts · 49 likes · 2.5K views
AssistedEvolution retweeted
Charles Rosenbauer @bzogrammer:
Going from 10nm to 7 to 5 to 3 to 2nm, gate pitch has dropped only from 54nm to 42nm. Transistors barely get smaller anymore, just a little denser. GAAFET turns multiple adjacent fins into a vertical stack. Backside power delivery frees up slightly more room for transistors. CFET, if they can pull it off, stacks NMOS on top of PMOS. We've picked all the low-hanging fruit, we're picking the high fruit now, and there's not much left that's ripe.

Polymer-based photoresists need long chains of molecules in order to stick to the silicon. The chains simply aren't long enough below 5nm features. High-NA EUV has 6-8nm features, and TSMC can't even justify using that tech until 2031.

The tech tree for high-end logic is ending. Maybe eventually a crazy technical breakthrough will let us keep going, but it'll either come from a different orchard, or from waiting a decade or two for some new fruit to ripen. Market differentiation will shift toward peripheral things like packaging. If we stay on 1-2nm for long enough, other companies will have time to catch up, equipment will come down in cost, and advanced nodes might see some market diversity again, as well as maybe some other creative peripheral tech.
bubble boi @bubbleboi:

TSMC will lose its crown not because they didn't keep up on high-end logic but because they didn't invest enough in their packaging offering. Today EMIB is far superior to even what TSMC is projecting to offer with CoPoS in 2030… My prediction is Feynman will use EMIB throughout; the rumor of it being used just for an I/O die is wishful thinking. Even if you only assume Intel does the I/O dies for Feynman, that's an extra $4.8B in revenue. If they get used for the compute dies, which is my assumption, the revenue becomes $28-66 billion a year (from a range of 2.5 to 6M packages). This isn't even including deals with TPU & other hyperscaler ASICs. Please guys, get out of the permanent underclass; this is a trillion-dollar company at worst. The fact that AMD still has a higher valuation is crazy.

4 replies · 20 reposts · 332 likes · 38.4K views
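A quick back-of-envelope check of the revenue range quoted above (the implied price per package is derived from the tweet's own numbers, not a published figure):

```python
# Revenue range divided by package range from the quoted tweet.
low_pkgs, high_pkgs = 2.5e6, 6.0e6   # packages per year
low_rev, high_rev = 28e9, 66e9       # dollars per year
print(low_rev / low_pkgs)    # ~11,200 dollars per package at the low end
print(high_rev / high_pkgs)  # ~11,000 dollars per package at the high end
```

Both ends imply roughly the same ~$11k average revenue per package, so the spread in the range is driven almost entirely by the assumed package volume.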
AssistedEvolution retweeted
Underfox @Underfox3:
This paper proposes FieldCore, a fully multiplexed photonic tensor core that jointly harnesses the wavelength, radio-frequency, guided-mode, time, and space dimensions, enabling parallelism to scale multiplicatively within a single optical field. arxiv.org/pdf/2604.22660
1 reply · 12 reposts · 68 likes · 3.6K views
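An illustrative calculation of what "parallelism scales multiplicatively" means across the five multiplexing dimensions (the channel counts below are hypothetical, not figures from the paper):

```python
# Hypothetical channel counts per multiplexing dimension; total parallelism
# is the product, not the sum, because the dimensions are independent.
channels = {"wavelength": 8, "radio_frequency": 4, "guided_mode": 2, "time": 16, "space": 4}
total = 1
for dim, n in channels.items():
    total *= n
print(total)  # 8 * 4 * 2 * 16 * 4 = 4096 parallel lanes in a single optical field
```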
AssistedEvolution retweeté
🌿 lithos
🌿 lithos@lithos_graphein·
Kyocera releases their gen2 ceramic core substrate designed for low warpage in large-area packaging applications.
1 reply · 1 repost · 75 likes · 4.6K views
AssistedEvolution retweeted
田中 秀治 / Shuji Tanaka:
X-ray transmittance increases as atomic weight and density decrease. Naturally, thinner material also passes X-rays more easily, so the ideal EUV pellicle is a high-strength material with low atomic weight that doesn't break even when thin. Beryllium (atomic number 4) passes X-rays 17 times more easily than aluminum and is 1.5 times stronger than iron, which is why it is used for the windows of X-ray tubes and X-ray detectors.
🐇 @1p_semicon:

From Prof. Tanaka's lecture materials. The CNT used for EUV pellicles is made by Canatu. canatu.com/products/semic…

1 reply · 15 reposts · 128 likes · 16.6K views
AssistedEvolution retweeted
Jyotirmai Singh @SinghJyotirmai:
Forget Damascus steel, we're on superconducting Damascus tantalum (Tantalum Damascene Coplanar Waveguide Resonators Fabricated Using 300 mm Scale Processes, arxiv.org/pdf/2604.22086)
3 replies · 5 reposts · 36 likes · 1.7K views
AssistedEvolution retweeted
hardmaru @hardmaru:
For the past few years, humans have been doing "prompt engineering" to coax the best performance out of different LLMs. In this work, we explored what happens if we train an AI to do that job instead.

By training a Conductor model with RL, we found that it naturally learns to write highly effective, custom instructions for a whole pool of other models. It essentially learns to 'manage' them in natural language.

What surprised me most was how it dynamically adapts. For simple factual questions, it just queries one model. But for hard coding problems, it autonomously spins up a whole pipeline of planners, coders, and verifiers.

Really excited to see where this paradigm of "AI managing AI" goes next, especially as we start moving from single-agent chain-of-thought to multi-agent "chain-of-command".

Link to our #ICLR2026 paper: arxiv.org/abs/2512.04388

Along with our TRINITY paper which we announced earlier, this work also powers our new multi-agent system: Sakana Fugu (sakana.ai/fugu-beta) 🐡
Sakana AI @SakanaAILabs:

Introducing our new work: "Learning to Orchestrate Agents in Natural Language with the Conductor" accepted at #ICLR2026 arxiv.org/abs/2512.04388

What if we trained an AI not to solve problems directly, but to act as a manager that delegates tasks to a diverse team of other AIs? To solve complex tasks, humans rarely work alone; we form teams, delegate, and communicate. Yet, multi-agent AI systems currently rely heavily on rigid, human-designed workflows or simple routers that just pick a single model. We wanted an AI that could dynamically build its own team.

We trained a 7B Conductor model using Reinforcement Learning to orchestrate a pool of frontier models (including GPT-5, Gemini, Claude, and open-source models available during the period leading up to ICLR 2026). Instead of executing code, the Conductor outputs a collaborative workflow in natural language. For any given question, the Conductor specifies:
1/ Which agent to call
2/ What specific subtask to give them (acting as an expert prompt engineer)
3/ What previous messages they can see in their context window

Through pure end-to-end reward maximization, amazing behaviors emerged. The Conductor learned to adapt to task difficulty: it 1-shots simple factual questions, but autonomously spins up complex planner-executor-verifier pipelines for hard coding problems.

The results are very promising: the 7B Conductor surpasses the performance of every individual worker model in its pool, setting new records on LiveCodeBench (83.9%) and GPQA-Diamond (87.5%) at the time of publication. It also significantly outperforms expensive multi-agent baselines like Mixture-of-Agents at a fraction of the cost.

One of our favorite features: Recursive Test-Time Scaling! By allowing the Conductor to select itself as a worker, it reads its own team's prior output, realizes if it failed, and spins up a corrective workflow on the fly. This opens a new axis for scaling compute during inference.

This research proves that language models can become elite meta-prompt engineers, dynamically harnessing collective intelligence. Alongside our TRINITY research which we announced a few days earlier, this foundational research powers our new multi-agent system: Sakana Fugu! (sakana.ai/fugu-beta) 🐡

OpenReview: openreview.net/forum?id=U23A2… (ICLR 2026)

33 replies · 158 reposts · 1.3K likes · 156.4K views
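A minimal sketch of the kind of workflow structure described above, where each step names an agent, a natural-language subtask, and which earlier messages it may see (the types and the call_agent client are assumptions for illustration, not Sakana's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    agent: str                 # e.g. "planner-model", "coder-model", "verifier-model"
    instruction: str           # the subtask, written in natural language
    visible_steps: list[int] = field(default_factory=list)  # indices of prior outputs it may read

def run_workflow(steps: list[Step], call_agent) -> list[str]:
    """Execute steps in order; call_agent(agent, prompt) is any LLM client you supply."""
    outputs: list[str] = []
    for step in steps:
        context = "\n\n".join(outputs[i] for i in step.visible_steps)
        prompt = f"{context}\n\n{step.instruction}" if context else step.instruction
        outputs.append(call_agent(step.agent, prompt))
    return outputs
```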
AssistedEvolution retweeted
t.toda @Trtd6Trtd:
arxiv.org/abs/2604.07569 Apparently a study that analyzes the LLM training process through the lens of "forgetting well" rather than "memorizing." The MP3-compression analogy was interesting. Come to think of it, this is natural if you do ML: compressing raw information into the form needed for prediction, and generalizing from it, may be close to the essence of how AI learns.
6 replies · 41 reposts · 320 likes · 16.3K views
AssistedEvolution retweeted
Grigory Sapunov @che_shr_cat:
1/ A 7B model just beat a 671B model at formal theorem proving. The secret is not more data, it is fixing the reward hacking loop in asymmetric self-play. Here is how Stanford researchers broke the RL scaling plateau. 🧵
5 replies · 65 reposts · 510 likes · 33.3K views
AssistedEvolution retweeted
Owen Brake @OwenBrakes:
The US pushed so far down the gyroscope tech tree, 1980s tech is still orders of magnitude better than modern export-controlled hardware.
1990 (Northrop, secret): 0.0001°/hr
2024 (Lockheed, public): 0.003°/hr
2025 (China, public): 0.01°/hr
39 replies · 124 reposts · 2K likes · 135.9K views
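Putting the quoted drift rates side by side (the ratios are computed directly from the numbers in the tweet):

```python
# Gyroscope drift rates quoted above, in degrees per hour (lower is better).
northrop_1990 = 1e-4
lockheed_2024 = 3e-3
china_2025 = 1e-2
print(lockheed_2024 / northrop_1990)  # 30.0  -> the 1990 classified unit is 30x better
print(china_2025 / northrop_1990)     # 100.0 -> and 100x better than the 2025 figure
```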
AssistedEvolution retweeted
Sander Dieleman @sedielem:
Along with Categorical Flow Maps and Flow Map Language Models, we now have three separate papers heralding the triumphant return of continuous methods for language diffusion😶‍🌫️ Can you tell I'm excited?🫨 arxiv.org/abs/2602.12233 arxiv.org/abs/2602.16813 arxiv.org/abs/2604.09784
Michael Albergo @msalbergo:

New paper! Presenting Discrete Flow Maps: paper: arxiv.org/abs/2604.09784 blog: malbergo.me/discrete-flow-…

A laughable problem for me these days is that @nmboffi and I share a research brain, and we have had, time and again, a conversation that ends with "ha, so I guess we're writing the same paper." Soon we will return to just doing it together :). Here we are doing it again with discrete flow maps and flow language models!

A complete and thorough paper led by @PPotaptchik @json_yim @adhisarav @peholderrieth. We took a bit of time to post it to ensure we understood a few more things about the stability of the loss functions. Like @osclsd, @FEijkelboom, and @nmboffi, we think this could be a very helpful paradigm for thinking about fast inference and even better alignment!

Here's our version of the story, and I hope it makes clear how green-field this research direction is: we provide a comprehensive picture of the KL losses you can write from the properties of the flow map, some nice geometric proofs about the mean denoiser and the simplex, and find that at this time, the ESD can actually be the most performant, with some caveats. Excited for everyone to work together and push this class of models to their limit!

5 replies · 44 reposts · 307 likes · 29.2K views
AssistedEvolution retweeted
Yuxuan Mu @YuxuanMu16173:
Can we build a standalone, modular, and reusable naturalness reward for training motor controllers? #SMP is a step toward that vision. Once SMP has been trained on a motion dataset, its priors can be reused to train new controllers to perform diverse tasks while adhering to the behaviors in the dataset, without the original dataset or retraining. 🔥 Excited to share our latest work, SMP: Score-Matching Motion Priors, accepted to @siggraph Webpage: yxmu.foo/smp-page Code: github.com/xbpeng/MimicKit Paper: yxmu.foo/smp-page/asset… Video: youtu.be/jBA2tWk6vzU
7 replies · 64 reposts · 293 likes · 45.2K views
AssistedEvolution retweeted
OptimaLab @optimalab1:
During neural network training, the loss landscape gets sharper until it hits a ceiling. GD pins right at the ceiling. SGD settles below it — and the gap grows as you shrink the batch. Why? We now have the answer. arxiv.org/abs/2604.21016 🧵 Blog: akyrillidis.github.io/aiowls/stochas…
7 replies · 56 reposts · 392 likes · 31.7K views
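A minimal sketch of the sharpness ceiling the thread alludes to, under the assumption that "the ceiling" refers to the classical gradient-descent stability threshold of 2/η (my reading of the claim, not code from the paper): on a quadratic with sharpness λ, GD contracts when λ < 2/η and diverges when λ > 2/η.

```python
# GD on the 1-D quadratic loss 0.5 * lam * w^2 with step size eta multiplies w
# by (1 - eta * lam) each step, so it is stable only while lam < 2/eta.
eta = 0.1                      # 2/eta = 20 is the stability ceiling
for lam in [5.0, 19.0, 21.0]:  # well below, just below, and above the ceiling
    w = 1.0
    for _ in range(100):
        w -= eta * lam * w
    print(f"sharpness={lam:5.1f}  |w| after 100 steps = {abs(w):.3e}")
```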
AssistedEvolution retweeted
Physical Review Materials @PhysRevMater:
New #EdSugg: researchers from @ucsantabarbara present a theoretical study of radiative recombination in hexagonal germanium, demonstrating that strain engineering can drive a pseudodirect-to-direct-gap transition to significantly enhance optical emission. doi.org/10.1103/4m4m-8…
0 replies · 3 reposts · 9 likes · 478 views
AssistedEvolution retweeted
Marcel Butucea @marcel_butucea:
the part that blew my mind: they route each query through a tiny transformer first, run a lightweight self‑test, and only bump it to a big MFM if needed – cuts compute ~30% with <1% loss in accuracy 🤯 #ML #Hardware arxiv.org/abs/2604.21952
0 replies · 1 repost · 0 likes · 36 views
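A minimal sketch of the small-model-first cascade described in the tweet (the threshold, function names, and self-test interface are assumptions for illustration, not the paper's implementation):

```python
# Route a query through a cheap model first; escalate to the big model only
# when a lightweight self-test on the draft answer is not confident enough.
def route_query(query, small_model, large_model, self_test, threshold=0.8):
    draft = small_model(query)
    confidence = self_test(query, draft)  # cheap check, returns a score in [0, 1]
    if confidence >= threshold:
        return draft                      # fast path taken for most queries
    return large_model(query)             # escalate only when the check fails
```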
AssistedEvolution retweeted
Bryan Kelly @BryanKeIIy:
Flat-top vortex pumping enables high-contrast amplification of femtosecond vortex pulse🛸
3 replies · 6 reposts · 18 likes · 357 views
AssistedEvolution retweeted
Adam Taylor @ATaylorFPGA:
Before we can use an FPGA, we need to design it correctly onto its board. Tomas Chester is one of the best there is at this; his latest Horizons article looking at sparse vs dense breakout is one you should definitely read, whether as an electronics engineer or an FPGA designer. issuu.com/fpga-horizons/…
0 replies · 24 reposts · 238 likes · 14K views