ℜ𝗒an (sigmoid.social/@SynapticSage)

328 posts

@SynapticSage

Neural network on a quest for cat videos, brain-inspired ML, CS, and math. Probably in that order. Ph.D. Comp/Sys Neuro, HPC-CTX research; ML/AI engineer

Waltham, MA · Joined June 2012
707 Following · 139 Followers
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Anthropic @AnthropicAI
Research we co-authored on subliminal learning—how LLMs can pass on traits like preferences or misalignment through hidden signals in data—was published today in @Nature. Read the paper: nature.com/articles/s4158…
Owain Evans@OwainEvans_UK

Our paper on Subliminal Learning was just published in Nature! Last July we released our preprint. It showed that LLMs can transmit traits (e.g. liking owls) through data that is unrelated to that trait (numbers that appear meaningless). What’s new?🧵

221 replies · 328 reposts · 2.7K likes · 498.6K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Chris Hayduk @ChrisHayduk
I strongly suspect that Claude Mythos is a looped language model, as described in the paper "Scaling Latent Reasoning via Looped Language Models" from ByteDance. The authors of that paper called out graph search as one of the areas where looping provides a huge theoretical advantage over standard RLVR. And look at where Mythos blows out its competitors the most.
111 replies · 358 reposts · 4K likes · 593.7K views
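The looping idea above can be sketched in miniature: a weight-tied block applied repeatedly buys extra effective compute depth with no extra parameters. This is my toy scalar illustration of the general concept, not code from the ByteDance paper.

```python
import math

def block(h, w=0.5, b=0.3):
    # One weight-tied "layer": a toy scalar stand-in for a transformer
    # block whose parameters (w, b) are shared across all loop iterations.
    return math.tanh(w * h + b)

def looped_forward(h0, loops):
    """Apply the SAME block `loops` times: deeper effective computation
    (here, convergence toward a fixed point) with zero extra parameters."""
    h = h0
    for _ in range(loops):
        h = block(h)
    return h
```

Iterating toward a fixed point is one intuition for why looping might help on tasks like graph search, where answers are naturally computed by repeated propagation rather than by a fixed shallow stack.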
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Andrej Karpathy @karpathy
Software horror: litellm PyPI supply chain attack. A simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, and database passwords.

LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwned. Same for any other large project that depended on litellm. Afaict the poisoned version was up for less than ~1 hour.

The attack had a bug which led to its discovery: Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker hadn't vibe coded this attack, it could have gone undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've grown increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
Daniel Hnyk@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.

1.4K replies · 5.4K reposts · 28K likes · 66.5M views
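A thread like this usually sends people scrambling to check their own environments. As a minimal sketch (the blocklist-table idea and all names here are mine; only the 1.82.8 version number comes from the thread), a standard-library scan for known-bad installed versions might look like:

```python
from importlib import metadata

# Hypothetical local blocklist; the litellm entry reflects the release
# called out in the thread above.
KNOWN_BAD = {"litellm": {"1.82.8"}}

def find_compromised(known_bad):
    """Return (package, version) pairs whose installed version is on the
    blocklist. Packages that are not installed are skipped."""
    hits = []
    for pkg, bad_versions in known_bad.items():
        try:
            version = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            continue
        if version in bad_versions:
            hits.append((pkg, version))
    return hits
```

This only catches versions already known to be bad; the structural defenses are lockfiles and hash pinning (e.g. pip's `--require-hashes` mode), which make a silently swapped release fail to install.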
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Fatih Dinc @fatihdin4en
As always, very interesting work by @scott_linderman et al! They let the dynamical system itself evolve over trials, which allows modeling representational drift. Very interesting read for the neural manifold crowd as well
bioRxiv Neuroscience@biorxiv_neursci

Stiefel Manifold Dynamical Systems for Tracking Representational Drift biorxiv.org/content/10.648… #biorxiv_neursci

0 replies · 15 reposts · 139 likes · 9.9K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
David Clark @d_g_clark
I am totally pumped about this new work. "Task-trained RNNs" are a powerful and influential framework in neuroscience, but have lacked a firm theoretical footing. This work provides one, and makes direct contact with the classical theory of random RNNs. biorxiv.org/content/10.648…
4 replies · 52 reposts · 284 likes · 20.1K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Bo Wang @BoWang87
Prof. Donald Knuth opened his new paper with "Shock! Shock!" Claude Opus 4.6 had just solved an open problem he'd been working on for weeks — a graph decomposition conjecture from The Art of Computer Programming. He named the paper "Claude's Cycles." 31 explorations. ~1 hour. Knuth read the output, wrote the formal proof, and closed with: "It seems I'll have to revise my opinions about generative AI one of these days." The man who wrote the bible of computer science just said that. In a paper named after an AI. Paper: cs.stanford.edu/~knuth/papers/…
154 replies · 1.9K reposts · 9.1K likes · 1.4M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Ilya Sutskever @ilyasut
It’s extremely good that Anthropic has not backed down, and it’s significant that OpenAI has taken a similar stance. In the future, there will be much more challenging situations of this nature, and it will be critical for the relevant leaders to rise to the occasion, and for fierce competitors to put their differences aside. Good to see that happen today.
1.4K replies · 2.5K reposts · 25.6K likes · 3M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Andrew Akbashev @Andrew_Akbashev
A really dangerous situation. Too many submissions. Too many generated papers. Little responsibility.

1. In 2026, more than 24,000 submissions were made to the International Conference on Machine Learning (ICML). That's TWO times more than in 2025. To fight it, the organizers now require researchers to pay $100 for every subsequent paper.
2. LLM adoption has increased researcher productivity by 90% (there's a recent paper in Science).
3. The number of papers is becoming far too high. Submissions to arXiv have risen by 50% since 2022.
4. There are simply not enough reviewers. Plus, many scientists no longer want to invest precious time in it for free.
5. We can't easily distinguish AI-made papers from genuine ones.

Important words from Paul Ginsparg, a co-founder of arXiv: "AI slop frequently can't be discriminated just by looking at abstract, or even by just skimming full text. This makes it an 'existential threat' to the system." Basically, we're getting closer to the tipping point.

📍 Many professors blame the AI. But the problem is likely elsewhere:

1. Without a sufficient number of papers, many PIs can't get funded. They have to prove their credibility to reviewers, and their proposals have to rely on prior publications. In many countries, there are informal (or even formal) expectations for how many papers a group of a certain size has to publish to survive, funding-wise.
2. Our students and postdocs need papers if they want to be hired into faculty roles. Yes, some departments hire people with few publications, but the majority still want to ensure their faculty can get funded. If funding is partly a function of papers, this is used in decision-making.
3. The number of papers is important if you want to get high-level awards. Many of them are not given because you published one paper (even if it's great). They are given because you made a meaningful CONTRIBUTION to the field. How do you make it? Publish more papers.
4. Tenure promotions in many places take the number of your papers into account (often indirectly). Your tenure may get delayed if you don't publish enough. Not everywhere, but for many mid- to low-ranked universities the story is more or less the same.

And there are many more to mention.

📍 My opinion: much of this is rooted in how funding is distributed. There is a strong correlation between the requirements at a university and the funding acquisition criteria. If funding were based ONLY on the quality of published papers, universities would hire people for the quality of their science. If funding agencies strongly discouraged publishing too many papers, universities wouldn't expect numbers from faculty during promotions. And some supervisors wouldn't pressure students and postdocs to publish unfinished studies and low-quality data.

Yes, we need good detectors of fake papers. But we also need the right policies and better funding allocation criteria.
94 replies · 372 reposts · 1.4K likes · 193.8K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Kording Lab 🦖 @KordingLab
Let's compare our world models. I find that different people seem to have rather distinct internal world models. E.g. I personally have neither visual imagination nor an inner voice, and found it weird that others do. Here is a quick Google Form to check the idea: docs.google.com/forms/d/e/1FAI…
16 replies · 15 reposts · 84 likes · 10.2K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Andrej Karpathy @karpathy
A number of people are talking about the implications of AI for schools. I spoke about some of my thoughts to a school board earlier; some highlights:

1. You will never be able to detect the use of AI in homework. Full stop. All "detectors" of AI imo don't really work, can be defeated in various ways, and are in principle doomed to fail. You have to assume that any work done outside the classroom has used AI.

2. Therefore, the majority of grading has to shift to in-class work (instead of at-home assignments), in settings where teachers can physically monitor students. The students remain motivated to learn how to solve problems without AI because they know they will be evaluated without it in class later.

3. We want students to be able to use AI; it is here to stay and it is extremely powerful. But we also don't want students to be naked in the world without it. Using the calculator as an example of a historically disruptive technology: school teaches you how to do all the basic math & arithmetic so that you can in principle do it by hand, even if calculators are pervasive and greatly speed up work in practical settings. In addition, you understand what it's doing for you, so should it give you a wrong answer (e.g. you mistyped the "prompt"), you should be able to notice it, gut check it, verify it in some other way, etc. The verification ability is especially important in the case of AI, which is presently a lot more fallible in a great variety of ways compared to calculators.

4. A lot of the evaluation settings remain at the teacher's discretion and involve a creative design space: no tools, cheatsheets, open book, provided AI responses, direct internet/AI access, etc.

TLDR: the goal is that students are proficient in the use of AI but can also exist without it, and imo the only way to get there is to flip classes around and move the majority of testing to in-class settings.
Andrej Karpathy@karpathy

Gemini Nano Banana Pro can solve exam questions *in* the exam page image. With doodles, diagrams, all that. ChatGPT thinks these solutions are all correct except Se_2P_2 should be "diselenium diphosphide" and a spelling mistake (should be "thiocyanic acid" not "thoicyanic") :O

932 replies · 2.5K reposts · 16.6K likes · 2.5M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Nathan Lambert @natolambert
@sama "Building a strategic national reserve of computing power makes a lot of sense. But this should be for the government’s benefit, not the benefit of private companies." Sounds like The ATOM Project (for open models) atomproject.ai
5 replies · 17 reposts · 168 likes · 27.3K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Google Research @GoogleResearch
Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: goo.gle/47LJrzI @GoogleAI
133 replies · 798 reposts · 4.7K likes · 1.4M views
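The phrase "models as nested optimization problems" can be illustrated with a toy two-level loop: a fast inner variable updated every step and a slow outer variable updated only occasionally. This is my own minimal caricature of nested, multi-timescale optimization, not the Hope architecture.

```python
def nested_descent(steps=200, inner_lr=0.1, outer_lr=0.01, outer_every=10):
    """Minimize f(x, c) = (x - c)**2 + c**2 with two nested levels:
    x (inner, fast) is updated every step; c (outer, slow) only
    every `outer_every` steps - two optimizers, two timescales."""
    x, c = 5.0, 5.0
    for t in range(1, steps + 1):
        x -= inner_lr * 2.0 * (x - c)                  # df/dx = 2(x - c)
        if t % outer_every == 0:
            c -= outer_lr * (2.0 * c - 2.0 * (x - c))  # df/dc = 2c - 2(x - c)
    return x, c
```

The inner level tracks the outer one closely while the outer level drifts slowly toward the joint optimum, which is the basic dynamic that nested formulations of continual learning exploit.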
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Michael Levin @drmichaellevin
Final version is out: @SantoshManicka cell.com/cell-reports-p… "Field-mediated bioelectric basis of morphogenetic prepatterning" #morphogenesis #bioelectricity #fields "Intercellular bioelectric communication plays an important role in morphogenesis, often modeled using localized non-neural networks generating spatial patterns of membrane potential (Vmem). Here, we find that the electrostatic field contributes to this process, via a synergetics (à la Haken)-based mechanism, by enhancing the complexity of Vmem patterns through a coarse-grained projection. We leverage this property of the field to automatically optimize transient signals from a symmetry-breaking organizer region in the boundary of the tissue to mold Vmem patterns in the bulk. Two models optimized in this way exhibit contrasting “mosaic” and “stigmergic” pattern-coding strategies, depending on their field sensitivity strengths. Interestingly, the stigmergic model recapitulates the qualitative developmental sequence of the bioelectric craniofacial prepattern observed in frog embryos. These results highlight the potential of the electric field both as a facilitator of collective patterning and as a macroscale interventional target for applications in regenerative medicine and bioengineering."
22 replies · 84 reposts · 404 likes · 54.6K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Chenfeng_X @Chenfeng_X
Happy to share that two of our papers were accepted at @NeurIPSConf 2025 as #Spotlight papers!

1. 👼 Angles Don't Lie: Unlocking Training-Efficient RL from a Model's Own Signals
TL;DR: Token angles, the model's self-generated signals, can reveal how well it grasps the data. By using them to drive data sampling, you can boost RL training speed by 2–2.5× with just a few lines of code.
📚 Paper: arxiv.org/pdf/2506.02281
📷 Code: github.com/wangqinsi1/GAI…

2. Sparse VideoGen2 (SVG2)
TL;DR: We identify two main issues in existing sparse attention methods: inaccurate identification and computation waste. SVG2 resolves them by grouping tokens by semantic meaning instead of position, turning a scattered, inefficient attention pattern into a dense, GPU-friendly one.
📚 Paper: arxiv.org/abs/2505.18875
📷 Code: github.com/svg-project/Sp…
📷 svg-project.github.io/v2/
📷 Attention Kernel: docs.flashinfer.ai/api/sparse.html

P.S. We also developed an interesting flash-kmeans tool (lnkd.in/gFv_jheH) for this project! This will not only benefit VideoGen but also broader domains like science data generation. Let's chat more if you're interested!
18 replies · 32 reposts · 295 likes · 21K views
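SVG2's central move, grouping tokens by semantic similarity rather than by position before applying sparse attention, can be sketched with a naive k-means over scalar stand-in "embeddings". Everything below is my illustration of the grouping idea, not the paper's GPU implementation:

```python
import random

def kmeans_1d(points, k, iters=10, seed=0):
    """Naive 1-D k-means: returns a cluster id for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: abs(p - centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return assign

def semantic_attention_mask(assign):
    """Dense-within-cluster mask: token i attends to token j only when
    they fall in the same semantic cluster, regardless of position."""
    n = len(assign)
    return [[assign[i] == assign[j] for j in range(n)] for i in range(n)]
```

For embeddings like [0.1, 5.0, 0.2, 5.1], tokens 0 and 2 group together despite being non-adjacent; reordering tokens by cluster then turns a scattered attention pattern into dense blocks, which is exactly the GPU-friendliness the tweet describes.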
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Harrison Kinsley @Sentdex
This is incredible
1K replies · 1.8K reposts · 23.3K likes · 2.8M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Prof. Anima Anandkumar @AnimaAnandkumar
Physics-AI that can generalize across different complex 3D geometries is a challenging problem. We propose a principled solution combining Neural Operators with Optimal Transport. Optimal transport determines the most efficient transformation between two densities. By interpreting surface meshes as continuous density functions, we formulate the geometry embedding problem as an optimal transport problem that maps these mesh density functions to uniform density functions on a sphere. Optimal transport inherently preserves the structural properties of the mesh while ensuring a smooth, physically meaningful transformation. Combining this with Neural Operators allows us to generalize geometry learning from discretized mesh points to mesh density functions, and to work across resolutions. arxiv.org/pdf/2507.20065
15 replies · 63 reposts · 547 likes · 35.4K views
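In one dimension, the optimal transport map from a density onto the uniform density is simply the source CDF (the monotone rearrangement), which gives a tiny empirical sketch of the general idea. The sphere construction and neural-operator coupling in the paper are of course much richer; this is only my 1-D illustration.

```python
def transport_to_uniform(samples):
    """Empirical 1-D optimal transport onto Uniform(0, 1):
    T(x_i) = (rank(x_i) + 0.5) / n, i.e. the midpoint empirical CDF.
    The map is monotone, so it preserves the ordering structure of
    the input - the 1-D analogue of OT preserving mesh structure."""
    n = len(samples)
    order = sorted(range(n), key=lambda i: samples[i])
    t = [0.0] * n
    for rank, i in enumerate(order):
        t[i] = (rank + 0.5) / n
    return t
```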
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Mackenzie Weygandt Mathis, PhD @TrackingActions
My lab has been pushing into explainable, robust, & theoretically-tractable AI models for science 💪 🚨 At #ICCV2025 we introduce #DISTIL, led by amazing PhD student @hsirm96: a trigger-inversion method for DNNs that reconstructs malicious backdoor triggers. 1/4
2 replies · 10 reposts · 38 likes · 4.1K views
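Trigger inversion in general works by optimizing an input perturbation that reliably flips a model's prediction to a target class; recovering a small, effective perturbation is evidence of a planted backdoor. Below is my generic sketch of that search on a linear classifier with a hand-derived gradient; it illustrates the idea only and is not DISTIL's method.

```python
def invert_trigger(W, x, target, lr=0.5, steps=20, budget=1.0):
    """Search for a small additive delta (L-inf bounded by `budget`)
    that makes the linear classifier logits = W @ (x + delta)
    predict `target`. Returns the candidate trigger delta."""
    d = [0.0] * len(x)
    for _ in range(steps):
        logits = [sum(wk * (xk + dk) for wk, xk, dk in zip(row, x, d))
                  for row in W]
        pred = max(range(len(W)), key=lambda j: logits[j])
        if pred == target:
            break  # candidate trigger found
        # gradient of (logit_target - logit_pred) w.r.t. delta
        grad = [W[target][k] - W[pred][k] for k in range(len(x))]
        d = [min(budget, max(-budget, dk + lr * gk))
             for dk, gk in zip(d, grad)]
    return d
```

A defender would run this per candidate target class and flag classes for which an unusually small delta suffices.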
Rimsha Bhardwaj @heyrimsha
🚨 BREAKING: NVIDIA just exposed the dirty secret about LLMs. Their new paper proves SLMs outperform massive models in real-world applications. AI researchers are quietly pivoting overnight. 10 wild findings that change everything:
85 replies · 574 reposts · 3.8K likes · 474.7K views