ℜ𝗒an (sigmoid.social/@SynapticSage)

328 posts

@SynapticSage

Neural network on a quest for cat videos, brain-inspired ML, CS, and math. Probably in that order. Ph.D. Comp/Sys Neuro, HPC-CTX research; ML/AI engineer

Waltham, MA · Joined June 2012
707 Following · 139 Followers
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Anthropic @AnthropicAI
Research we co-authored on subliminal learning—how LLMs can pass on traits like preferences or misalignment through hidden signals in data—was published today in @Nature. Read the paper: nature.com/articles/s4158…
Owain Evans@OwainEvans_UK

Our paper on Subliminal Learning was just published in Nature! Last July we released our preprint. It showed that LLMs can transmit traits (e.g. liking owls) through data that is unrelated to that trait (numbers that appear meaningless). What’s new?🧵

221 replies · 328 reposts · 2.7K likes · 498.6K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Chris Hayduk @ChrisHayduk
I strongly suspect that Claude Mythos is a looped language model, as described in the paper "Scaling Latent Reasoning via Looped Language Models" from ByteDance. The authors of that paper called out graph search as one of the areas where looping provides a huge theoretical advantage over standard RLVR. And look at where Mythos blows out its competitors the most.
111 replies · 358 reposts · 4K likes · 593.7K views
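The looping idea above can be sketched in miniature: a weight-tied block applied repeatedly buys extra effective compute depth with no extra parameters. This is my toy scalar illustration of the general concept, not code from the ByteDance paper.

```python
import math

def block(h, w=0.5, b=0.3):
    # One weight-tied "layer": a toy scalar stand-in for a transformer
    # block whose parameters (w, b) are shared across all loop iterations.
    return math.tanh(w * h + b)

def looped_forward(h0, loops):
    """Apply the SAME block `loops` times: deeper effective computation
    (here, convergence toward a fixed point) with zero extra parameters."""
    h = h0
    for _ in range(loops):
        h = block(h)
    return h
```

Iterating toward a fixed point is one intuition for why looping might help on tasks like graph search, where answers are naturally computed by repeated propagation rather than by a fixed shallow stack.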
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Andrej Karpathy @karpathy
Software horror: litellm PyPI supply chain attack. A simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, and database passwords.

LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwned. Same for any other large project that depended on litellm. Afaict the poisoned version was up for less than ~1 hour.

The attack had a bug which led to its discovery: Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker hadn't vibe coded this attack, it could have gone undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials stolen in each attack can then be used to take over more accounts and compromise more packages.

Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've grown increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.
Daniel Hnyk@hnykda

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM PyPI release 1.82.8 has been compromised: it contains litellm_init.pth with base64-encoded instructions to send all the credentials it can find to a remote server and self-replicate. Link below.

1.4K replies · 5.4K reposts · 28K likes · 66.5M views
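A thread like this usually sends people scrambling to check their own environments. As a minimal sketch (the blocklist-table idea and all names here are mine; only the 1.82.8 version number comes from the thread), a standard-library scan for known-bad installed versions might look like:

```python
from importlib import metadata

# Hypothetical local blocklist; the litellm entry reflects the release
# called out in the thread above.
KNOWN_BAD = {"litellm": {"1.82.8"}}

def find_compromised(known_bad):
    """Return (package, version) pairs whose installed version is on the
    blocklist. Packages that are not installed are skipped."""
    hits = []
    for pkg, bad_versions in known_bad.items():
        try:
            version = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            continue
        if version in bad_versions:
            hits.append((pkg, version))
    return hits
```

This only catches versions already known to be bad; the structural defenses are lockfiles and hash pinning (e.g. pip's `--require-hashes` mode), which make a silently swapped release fail to install.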
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Fatih Dinc @fatihdin4en
As always, very interesting work by @scott_linderman et al! They let the dynamical system itself evolve over trials, which allows modeling representational drift. Very interesting read for the neural manifold crowd as well
bioRxiv Neuroscience@biorxiv_neursci

Stiefel Manifold Dynamical Systems for Tracking Representational Drift biorxiv.org/content/10.648… #biorxiv_neursci

0 replies · 15 reposts · 139 likes · 9.9K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
David Clark @d_g_clark
I am totally pumped about this new work. "Task-trained RNNs" are a powerful and influential framework in neuroscience, but have lacked a firm theoretical footing. This work provides one, and makes direct contact with the classical theory of random RNNs. biorxiv.org/content/10.648…
4 replies · 52 reposts · 284 likes · 20.1K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Bo Wang @BoWang87
Prof. Donald Knuth opened his new paper with "Shock! Shock!" Claude Opus 4.6 had just solved an open problem he'd been working on for weeks — a graph decomposition conjecture from The Art of Computer Programming. He named the paper "Claude's Cycles." 31 explorations. ~1 hour. Knuth read the output, wrote the formal proof, and closed with: "It seems I'll have to revise my opinions about generative AI one of these days." The man who wrote the bible of computer science just said that. In a paper named after an AI. Paper: cs.stanford.edu/~knuth/papers/…
154 replies · 1.9K reposts · 9.1K likes · 1.4M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Ilya Sutskever @ilyasut
It’s extremely good that Anthropic has not backed down, and it’s significant that OpenAI has taken a similar stance. In the future, there will be much more challenging situations of this nature, and it will be critical for the relevant leaders to rise to the occasion, and for fierce competitors to put their differences aside. Good to see that happen today.
1.4K replies · 2.5K reposts · 25.6K likes · 3M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Andrew Akbashev @Andrew_Akbashev
A really dangerous situation. Too many submissions. Too many generated papers. Little responsibility.

1. In 2026, more than 24,000 submissions were made to the International Conference on Machine Learning (ICML). That's TWO times more than in 2025. To fight it, the organizers now require researchers to pay $100 for every subsequent paper.
2. LLM adoption has increased researcher productivity by 90% (there's a recent paper in Science).
3. The number of papers is becoming far too high. Submissions to arXiv have risen by 50% since 2022.
4. There are simply not enough reviewers. Plus, many scientists no longer want to invest precious time in it for free.
5. We can't easily distinguish AI-made papers from genuine ones.

Important words from Paul Ginsparg, a co-founder of arXiv: "AI slop frequently can't be discriminated just by looking at abstract, or even by just skimming full text. This makes it an 'existential threat' to the system." Basically, we're getting closer to the tipping point.

📍 Many professors blame the AI. But the problem is likely elsewhere:

1. Without a sufficient number of papers, many PIs can't get funded. They have to prove their credibility to reviewers, and their proposals have to rely on prior publications. In many countries, there are informal (or even formal) expectations for how many papers a group of a certain size has to publish to survive, funding-wise.
2. Our students and postdocs need papers if they want to be hired into faculty roles. Yes, some departments hire people with few publications, but the majority still want to ensure their faculty can get funded. If funding is partly a function of papers, this is used in decision-making.
3. The number of papers is important if you want to get high-level awards. Many of them are not given because you published one paper (even if it's great). They are given because you made a meaningful CONTRIBUTION to the field. How do you make it? Publish more papers.
4. Tenure promotions in many places take the number of your papers into account (often indirectly). Your tenure may get delayed if you don't publish enough. Not everywhere, but for many mid- to low-ranked universities the story is more or less the same.

And there are many more to mention.

📍 My opinion: much of this is rooted in how funding is distributed. There is a strong correlation between the requirements at a university and the funding acquisition criteria. If funding were based ONLY on the quality of published papers, universities would hire people for the quality of their science. If funding agencies strongly discouraged publishing too many papers, universities wouldn't expect numbers from faculty during promotions. And some supervisors wouldn't pressure students and postdocs to publish unfinished studies and low-quality data.

Yes, we need good detectors of fake papers. But we also need the right policies and better funding allocation criteria.
94 replies · 372 reposts · 1.4K likes · 193.8K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Kording Lab 🦖 @KordingLab
Let's compare our world models. I find that different people seem to have rather distinct internal world models. E.g. I personally have neither visual imagination nor an inner voice, and found it weird that others do. Here is a quick Google Form to check the idea: docs.google.com/forms/d/e/1FAI…
16 replies · 15 reposts · 84 likes · 10.2K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Andrej Karpathy @karpathy
A number of people are talking about the implications of AI for schools. I spoke about some of my thoughts to a school board earlier; some highlights:

1. You will never be able to detect the use of AI in homework. Full stop. All "detectors" of AI imo don't really work, can be defeated in various ways, and are in principle doomed to fail. You have to assume that any work done outside the classroom has used AI.

2. Therefore, the majority of grading has to shift to in-class work (instead of at-home assignments), in settings where teachers can physically monitor students. The students remain motivated to learn how to solve problems without AI because they know they will be evaluated without it in class later.

3. We want students to be able to use AI; it is here to stay and it is extremely powerful. But we also don't want students to be naked in the world without it. Using the calculator as an example of a historically disruptive technology: school teaches you how to do all the basic math & arithmetic so that you can in principle do it by hand, even if calculators are pervasive and greatly speed up work in practical settings. In addition, you understand what it's doing for you, so should it give you a wrong answer (e.g. you mistyped the "prompt"), you should be able to notice it, gut check it, verify it in some other way, etc. The verification ability is especially important in the case of AI, which is presently a lot more fallible in a great variety of ways compared to calculators.

4. A lot of the evaluation settings remain at the teacher's discretion and involve a creative design space: no tools, cheatsheets, open book, provided AI responses, direct internet/AI access, etc.

TLDR: the goal is that students are proficient in the use of AI but can also exist without it, and imo the only way to get there is to flip classes around and move the majority of testing to in-class settings.
Andrej Karpathy@karpathy

Gemini Nano Banana Pro can solve exam questions *in* the exam page image. With doodles, diagrams, all that. ChatGPT thinks these solutions are all correct except Se_2P_2 should be "diselenium diphosphide" and a spelling mistake (should be "thiocyanic acid" not "thoicyanic") :O

932 replies · 2.5K reposts · 16.6K likes · 2.5M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Nathan Lambert @natolambert
@sama "Building a strategic national reserve of computing power makes a lot of sense. But this should be for the government’s benefit, not the benefit of private companies." Sounds like The ATOM Project (for open models) atomproject.ai
5 replies · 17 reposts · 168 likes · 27.3K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Google Research @GoogleResearch
Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: goo.gle/47LJrzI @GoogleAI
133 replies · 798 reposts · 4.7K likes · 1.4M views
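The phrase "models as nested optimization problems" can be illustrated with a toy two-level loop: a fast inner variable updated every step and a slow outer variable updated only occasionally. This is my own minimal caricature of nested, multi-timescale optimization, not the Hope architecture.

```python
def nested_descent(steps=200, inner_lr=0.1, outer_lr=0.01, outer_every=10):
    """Minimize f(x, c) = (x - c)**2 + c**2 with two nested levels:
    x (inner, fast) is updated every step; c (outer, slow) only
    every `outer_every` steps - two optimizers, two timescales."""
    x, c = 5.0, 5.0
    for t in range(1, steps + 1):
        x -= inner_lr * 2.0 * (x - c)                  # df/dx = 2(x - c)
        if t % outer_every == 0:
            c -= outer_lr * (2.0 * c - 2.0 * (x - c))  # df/dc = 2c - 2(x - c)
    return x, c
```

The inner level tracks the outer one closely while the outer level drifts slowly toward the joint optimum, which is the basic dynamic that nested formulations of continual learning exploit.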
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Michael Levin @drmichaellevin
Final version is out: @SantoshManicka cell.com/cell-reports-p… "Field-mediated bioelectric basis of morphogenetic prepatterning" #morphogenesis #bioelectricity #fields "Intercellular bioelectric communication plays an important role in morphogenesis, often modeled using localized non-neural networks generating spatial patterns of membrane potential (Vmem). Here, we find that the electrostatic field contributes to this process, via a synergetics (à la Haken)-based mechanism, by enhancing the complexity of Vmem patterns through a coarse-grained projection. We leverage this property of the field to automatically optimize transient signals from a symmetry-breaking organizer region in the boundary of the tissue to mold Vmem patterns in the bulk. Two models optimized in this way exhibit contrasting “mosaic” and “stigmergic” pattern-coding strategies, depending on their field sensitivity strengths. Interestingly, the stigmergic model recapitulates the qualitative developmental sequence of the bioelectric craniofacial prepattern observed in frog embryos. These results highlight the potential of the electric field both as a facilitator of collective patterning and as a macroscale interventional target for applications in regenerative medicine and bioengineering."
22 replies · 84 reposts · 404 likes · 54.6K views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Chenfeng_X @Chenfeng_X
Happy to share that two of our papers were accepted at @NeurIPSConf 2025 as #Spotlight papers!

1. 👼 Angles Don't Lie: Unlocking Training-Efficient RL from a Model's Own Signals
TL;DR: Token angles, the model's self-generated signals, can reveal how well it grasps the data. By using them to drive data sampling, you can boost RL training speed by 2–2.5× with just a few lines of code.
📚 Paper: arxiv.org/pdf/2506.02281
📷 Code: github.com/wangqinsi1/GAI…

2. Sparse VideoGen2 (SVG2)
TL;DR: We identify two main issues in existing sparse attention methods: inaccurate identification and computation waste. SVG2 resolves them by grouping tokens by semantic meaning instead of position, turning a scattered, inefficient attention pattern into a dense, GPU-friendly one.
📚 Paper: arxiv.org/abs/2505.18875
📷 Code: github.com/svg-project/Sp…
📷 svg-project.github.io/v2/
📷 Attention Kernel: docs.flashinfer.ai/api/sparse.html

P.S. We also developed an interesting flash-kmeans tool (lnkd.in/gFv_jheH) for this project! This will not only benefit VideoGen but also broader domains like science data generation. Let's chat more if you're interested!
18 replies · 32 reposts · 295 likes · 21K views
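SVG2's central move, grouping tokens by semantic similarity rather than by position before applying sparse attention, can be sketched with a naive k-means over scalar stand-in "embeddings". Everything below is my illustration of the grouping idea, not the paper's GPU implementation:

```python
import random

def kmeans_1d(points, k, iters=10, seed=0):
    """Naive 1-D k-means: returns a cluster id for each point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: abs(p - centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return assign

def semantic_attention_mask(assign):
    """Dense-within-cluster mask: token i attends to token j only when
    they fall in the same semantic cluster, regardless of position."""
    n = len(assign)
    return [[assign[i] == assign[j] for j in range(n)] for i in range(n)]
```

For embeddings like [0.1, 5.0, 0.2, 5.1], tokens 0 and 2 group together despite being non-adjacent; reordering tokens by cluster then turns a scattered attention pattern into dense blocks, which is exactly the GPU-friendliness the tweet describes.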
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Harrison Kinsley @Sentdex
This is incredible
1K replies · 1.8K reposts · 23.3K likes · 2.8M views
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Prof. Anima Anandkumar @AnimaAnandkumar
Physics-AI that can generalize across different complex 3D geometries is a challenging problem. We propose a principled solution combining Neural Operators with Optimal Transport. Optimal transport determines the most efficient transformation between two densities. By interpreting surface meshes as continuous density functions, we formulate the geometry embedding problem as an optimal transport problem that maps these mesh density functions to uniform density functions on a sphere. Optimal transport inherently preserves the structural properties of the mesh while ensuring a smooth, physically meaningful transformation. Combining this with Neural Operators allows us to generalize geometry learning from discretized mesh points to mesh density functions, and to work across resolutions. arxiv.org/pdf/2507.20065
15 replies · 63 reposts · 547 likes · 35.4K views
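In one dimension, the optimal transport map from a density onto the uniform density is simply the source CDF (the monotone rearrangement), which gives a tiny empirical sketch of the general idea. The sphere construction and neural-operator coupling in the paper are of course much richer; this is only my 1-D illustration.

```python
def transport_to_uniform(samples):
    """Empirical 1-D optimal transport onto Uniform(0, 1):
    T(x_i) = (rank(x_i) + 0.5) / n, i.e. the midpoint empirical CDF.
    The map is monotone, so it preserves the ordering structure of
    the input - the 1-D analogue of OT preserving mesh structure."""
    n = len(samples)
    order = sorted(range(n), key=lambda i: samples[i])
    t = [0.0] * n
    for rank, i in enumerate(order):
        t[i] = (rank + 0.5) / n
    return t
```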
ℜ𝗒an (sigmoid.social/@SynapticSage) retweeted
Mackenzie Weygandt Mathis, PhD @TrackingActions
My lab has been pushing into explainable, robust, & theoretically-tractable AI models for science 💪 🚨 At #ICCV2025 we introduce #DISTIL, led by amazing PhD student @hsirm96: a trigger-inversion method for DNNs that reconstructs malicious backdoor triggers. 1/4
2 replies · 10 reposts · 38 likes · 4.1K views
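Trigger inversion in general works by optimizing an input perturbation that reliably flips a model's prediction to a target class; recovering a small, effective perturbation is evidence of a planted backdoor. Below is my generic sketch of that search on a linear classifier with a hand-derived gradient; it illustrates the idea only and is not DISTIL's method.

```python
def invert_trigger(W, x, target, lr=0.5, steps=20, budget=1.0):
    """Search for a small additive delta (L-inf bounded by `budget`)
    that makes the linear classifier logits = W @ (x + delta)
    predict `target`. Returns the candidate trigger delta."""
    d = [0.0] * len(x)
    for _ in range(steps):
        logits = [sum(wk * (xk + dk) for wk, xk, dk in zip(row, x, d))
                  for row in W]
        pred = max(range(len(W)), key=lambda j: logits[j])
        if pred == target:
            break  # candidate trigger found
        # gradient of (logit_target - logit_pred) w.r.t. delta
        grad = [W[target][k] - W[pred][k] for k in range(len(x))]
        d = [min(budget, max(-budget, dk + lr * gk))
             for dk, gk in zip(d, grad)]
    return d
```

A defender would run this per candidate target class and flag classes for which an unusually small delta suffices.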
Rimsha Bhardwaj @heyrimsha
🚨 BREAKING: NVIDIA just exposed the dirty secret about LLMs. Their new paper proves SLMs outperform massive models in real-world applications. AI researchers are quietly pivoting overnight. 10 wild findings that change everything:
85 replies · 574 reposts · 3.8K likes · 474.7K views