
Dylan Tull
282 posts

@dylantull
𝔠𝔯𝔢𝔞𝔱𝔦𝔳𝔢 𝔰𝔱𝔯𝔞𝔱𝔢𝔤𝔦𝔰𝔱 ⸻ 𝔱𝔥𝔢𝔬𝔯𝔦𝔰𝔱 ⸻ 𝔴𝔥𝔬𝔩𝔢-𝔰𝔶𝔰𝔱𝔢𝔪𝔰 𝔡𝔢𝔰𝔦𝔤𝔫𝔢𝔯





A longstanding dream of interp is to decompose activations into distinct, interpretable parts. But when should we expect that to work, and what even are such parts? New from Simplex: transformers factor their world into orthogonal subspaces, even when it costs accuracy.🧵👇


Fractal viroid hub with @threejs


🚨 UPDATE: Mini Shai-Hulud has crossed from @npmjs into @pypi and is still spreading.

Newly confirmed compromised artifacts:
@opensearch-project/opensearch: 3.5.3, 3.6.2, 3.7.0, 3.8.0 (1.3M weekly downloads)
mistralai: 2.4.6 on PyPI
guardrails-ai: 0.10.1 on PyPI
additional @squawk/* packages on npm

guardrails-ai 0.10.1 executes malicious code on import. On Linux, it downloads git-tanstack[.]com/transformers.pyz, writes it to /tmp/transformers.pyz, and runs it with python3 without integrity verification.

The git-tanstack.com domain displayed a message signed “With Love TeamPCP,” along with: “We've been online over 2 hours now stealing creds Regardless I just came to say hello :^)”

The page also linked to a YouTube video and you can probably guess which one.
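For anyone wanting to check a Python environment against the PyPI versions named in the post above, here is a rough sketch. It assumes only the two PyPI versions and the /tmp path listed there; the names `COMPROMISED`, `DROPPED_FILE`, and `find_compromised` are made up for illustration and are not from any advisory tooling. It reads installed-package metadata instead of importing anything, since the post says guardrails-ai 0.10.1 runs its payload on import.

```python
from importlib import metadata
from pathlib import Path

# PyPI versions named in the post; nothing beyond that list is assumed.
COMPROMISED = {
    "mistralai": {"2.4.6"},
    "guardrails-ai": {"0.10.1"},
}

# Path the post says the payload is written to on Linux.
DROPPED_FILE = Path("/tmp/transformers.pyz")


def find_compromised(installed_version=metadata.version):
    """Return {package: version} for installed packages on the list.

    Only package metadata is read; the packages are never imported.
    `installed_version` is injectable so the lookup can be faked in tests.
    """
    hits = {}
    for name, bad_versions in COMPROMISED.items():
        try:
            version = installed_version(name)
        except metadata.PackageNotFoundError:
            continue  # package is not installed in this environment
        if version in bad_versions:
            hits[name] = version
    return hits


if __name__ == "__main__":
    for name, version in find_compromised().items():
        print(f"WARNING: {name}=={version} matches a reported compromised version")
    if DROPPED_FILE.exists():
        print(f"WARNING: {DROPPED_FILE} exists; the post names this as the payload path")
```

A clean result here only means these specific versions and this one file were not found. It is not evidence that a machine is unaffected, and it does not cover the npm packages listed in the post.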





are we supposed to never question this?











