❖Prisma Dimensional❖

6.3K posts

@PrismaDimens

🔥 Code geek | 🚀 Billions of iterations in C for fun | 💻 Linux | 🤖 Dreaming of quantum ASICs while laughing & creating wild AI models.

P=NP México · Joined August 2024
969 Following · 336 Followers
Pinned Tweet
❖Prisma Dimensional❖@PrismaDimens·
Title: Generating Neural Networks with Hypernetworks: A MNIST Experiment

Introduction

Imagine training one neural network that can instantly generate other neural networks tailored to specific tasks, no retraining required. That’s the promise of hypernetworks, a meta-learning technique in which a "parent" network produces the weights of a "child" network. In this experiment, I used a hypernetwork to generate autoencoders for reconstructing MNIST digits, exploring variations, minimal forms, and combinations. Here’s how it works and why it’s exciting.

The Setup: Autoencoders and MNIST

An autoencoder is a simple neural network that compresses data (e.g., a 28x28 MNIST image) into a smaller latent space (64 dimensions here) and then reconstructs it. I trained multiple autoencoders on the MNIST dataset of handwritten digits (0–9), but with a twist:

- Variations: for each digit, I trained three autoencoders on different styles (e.g., slanted or thick '1's), identified via K-means clustering.
- Minimal forms: one autoencoder per digit captured its "average" or canonical version.
- Combinations: ten autoencoders handled specific digit groups (e.g., 0 and 1, or 0, 2, and 4).

This gave me 50 autoencoders (10 digits × 4 models each + 10 combinations), each with weights optimized for its task.

The Hypernetwork: A Weight Factory

Instead of storing 50 separate models, I trained a single hypernetwork to generate their weights on demand. Here’s the process:

- Input: a 23-dimensional vector encoding:
  - Digit ID (10-D one-hot, e.g., [0, 1, 0, ...] for digit 1).
  - Variation ID (3-D one-hot, e.g., [0, 1, 0] for variation 1, or all zeros for the minimal form).
  - Combination ID (10-D multi-hot, e.g., [1, 0, 1, 0, ...] for digits 0 and 2).
- Output: a 100,576-element tensor of flattened autoencoder weights (computed as 784×64 + 64 + 64×784 + 784 across its layers).
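To make the input encoding concrete, here is a minimal sketch of the 23-D conditioning vector in plain Python. The function name and argument layout are my own; only the dimensions (10 digit + 3 variation + 10 combination) come from the post.

```python
def make_condition(digit=None, variation=None, combo=()):
    """Build the 23-D hypernetwork input described above.

    digit:     int 0-9 -> one-hot in dims 0-9
    variation: int 0-2 -> one-hot in dims 10-12 (None = minimal form, all zeros)
    combo:     digits  -> multi-hot in dims 13-22
    """
    v = [0.0] * 23
    if digit is not None:
        v[digit] = 1.0            # digit ID, 10-D one-hot
    if variation is not None:
        v[10 + variation] = 1.0   # variation ID, 3-D one-hot
    for d in combo:
        v[13 + d] = 1.0           # combination ID, 10-D multi-hot
    return v

# Digit 1, variation 1 -> the kind of input vector quoted in the post
cond = make_condition(digit=1, variation=1)
```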
- Training: the hypernetwork learned to map these inputs to the weights of the 50 trained autoencoders using mean squared error loss, running on a GPU for speed.

Making It Work: Inference Without Retraining

Once trained, the hypernetwork acts like a factory. Want an autoencoder for digit 1, variation 1? Feed it [0, 1, 0, ..., 0, 1, 0, 0, 0, 0, 0] (digit 1 + variation 1), and it outputs the weights. These weights are then loaded into a fresh autoencoder: a loop iterates over the autoencoder’s parameters (e.g., encoder weights, biases), reshaping chunks of the hypernetwork’s output to match each layer’s shape. The result? A fully parameterized autoencoder ready to reconstruct images, no training needed.

I tested it with three cases:

- Digit 1, variation 1: reconstructed stylized '1's.
- Digit 1, minimal: produced a clean, average '1'.
- Digits 0, 2, 4: handled a mix of digits from one trained combination.

Visualizations showed the inputs and outputs side by side, proof the concept works!

Why This Matters

- Efficiency: one hypernetwork replaces dozens of models. It generates weights in a single forward pass (milliseconds) instead of training from scratch (minutes).
- Flexibility: control digit styles or combinations with a simple input tweak.
- Scalability: imagine extending this to bigger networks or tasks; hypernetworks could dynamically adapt models on the fly.

Challenges and Next Steps

- Generalization: the hypernetwork is tied to the 50 scenarios it learned. To handle new combinations (e.g., 1, 5, 7), I’d expand the training data with more digit mixes.
- Complexity: outputting 100,576 weights limits scalability. A deeper hypernetwork could struggle with bigger models, so I’d explore scaling the autoencoder down (e.g., a smaller latent space) for efficiency, or up to handle complex tasks, balancing size and performance.
- Beyond weights: right now, the hypernetwork generates weights for a fixed autoencoder architecture.
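The weight-loading loop can be sketched as follows. This is a hedged NumPy version, not the author's actual code: it assumes a single 784→64→784 autoencoder, a (out, in) weight layout, and a parameter order of encoder weight, encoder bias, decoder weight, decoder bias.

```python
import numpy as np

# Assumed parameter order and shapes for a 784 -> 64 -> 784 autoencoder:
# encoder weight, encoder bias, decoder weight, decoder bias.
SHAPES = [(64, 784), (64,), (784, 64), (784,)]

def unflatten(flat, shapes=SHAPES):
    """Slice the hypernetwork's flat output into per-layer arrays."""
    params, offset = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        params.append(flat[offset:offset + n].reshape(shape))
        offset += n
    assert offset == flat.size, "flat vector must exactly cover all parameters"
    return params
```

In a PyTorch setting the same loop would iterate over `model.parameters()` and copy each reshaped chunk in place instead of collecting arrays.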
Next, I’d upgrade the hypernetwork to design the architecture too, say, predicting layer sizes or types (e.g., adding convolutions). This could mean outputting a variable-length tensor describing both structure and weights, pushing it toward true model generation.

Quality: reconstructions were solid but blurry. More epochs, a beefier hypernetwork, or architecture tweaks (e.g., deeper layers) could sharpen them.

Future experiments? Train on all digit variations, test unseen combinations, scale the model up or down, and let the hypernetwork dream up architectures, maybe even swap in a classifier to predict labels instead of reconstructing.

Conclusion

This MNIST experiment shows that hypernetworks can generate functional neural networks instantly, blending creativity (variations) with practicality (combinations). It’s a step toward a future where models are built dynamically, not trained individually. Code’s below; try it out, tweak it, and let me know what you think on X!
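As a rough illustration of the training objective, and not the author's code, here is a deliberately tiny NumPy hypernetwork: a single linear map from the 23-D task code to a flat weight vector, regressed with MSE toward a target weight vector. The input/output sizes follow the post's quoted formula; everything else (learning rate, initialization, single-layer form) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
COND = 23                             # task-code dimension from the post
OUT = 784 * 64 + 64 + 64 * 784 + 784  # flat autoencoder weight count

# A deliberately tiny hypernetwork: one linear layer, no hidden units.
W = rng.normal(scale=0.01, size=(OUT, COND))

def hyper_forward(cond):
    """One forward pass: task code in, full flat weight vector out."""
    return W @ cond

def train_step(cond, target, lr=0.1):
    """One MSE regression step toward a trained autoencoder's flat weights."""
    global W
    pred = hyper_forward(cond)
    grad = np.outer(pred - target, cond) * (2.0 / OUT)  # d(MSE)/dW
    W -= lr * grad
    return float(np.mean((pred - target) ** 2))
```

The real setup would use a deeper MLP and batch all 50 (task code, flattened weights) pairs, but the loss and the one-forward-pass generation property are the same.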
❖Prisma Dimensional❖ retweeted
martin_casado@martin_casado·
Very, very impressive. Raycasting for collisions without a mesh directly on a splat.
𝐑𝐔𝐁𝐄𝐍🥽𝐅𝐑𝐎@rubenfro

Been experimenting with procedural locomotion on #GaussianSplat environments from @theworldlabs. Built a little multi-legged robot in Unity that raycasts against the splat data to figure out where to place its feet, no meshes or colliders involved... Pretty fun to climb surfaces, walk on walls, and set a weird number of legs on the fly :) Still rough around the edges but fun to watch it figure things out. #GaussianSplatting #Unity3D #WorldLabs #ProceduralAnimation

❖Prisma Dimensional❖@PrismaDimens·
Deploying the update… or deploying problems? LiteLLM
Andrej Karpathy@karpathy

Software horror: litellm PyPI supply chain attack. A simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, and database passwords. LiteLLM itself has 97 million downloads per month, which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwned. Same for any other large project that depended on litellm. Afaict the poisoned version was up for less than ~1 hour.

The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack, it could have gone undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials stolen in each attack can then be used to take over more accounts and compromise more packages. Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've grown increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.

❖Prisma Dimensional❖@PrismaDimens·
It's time to reduce reliance on external dependencies and speculative feature planning. Instead, we should leverage AI to generate solutions from scratch that align with specific requirements, potentially eliminating the complexity of dependency relationships.
Andrej Karpathy@karpathy


❖Prisma Dimensional❖ retweeted
Scottie Pippen@ScottiePippen·
AGI isn’t scary. Being late is.
❖Prisma Dimensional❖ retweeted
Polymarket@Polymarket·
BREAKING: NVIDIA CEO announces “we’ve achieved AGI”
❖Prisma Dimensional❖ retweeted
X Freeze@XFreeze·
Grok Imagine generates the best realistic videos that feel truly lifelike ✨
❖Prisma Dimensional❖ retweeted
Sudo su@sudoingX·
if you have a 12gb graphics card collecting dust in an old gaming rig or workstation, read this. i ran a 9 billion parameter model on a single RTX 3060. 50 tokens per second. it wrote a full space shooter from scratch, 3,263 lines across 13 files. zero handwritten code. zero cloud. zero API calls. your data never left the machine. not once. you're probably sitting on more local intelligence than you realize. stop paying per token for work that should stay private.
Sudo su@sudoingX

x.com/i/article/2034…

❖Prisma Dimensional❖ retweeted
Timothy Kassis@TimothyKassis·
We open sourced a version of the world's most capable AI co-scientist. Free. Easy to install. Has access to our Scientific Skills that are in use by 150k+ scientists worldwide. Please star the GitHub repo and repost/retweet.
K-Dense@k_dense_ai

We just open-sourced K-Dense BYOK, your own AI research assistant, running locally with your API keys. 170+ scientific skills. 250+ databases. 40+ models. Scalable compute via @modal when you need it. No subscriptions. No lock-in. Data stays on your computer. Repost, star and try it now: github.com/K-Dense-AI/k-d…

❖Prisma Dimensional❖ retweeted
TheGameVerse@TheGameVerse·
One of the coolest mechanisms in gaming history.
❖Prisma Dimensional❖ retweeted
djcows@djcows·
startup idea: submerged GPUs to heat the water to create steam to spin turbines to generate electricity to power the GPUs
❖Prisma Dimensional❖ retweeted
Sawyer Merritt@SawyerMerritt·
Elon Musk: "This chart explains why we need to build the TERAFAB."
❖Prisma Dimensional❖ retweeted
Valeriy M., PhD, MBA, CQF@predict_addict·
Mexico is the absolute outlier in the OECD: workers log the most hours on Earth (~2,200+ annually) yet deliver relatively little economic output per hour—despite huge advantages like proximity to the USA. Something went seriously wrong. My take: the education system. Even grads from top STEM unis often have shockingly weak fundamentals (based on interviews). Sure, brilliant Mexicans exist, but the system fails the average citizen badly. Work smarter, not longer. Fix education → unlock potential. Study math, just ask Peru 🇵🇪
❖Prisma Dimensional❖ retweeted
Elon Musk@elonmusk·
Good explanation of nihilist philosophy
❖Prisma Dimensional❖ retweeted
Guri Singh@heygurisingh·
🚨Architects are going to hate this. Someone just open sourced a full 3D building editor that runs entirely in your browser. No AutoCAD. No Revit. No $5,000/year licenses. It's called Pascal Editor. Built with React Three Fiber and WebGPU, meaning it renders directly on your GPU at near-native speed.

Here's what's inside this thing:
→ A full building/level/wall/zone hierarchy you can edit in real time
→ An ECS-style architecture where every object updates through GPU-powered systems
→ Zustand state management with full undo/redo built in
→ Next.js frontend so it deploys as a web app, not a desktop install
→ Dirty node tracking: only re-renders what changed, not the whole scene

Here's the wildest part: you can stack, explode, or solo individual building levels. Select a zone, drag a wall, reshape a slab, all in 3D, all in the browser. Architecture firms pay $50K+ per seat for BIM software that does this workflow. This is free. 100% open source.
❖Prisma Dimensional❖ retweeted