
🇨🇳🇺🇸 This is what discipline looks like… A Chinese soldier doesn't budge an inch as Air Force One taxis just feet away from his post. A bit closer and we’d have had an international incident before Trump even stepped off the plane 😂
🐉
1.2K posts

@brian0x1f409
Non-conformist | Existential | Stoic | ML/AI | Programming Languages Nomad | Travelling the seekers' path | Pushing the frontiers of AGI, one RL at a time.


I am a Professor of Architecture. I have been a judge for Commonwealth Association of Architects Awards, International Union of Architects Awards, Asia Architecture Awards, AAK-Crown Architecture Awards, etc. Feel free to ignore my opinion. The New State House is plain MEDIOCRE!

ALLEGEDLY
Step 0: Say something untrue on TikTok
Step 1: Take someone to court for correcting you
Step 2: Post controversial takes and rage-bait about AI
Step 3: Be a tech-events merchant
Step 4: Collaborate with famous devs
Step 5: Sell a course
ALLEGEDLY
Key notes: no GitHub/GitLab/Bitbucket to show for it, no record of previous work, no freelance evidence, no evidence of current employment. But we are the ones who are delusional.

Today we release Token Superposition Training (TST), a modification to the standard LLM pretraining loop that produces a 2-3× wall-clock speedup at matched FLOPs without changing the model architecture, optimizer, tokenizer, or training data. During the first third of training, the model reads and predicts contiguous bags of tokens, averaging their embeddings on the input side and predicting the next bag with a modified cross-entropy on the output side. For the remainder of the run, it trains normally on next-token prediction. The inference-time model is identical to one produced by conventional pretraining. Validated at 270M, 600M, and 3B dense scales, and at 10B-A1B MoE. The work on TST was led by @bloc97_, @gigant_theo, and @theemozilla.
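The post only sketches the mechanism, so here is a minimal NumPy illustration of the two ideas it names: averaging the embeddings of a contiguous bag of tokens on the input side, and scoring the next bag with a modified cross-entropy. The bag size, the function names, and the exact form of the loss are my assumptions, not the authors' implementation.

```python
import numpy as np

def bag_inputs(token_ids, emb, bag_size):
    """Input side: replace each contiguous bag of tokens with the
    mean of their embeddings (hypothetical reading of the post)."""
    n = len(token_ids) // bag_size * bag_size  # drop the ragged tail
    ids = np.asarray(token_ids[:n]).reshape(-1, bag_size)
    return emb[ids].mean(axis=1)  # one averaged vector per bag

def bag_cross_entropy(logits, target_bag):
    """Output side: one plausible 'modified cross-entropy' that treats
    the next bag as an unordered set, averaging the per-token negative
    log-likelihoods under a single softmax (an assumption)."""
    m = logits.max()
    logp = logits - m - np.log(np.exp(logits - m).sum())  # stable log-softmax
    return -logp[np.asarray(target_bag)].mean()
```

For example, with a one-hot embedding table `np.eye(4)` and `bag_size=2`, `bag_inputs([0, 1, 2, 3], np.eye(4), 2)` yields two vectors, the first being `[0.5, 0.5, 0, 0]`; with uniform logits, `bag_cross_entropy` reduces to `log(vocab_size)` regardless of the target bag, as expected for an uninformative model.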

Unfortunately, the broke men are loving this, so let me clarify: I am still a firm believer in getting men's money; just use it to build yourself.
