KDXB2000

13.3K posts

@kuiperDXB

If the action integral of a system is invariant under a continuous group of transformations, then there exists a corresponding conserved quantity. -Noether, 1918

Joined March 2022
7.5K Following · 453 Followers
KDXB2000 reposted
Probability and Statistics
Paper Summary: On the Difficulty of Learning Chaotic Dynamics with RNNs (Mikhaeil, @Zahra__Monfared, Durstewitz)

This paper studies why recurrent neural networks (RNNs) struggle to learn chaotic dynamical systems, despite their theoretical expressiveness. The key issue is that chaotic systems exhibit extreme sensitivity to initial conditions: small prediction errors grow exponentially over time, making long-term forecasting unstable.

The authors show that standard training (e.g., teacher forcing) leads to a mismatch between training and inference, causing error accumulation and divergence from true trajectories. Even small modeling inaccuracies result in qualitatively incorrect dynamics. They analyze this through dynamical systems theory and demonstrate that learning stable attractors and invariant structures is far more important than minimizing short-term prediction loss.

Key insight: accurate learning of chaotic systems requires preserving underlying dynamics, not just fitting data. This has implications for ML, physics modeling, climate prediction, and RL environments with complex temporal behavior.

proceedings.neurips.cc/paper_files/pa…
[image]
1 reply · 6 reposts · 46 likes · 2.5K views
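
A minimal sketch of the sensitivity the summary describes, not taken from the paper: the logistic map at r = 4 is a standard chaotic system, and the 1e-10 perturbation and step counts are illustrative choices.

```python
# Sensitive dependence on initial conditions, illustrated with the
# fully chaotic logistic map x_{t+1} = 4 x_t (1 - x_t).
# Not from the paper; a sketch of why long-horizon prediction errors
# grow exponentially for chaotic dynamics.

def logistic_map(x0, steps, r=4.0):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_map(0.4, 50)
b = logistic_map(0.4 + 1e-10, 50)  # tiny perturbation of the initial state

for t in (0, 10, 20, 30, 40, 50):
    print(f"t={t:2d}  |error| = {abs(a[t] - b[t]):.3e}")

# The gap grows roughly like exp(lambda * t) (lambda = ln 2 for this map)
# until it saturates at the attractor's diameter, so even a near-perfect
# model with a 1e-10 state error loses the true trajectory within a few
# dozen steps.
```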
Ishaan Tharoor @ishaantharoor
Having just seen long lines for cooking gas cylinders on the streets of cities in India — one of many bystanders to the conflict whose population has been seriously impacted—I’m struck yet again by how relatively insulated the U.S. is from the wars it unleashes
Jesse Watters @JesseBWatters

Spring Break goes WILD☀️ 🍺🤪 and the students have NO IDEA what’s going on🤣 “The BIGGEST issue in America is what BIKINI I’m wearing tomorrow”👙 “We’re going to war with IRAQ that’s been crazy”🤔 “I’ve NEVER heard the word Ayatollah in my life”🫢 “Is Venezuela in SPAIN?”😬😬😬

148 replies · 1.5K reposts · 8.1K likes · 446.6K views
KDXB2000 reposted
ViralRush ⚡ @tweetciiiim
The stunning beauty of an Altay sunset
6 replies · 38 reposts · 66 likes · 2.2K views
Shaw - rus/cos @billy_boi6
@ishaantharoor Funnily, this video is not of average Americans, but rather of daddy's-money spring breakers.
0 replies · 1 repost · 47 likes · 5.5K views
KDXB2000 @kuiperDXB
@Namasteankit @ishaantharoor Not really… they will reduce their interest rates and inflate the debt away. Managed properly, it can be done. Albeit not by a baboon like Trump, though.
0 replies · 0 reposts · 1 like · 6 views
Ankit Khandelwal @Namasteankit
@ishaantharoor Mounting debt will crush their future anyway. The US government is still borrowing, spending, and cutting the essential-services budget.
0 replies · 0 reposts · 0 likes · 417 views
KDXB2000 reposted
Math Cafe @Riazi_Cafe_en
Gelfand's proof of the Cauchy-Schwarz inequality for real inner product spaces. Source: I.M. Gelfand, "Lectures on Linear Algebra" (archive.org/details/lectur…).
[image]
3 replies · 67 reposts · 488 likes · 48.6K views
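
For reference, the quadratic-discriminant argument this proof is usually identified with, sketched for a real inner product space (that this matches Gelfand's exact presentation in the book is an assumption):

```latex
% For x, y in a real inner product space, p(t) = <x + ty, x + ty> is a
% quadratic in the real variable t that is nonnegative for every t
% (if <y,y> = 0 it degenerates to a linear function, forcing <x,y> = 0):
p(t) = \langle x,x\rangle + 2t\,\langle x,y\rangle + t^{2}\,\langle y,y\rangle \;\ge\; 0
% A nonnegative real quadratic has discriminant at most zero:
4\,\langle x,y\rangle^{2} - 4\,\langle x,x\rangle\,\langle y,y\rangle \;\le\; 0
\quad\Longrightarrow\quad
|\langle x,y\rangle| \;\le\; \sqrt{\langle x,x\rangle}\,\sqrt{\langle y,y\rangle}
```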
KDXB2000 reposted
Rohan Paul @rohanpaul_ai
This paper shows a collection of case studies demonstrating how researchers have successfully collaborated with advanced AI models, specifically Google's Gemini-based models (in particular Gemini Deep Think and its advanced variants), to solve open problems.

The core finding is that the model works best as a research partner, where people break problems into pieces, challenge the model's mistakes, ask for counterexamples or proof ideas, and sometimes connect it to code that checks whether the math really works.

Instead of one benchmark, the authors show many case studies across theoretical computer science, economics, optimization, and physics, including finding counterexamples, spotting a fatal proof bug in a cryptography paper, and helping prove new results.

The main result is not one single theorem but a pattern: with strong human guidance and repeated checking, the model sometimes helped push research forward on open problems that normally need expert-level mathematical work.

Paper Link – arxiv.org/abs/2602.03837
Paper Title: "Accelerating Scientific Research with Gemini: Case Studies and Common Techniques"
[image]
7 replies · 38 reposts · 111 likes · 6.4K views
KDXB2000 @kuiperDXB
@AngelicaOung @teortaxesTex I doubt a single nuke will lead to the regime falling or surrendering. The Japanese only surrendered due to the Soviet threat, not the nukes. Albeit the nukes did make them sit up and take note.
0 replies · 0 reposts · 2 likes · 11 views
Angelica 🌐⚛️🇹🇼🇨🇳🇺🇸
A big swing and a miss from both Teortaxes and policytensor.

First, Teor is correct in that there's no magic kit China can airdrop into Iran to make them win the war. Even if there were, they wouldn't. But there is also no way for Israel to make Iran fall, even if they are totally evil and willing to kill as many Iranians as they can. The last I checked, they haven't even managed to get all of Hamas out of Gaza. Sure, sure, they'll go on a killing spree. Maybe even drop a nuke. Well, do they have enough nukes to kill ALL the Iranians? Because if not, they haven't solved their problem, just created new ones.

Iran will not fall. Israel and America will not stop. I honestly don't know how it ends, except that things are going to get a lot worse. Then America will stop, and when America stops, Israel will too.

"Oh Angelica, but all the GCC countries are going to turn on Iran too…" If things get bad bad, GCC people are going to turn on their governments before Iranians do.

Ultimately this is existential for the Iranians but not for Americans. Maybe it's kinda existential for Israel, but believe it or not, their control over the US only goes so far.
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

This is how you know the guy is going off the third worldist deep end. I find much of @policytensor's analysis neat but let's be serious. This is capeshit. What weapons can China "airdrop" to make a difference? To whom, even? And why would they? Iran will fall, sorry.

24 replies · 9 reposts · 139 likes · 14K views
KDXB2000 reposted
Nav Toor @heynavtoor
🚨 Meta, Google DeepMind, and OpenAI all ask the same thing in ML interviews: "Implement softmax from scratch." Most candidates fail.

Someone just open-sourced the training ground for it. It's called TorchCode. LeetCode, but for PyTorch. 39 problems that test the exact skills top AI labs hire for. No tutorials. No hand-holding. Implement it or fail. Instant auto-grading.

Here's what's inside this thing:
→ Implement ReLU, softmax, LayerNorm, dropout from scratch
→ Build multi-head attention, full Transformer blocks, GPT-2
→ Automated judge checks correctness, gradients, and timing
→ Colored pass/fail per test case, like competitive programming
→ Hints when you're stuck. Full reference solutions after you try.
→ Progress tracking: what you solved, best times, attempt counts.
→ Runs in your browser. No GPU needed. No signup. No cloud.

Here's the wildest part: every problem is a real interview question from top AI companies. You're not learning theory. You're practicing the exact exercises that get people $400K+ offers at Meta AI, DeepMind, and OpenAI.

Try it right now on Hugging Face. Zero install. Opens in your browser.

ML bootcamps charge $10,000 to $30,000 to teach this. Interview prep courses charge $2,000+. This is free. 776 GitHub stars. MIT License. 100% open source.
[image]
25 replies · 101 reposts · 721 likes · 75.1K views
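
A minimal sketch of the classic question itself, in plain PyTorch. This is the standard max-subtraction answer interviewers tend to look for, not TorchCode's reference solution (which isn't reproduced here):

```python
import torch

def softmax(x: torch.Tensor, dim: int = -1) -> torch.Tensor:
    """Numerically stable softmax along `dim`, without torch.softmax.

    Subtracting the per-row max leaves the result unchanged
    (softmax is shift-invariant) but keeps exp() from overflowing.
    """
    shifted = x - x.max(dim=dim, keepdim=True).values
    exps = shifted.exp()
    return exps / exps.sum(dim=dim, keepdim=True)

logits = torch.tensor([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]])
print(softmax(logits))  # rows sum to 1; no NaN/inf even for the 1000s row
print(torch.allclose(softmax(logits), torch.softmax(logits, dim=-1)))  # True
```

Without the max subtraction, exp(1000) overflows to inf and the second row becomes NaN, which is exactly the failure mode the question is probing for.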
KDXB2000 reposted
Probability and Statistics
Here are 100 hard ML interview questions you must not miss:

Theory & Fundamentals:
1. Explain bias–variance tradeoff mathematically.
2. When does ERM fail?
3. Derive ridge regression solution. (A worked sketch follows after this list.)
4. Why does L1 induce sparsity?
5. VC dimension intuition?
6. PAC learning basics?
7. When is MLE inconsistent?
8. Difference: MAP vs MLE?
9. Curse of dimensionality formal meaning?
10. When is KL divergence symmetric?

Optimization:
11. Why does SGD generalize better?
12. Saddle point vs local minima?
13. Adam vs SGD tradeoffs?
14. Convergence of GD conditions?
15. Why gradient vanishing occurs?
16. Exploding gradients fixes?
17. Hessian interpretation?
18. Natural gradient intuition?
19. Why sharp minima generalize worse?
20. Line search vs fixed LR?

Probabilistic ML:
21. Derive EM algorithm.
22. When does EM fail?
23. Gibbs vs Metropolis?
24. Variational inference idea?
25. ELBO derivation?
26. When is VI biased?
27. Exchangeability meaning?
28. Bayesian nonparametrics idea?
29. Prior sensitivity?
30. Posterior collapse in VAEs?

Deep Learning:
31. Why transformers beat RNNs?
32. Attention complexity issue?
33. Positional encoding role?
34. Layer norm vs batch norm?
35. Residual connections math?
36. Why overparameterization works?
37. Lottery ticket hypothesis?
38. Double descent curve?
39. Scaling laws intuition?
40. Why ReLU dominates?

CNNs & Vision:
41. Why convolution works?
42. Translation invariance proof?
43. Pooling necessity?
44. Dilated conv use?
45. Feature hierarchy explanation?

NLP:
46. Word2Vec objective?
47. Negative sampling role?
48. BERT vs GPT difference?
49. Tokenization impact?
50. Hallucination causes?

RL:
51. Bellman equation derivation?
52. Policy vs value iteration?
53. Why Q-learning unstable?
54. Exploration vs exploitation math?
55. Actor-critic intuition?
56. Credit assignment problem?
57. Off-policy vs on-policy?
58. Function approximation issues?
59. Reward shaping risks?
60. Why RL is sample inefficient?

Generalization & Theory:
61. Why deep nets generalize?
62. Role of implicit regularization?
63. Compression view of generalization?
64. Stability and generalization link?
65. Information bottleneck theory?

Graphs & Advanced Topics:
66. GNN oversmoothing issue?
67. Message passing limits?
68. Spectral vs spatial GNNs?
69. Why graph isomorphism hard?
70. Expressivity of GNNs?

Causality:
71. Correlation vs causation formal?
72. Backdoor criterion?
73. Instrumental variables?
74. Counterfactual reasoning?
75. Simpson's paradox explanation?

Practical ML:
76. Handling class imbalance?
77. Data leakage examples?
78. Feature engineering impact?
79. When to use cross-validation?
80. Hyperparameter tuning strategies?

Systems & Scaling:
81. Distributed training challenges?
82. Data parallel vs model parallel?
83. Memory bottlenecks?
84. Mixed precision benefits?
85. Latency vs throughput tradeoff?

Robustness & Ethics:
86. Adversarial examples why?
87. Robust training methods?
88. Fairness in ML?
89. Bias detection?
90. Privacy in ML (DP)?

Cutting Edge:
91. Diffusion models vs GANs?
92. Why diffusion stable?
93. RLHF pipeline?
94. Alignment problem?
95. Multimodal learning challenges?
96. Foundation models risks?
97. In-context learning theory?
98. Test-time compute scaling?
99. Mechanistic interpretability?
100. Future of ML theory?
4 replies · 25 reposts · 194 likes · 12K views
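
A worked sketch for Q3 only, as an example of the level expected: the ridge objective ||y − Xw||² + λ||w||² has the closed-form minimizer w* = (XᵀX + λI)⁻¹Xᵀy. The synthetic data and λ = 0.1 below are illustrative choices.

```python
import numpy as np

# Q3 worked sketch: minimize ||y - Xw||^2 + lam * ||w||^2.
# Setting the gradient 2 X^T (Xw - y) + 2 lam w = 0 gives
#   w* = (X^T X + lam I)^{-1} X^T y.

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

lam = 0.1
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
print(w_ridge)  # close to w_true, shrunk slightly toward zero by lam
```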
KDXB2000 reposted
Mathelirium @mathelirium
In 1998, the Japanese mathematician, engineer, and machine learning pioneer Shun-ichi Amari published "Natural Gradient Works Efficiently in Learning" and made a point that still feels fresh even today.

If your loss is L(θ), then the usual update θ̇ = −∇L(θ) quietly assumes your parameter space is flat. Amari asked what happens when the model itself has geometry, described by G(θ), the Fisher information. Then the more natural direction becomes θ̇ = −G(θ)⁻¹∇L(θ). This is the same loss, but with a different notion of distance.

As the animation shows, ordinary gradient descent follows the raw slope and gets dragged around by distortion, while the natural gradient moves in a way that respects the geometry of the statistical model itself.
7 replies · 65 reposts · 426 likes · 26.7K views
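
A toy sketch of the two updates in the tweet, not from Amari's paper: fit a Gaussian N(μ, σ²) parameterized as θ = (μ, log σ) by descending the expected negative log-likelihood against a target N(3, 2²). The Fisher matrix diag(1/σ², 2) for this parameterization is standard; the target, start point, step size, and step count are illustrative choices.

```python
import numpy as np

# Model: N(mu, sigma^2) with theta = (mu, s), s = log sigma.
# Expected NLL against a target N(M, V2), up to a constant:
#   L(mu, s) = s + (V2 + (M - mu)^2) / (2 * exp(2s))
# Fisher matrix in these coordinates: G = diag(1/sigma^2, 2).

M, V2 = 3.0, 4.0  # target mean and variance

def grad(mu, s):
    sig2 = np.exp(2 * s)
    return np.array([(mu - M) / sig2,                     # dL/dmu
                     1.0 - (V2 + (M - mu) ** 2) / sig2])  # dL/ds

for natural in (False, True):
    theta = np.array([0.0, 0.0])  # start at N(0, 1)
    for _ in range(200):
        g = grad(*theta)
        if natural:  # precondition by G^{-1} = diag(sigma^2, 1/2)
            g = g * np.array([np.exp(2 * theta[1]), 0.5])
        theta = theta - 0.05 * g
    print("natural" if natural else "plain  ",
          f"mu={theta[0]:.3f}  sigma={np.exp(theta[1]):.3f}")

# Both runs use the same step size, but the plain update's mu-component
# is scaled by 1/sigma^2, so it slows to a crawl once sigma grows toward
# 2; the natural gradient corrects for the model's geometry and lands
# essentially on (mu, sigma) = (3, 2).
```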
Clem Fandango @LondonToastof
@kuiperDXB @Gaurab I should've said "life as we know it," and also that water is a molecule, not an element, but the broader point stands.
1 reply · 0 reposts · 0 likes · 51 views
KDXB2000 reposted
Gaurab Chakrabarti
The internet runs on a coincidence of atomic physics.

Erbium emits light at exactly 1,550 nanometers. Silica glass fiber loses the least signal at exactly 1,550 nanometers. One is a quantum property of a rare earth element, the other is an optical property of melted sand. They have nothing to do with each other. It is pure luck.

Before erbium-doped fiber amplifiers, every undersea signal had to be converted from light to electricity and back every 50 kilometers. Each conversion degraded the signal and capped bandwidth. Erbium removed that cap. An erbium amplifier sitting on the floor of the ocean boosts signals 1,000 times and runs for decades without maintenance.

99% of intercontinental data moves through glass strands no thicker than a human hair, amplified by a rare earth element that just so happens to emit at the right wavelength.

And erbium isn't even the strangest one.
83 replies · 514 reposts · 4.4K likes · 187.6K views
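
For scale, the arithmetic behind the tweet's numbers (the ~0.2 dB/km figure is a standard textbook value for silica fiber's loss minimum near 1550 nm, not from the tweet):

```latex
% Gain of a 1000x power amplifier, in decibels:
G = 10\log_{10}\!\frac{P_{\text{out}}}{P_{\text{in}}} = 10\log_{10}(1000) = 30\ \text{dB}
% Loss over one 50 km span at the ~0.2 dB/km minimum near 1550 nm:
L \approx 0.2\ \tfrac{\text{dB}}{\text{km}} \times 50\ \text{km} = 10\ \text{dB}
```

So a single 30 dB erbium amplifier more than covers the loss of a 50 km repeater span, which is why the all-optical link works.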
Silicon Sorcerer @hackradios
Okay, but if the absorption minimum in telecom glass fiber occurred at a different wavelength, then we would have used an atom other than erbium to dope the amplifiers (such dopants also exist today). "Exactly" 1550 nm is also an overstatement: typical EDFAs have 30-40 nm of gain bandwidth that roughly overlaps with the minimum-loss point in telecom fibers.
1 reply · 0 reposts · 71 likes · 4.1K views
Wealth Chakra @WealthChakraa
@Gaurab Imagine telling someone from 200 years ago that the entire internet depends on a weird element sitting on the ocean floor doing exactly the right thing. They'd think you were making up a religion. Turns out we just don't know the rules yet.
2 replies · 0 reposts · 13 likes · 2.7K views
KDXB2000 @kuiperDXB
@LondonToastof @Gaurab No, ice being less dense than liquid water is not the reason life exists (it's a consequence of the property). It's the hydrogen bonds that water forms (great solvent, medium for carbon/organic chemistry, high heat capacity, etc.).
1 reply · 0 reposts · 3 likes · 56 views
Clem Fandango @LondonToastof
@Gaurab Life only exists because H2O is one of the few elements in the solid state that is less dense than in the liquid state, so it floats. Most other elements in the liquid state are less dense.
5 replies · 0 reposts · 63 likes · 4.4K views