dario vaquilema

959 posts

@eldaryd12

Joined June 2013
640 Following · 20 Followers
dario vaquilema retweeted
TheBetterPath
TheBetterPath@TheBetterPath_·
You're feeling bored because you're not doing side quests. Life isn't just work and lying in bed doing nothing. Here are 50 side quests every man should accomplish:
English
98
710
12.2K
4.3M
Microsoft Learn
Microsoft Learn@MicrosoftLearn·
Do NOT doom scroll tonight. Instead, learn a new skill. Comment for a course recommendation.
English
5.4K
2K
26.5K
1.4M
Grok
Grok@grok·
@eldaryd12 @HijaDeMartha The background music is an upbeat instrumental stock track (motivational electronic style, no lyrics). I don't have direct access to the audio to Shazam it exactly, but it's typical of viral cleaning videos. If you identify it, let me know!
Spanish
1
0
0
12
dario vaquilema
dario vaquilema@eldaryd12·
So it goes in affairs of State: the ills that arise within it, when they are discovered in time (which is given only to the shrewd man), are quickly cured; but they no longer have a remedy when, for not having been noticed, they are left to grow to the point that everyone sees them.
Spanish
0
0
0
2
dario vaquilema retweeted
Roan
Roan@RohOnChain·
As someone who builds institutional level quant systems, this research book is the closest thing to a quant desk I have ever seen publicly shared. 361 pages. 151 trading strategies. Bookmark & get this, then read the article below before someone takes it down.
Roan@RohOnChain

x.com/i/article/2037…

English
59
375
3.2K
561.5K
Google Quantum AI
Google Quantum AI@GoogleQuantumAI·
Last month at the 2026 APS Global Physics Summit, two Google Quantum AI researchers were recognized as DQI Best Thesis finalists. Congratulations Nathan Lacroix (Winner) and Aniket Maiti (Finalist) on their work!
English
8
17
132
7K
dario vaquilema retweeted
AdrIAna
AdrIAna@adrianaia_·
Someone built exactly what Andrej Karpathy said someone should build. 48 hours after Karpathy published his workflow for LLM-based knowledge bases, this appeared on GitHub.
Spanish
7
91
907
70.3K
dario vaquilema retweeted
Jaynit
Jaynit@jaynitx·
In 2010, Stanford neurologist Dr Frank Longo gave a 2-hour lecture on how memory really works. Everything you’ve learned about memory is mostly wrong.

His frameworks:
• The Magic 7 rule
• How sleep moves memories to storage
• Use it or lose it

15 lessons on memory:
English
12
603
1.9K
175.7K
dario vaquilema
dario vaquilema@eldaryd12·
@LuisaGonzalezEc It's a TikTok... So much drama over that?... Let's stop being scandalized by meaningless things and focus on what really matters...!!
Spanish
1
0
1
65
Luisa González
Luisa González@LuisaGonzalezEc·
This explains everything. When the mothers of the children of Taisha plead because their children are dying, when kidney patients plead for payment to the dialysis providers, when doctors ask for cleaning and food services to be contracted in hospitals 🏥 these people DO NOT LISTEN to the people, DO NOT LOOK at the people, they COVER their ears and eyes, they SWING ON THEIR PRIVILEGES while the people suffer in precarity and the absence of security, employment, health and more…
Central News EC@CentralNewsEC

🚨 While hospitals report shortages and a lack of medicines, the vice president and acting minister of @Salud_Ec, @mjpintoec, appears on TikTok with a song by Pamela Cortés. 😡 The health crisis continues, but priorities seem to lie elsewhere. 🏥📉

Spanish
212
932
1.8K
30.5K
dario vaquilema retweeted
Siddharth
Siddharth@Pseudo_Sid26·
Reading papers was never this easy. I am so much in love with @askalphaxiv, it's just the perfect research tool. Not even NotebookLM comes close tbh
English
10
26
329
18.3K
dario vaquilema retweeted
Y Combinator
Y Combinator@ycombinator·
François Chollet (@fchollet) has spent years asking a different question than most of the AI world. Instead of scaling what already works, he’s trying to understand what intelligence actually is and how to build it from first principles.

In this episode of the @LightconePod, he traces that path from his early work on deep learning to the creation of the @arcprize, and the launch of ARC V3, a new benchmark designed to measure something deeper than performance: the ability to learn, adapt, and reason efficiently in entirely new environments.

He explains why today’s systems may be hitting limits, what recent breakthroughs really mean, and why reaching true general intelligence may require a fundamentally different approach.

00:00 - AGI by 2030?
00:31 - Introducing Ndea: A New Path Beyond Deep Learning
01:08 - A New ML Paradigm
01:30 - Replacing neural nets with compact symbolic programs
03:04 - Why Ndea Isn’t Competing With Coding Agents
05:20 - Why Everyone Might Be Wrong About Scaling LLMs
07:22 - Why Coding Agents Suddenly Work So Well
08:50 - The Limits of LLMs in Non-Verifiable Domains
10:48 - What AGI Actually Means (And Why Most Definitions Are Wrong)
13:30 - Why Deep Learning Hits a Wall
14:00 - ARC’s Origin Story
18:20 - ARC Benchmarks Explained: From V1 to V3
22:49 - The RL Loop Powering Coding Agents Today
27:03 - ARC-AGI V3: Measuring “Agentic Intelligence”
31:14 - Inside the ARC Game Studio
35:31 - Could AGI Fit in 10,000 Lines of Code?
44:01 - Building Ndea: From Idea to Compounding Research Stack
46:46 - The Future of ARC: Benchmarks That Evolve With AI
47:21 - Why There’s Still Huge Opportunity for New AI Paradigms
53:37 - How to Build a Breakout Open Source Project - Lessons From Keras
56:39 - Advice For How To Think About AI
English
41
95
621
150.6K
dario vaquilema retweeted
a16z crypto
a16z crypto@a16zcrypto·
E/ACC vs. D/ACC: THE DEBATE

@VitalikButerin thinks slowing down AGI by four years is worth it. @beffjezos thinks that's exponential opportunity cost. They debated it live, moderated by @eddylazzarin and @shawmakesmagic.

00:00 Opening
07:02 Thermodynamics and first principles
16:04 Acceleration, entropy, and civilization
28:29 The core disagreement
32:42 Comparing and contrasting e/acc and d/acc
36:20 Open source, open hardware, and local intelligence
54:18 Should AI be slowed down?
1:02:35 Autonomous agents and artificial life
1:21:07 Crypto as the trust layer between humans and AI
1:35:37 Closing arguments
English
52
81
498
143.8K
dario vaquilema retweeted
Jens Eisert
Jens Eisert@jenseisert·
Optimal randomized measurements for a family of nonlinear quantum properties: We answer how non-linear quantities can be estimated with the same standards as expectation values in classical shadows. journals.aps.org/prxquantum/abs…

Quantum learning encounters fundamental challenges when estimating nonlinear properties, owing to the inherent linearity of quantum mechanics. Although recent advances in single-copy randomized measurement protocols have achieved optimal sample complexity for specific tasks like state purity estimation, generalizing these protocols to estimate broader classes of nonlinear properties without sacrificing optimality remains an open problem.

In this work, we introduce the observable-driven randomized measurement (ORM) protocol enabling the estimation of Tr(Oρ²) for an arbitrary observable O, an essential quantity in quantum computing and many-body physics. We establish an upper bound for ORM’s sample complexity and show its optimality for observables with a large trace-norm, including Pauli and local observables, closing a gap in the literature. For these observables, ORM admits an efficient implementation with Clifford circuits. Numerical experiments validate that ORM requires substantially fewer state samples to achieve the same precision compared to classical shadows. Additionally, we introduce a braiding randomized measurement protocol for multiple low-rank nonlinear observables, reducing circuit complexities in practical applications.

Warm thanks to Zhenyu Du, Yifan Tang, Andreas Elben, Ingo Roth, and Zhenhuan Liu for the wonderful collaboration.
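To make the "nonlinear property" in the abstract above concrete, here is a minimal sketch that evaluates Tr(Oρ²) directly from a known density matrix and shows that it does not behave linearly under mixing of states. This is not the ORM protocol, which estimates the quantity from measurement samples. The helper names, the choice of observable (the projector onto |0⟩), and the single-qubit states with real entries are all assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def tr_o_rho_squared(observable, rho):
    """Direct evaluation of Tr(O rho^2) for a known density matrix."""
    return trace(matmul(observable, matmul(rho, rho)))


def mix(rho_a, rho_b, weight):
    """Convex mixture weight * rho_a + (1 - weight) * rho_b."""
    n = len(rho_a)
    return [[weight * rho_a[i][j] + (1 - weight) * rho_b[i][j]
             for j in range(n)] for i in range(n)]


# Hypothetical single-qubit example (illustrative values only).
projector_zero = [[1.0, 0.0], [0.0, 0.0]]   # observable O = |0><0|
rho_pure = [[1.0, 0.0], [0.0, 0.0]]         # pure state |0><0|
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]        # maximally mixed state
rho_blend = mix(rho_pure, rho_mixed, 0.5)

# A linear quantity would give the same number for both lines below.
value_of_blend = tr_o_rho_squared(projector_zero, rho_blend)
blend_of_values = (0.5 * tr_o_rho_squared(projector_zero, rho_pure)
                   + 0.5 * tr_o_rho_squared(projector_zero, rho_mixed))

print(value_of_blend, blend_of_values)
```

The two printed numbers differ, which is the sense in which the quantity is nonlinear in ρ: it cannot be read off as a single ordinary expectation value Tr(Aρ) for a fixed observable A.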
English
1
5
33
1.8K
dario vaquilema retweeted
Kimi.ai
Kimi.ai@Kimi_Moonshot·
Zhilin at GTC: Introducing Attention Residuals

Learning selective memory, rather than mechanically accumulating everything, is the beauty of attention.

Many of you have probably read Attention Is All You Need, the 2017 Transformer paper that brought “human-like” attention into the model’s field of view. From that point on, models no longer simply read everything in a mechanical way. Instead, they began to develop a sense of what matters more and what matters less across the text, choosing to retain the more important information.

Recently, Kimi applied this idea of attention to the temporal dimension, then rotated it 90 degrees into the model’s depth dimension. This allows the model to have attention not only over time, but also throughout the process of information transmission across layers, giving it a more intelligent way to understand and process information.
English
49
155
1.5K
114.6K
CATALINA CRUZ
CATALINA CRUZ@catalina1·
Dare to guess how old I am.
Spanish
654
172
4.7K
84K
dario vaquilema retweeted
Andrej Karpathy
Andrej Karpathy@karpathy·
The signature is alluding to NVIDIA GTC 2015, where Jensen excitedly told an audience of, at the time, mostly gamers and scientific computing professionals that Deep Learning is The Next Big Thing, citing among other examples my PhD thesis (one of the first image captioning systems that coupled image recognition ConvNet to an autoregressive RNN language model, trained end to end). This was back when most people were still unaware and somewhat skeptical but of course - Jensen was 1000% correct, highly prescient and locked in very early.
English
29
52
1.3K
82K
dario vaquilema retweeted
Dr Maria Violaris
Dr Maria Violaris@maria__violaris·
Great to hear that quantum information pioneers Charles Bennett and Gilles Brassard were awarded the Turing Award (essentially the Nobel of computer science)! What are your favourite papers by Bennett & Brassard? Let me know in the thread ⬇️ ✍
English
3
8
53
2.9K
dario vaquilema retweeted
Avi Chawla
Avi Chawla@_avichawla·
Big release from Kimi! They just released a new way to handle residual connections in Transformers.

In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. If you consider this across 40+ layers, the hidden state at any layer is just the equal-weighted sum of all previous layer outputs. Every layer contributes with weight=1, so every layer gets equal importance.

This creates a problem called PreNorm dilution, where as the hidden state accumulates layer after layer, its magnitude grows linearly with depth. And any new layer's contribution gets progressively buried in the already-massive residual. This means deeper layers are then forced to produce increasingly large outputs just to have any influence, which destabilizes training.

Here's what the Kimi team observed and did: RNNs compress all prior token information into a single state across time, leading to problems with handling long-range dependencies. And residual connections compress all prior layer information into a single state across depth. Transformers solved the first problem by replacing recurrence with attention. This was applied along the sequence dimension.

Now they introduced Attention Residuals, which applies a similar idea to depth. Instead of adding all previous layer outputs with a fixed weight of 1, each layer now uses softmax attention to selectively decide how much weight each previous layer's output should receive. So each layer gets a single learned query vector, and it attends over all previous layer outputs to compute a weighted combination. The weights are input-dependent, so different tokens can retrieve different layer representations based on what's actually useful. This is Full Attention Residuals (shown in the second diagram below).

But here's the practical problem with this idea. Full AttnRes requires keeping all layer outputs in memory and communicating them across pipeline stages during distributed training.

To solve this, they introduce Block Attention Residuals (shown in the third diagram below). The idea is to group consecutive layers into roughly 8 blocks. Within each block, layer outputs are summed via standard residuals. But across blocks, the attention mechanism selectively combines block-level representations. This drops memory from O(Ld) to O(Nd), where N is the number of blocks. Layers within the current block can also attend to the partial sum of what's been computed so far inside that block, so local information flow isn't lost. And the raw token embedding is always available as a separate source, which means any layer in the network can selectively reach back to the original input.

Results from the paper:
- Block AttnRes matches the loss of a baseline LLM trained with 1.25x more compute.
- Inference latency overhead is less than 2%, making it a practical drop-in replacement
- On a 48B parameter Kimi Linear model (3B activated) trained on 1.4T tokens, it improved every benchmark they tested: GPQA-Diamond +7.5, Math +3.6, HumanEval +3.1, MMLU +1.1

The residual connection has mostly been unchanged since ResNet in 2015. This might be the first modification that's both theoretically motivated and practically deployable at scale with negligible overhead.

More details in the post below by Kimi👇
____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
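The contrast described above, a fixed weight-1 sum versus a softmax-weighted combination of earlier layer outputs, can be sketched for a single token as follows. This is only an illustration under stated assumptions, not the Kimi implementation: the scaled dot-product scoring, the use of raw layer outputs as both keys and values, and every name and number below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(vec):
    return math.sqrt(dot(vec, vec))


def standard_residual(sources):
    """Equal-weight sum: every earlier output contributes with weight 1."""
    dim = len(sources[0])
    return [sum(src[i] for src in sources) for i in range(dim)]


def attention_residual(query, sources):
    """Softmax-weighted mix of earlier outputs (assumed scoring rule)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, src) / scale for src in sources])
    dim = len(sources[0])
    mixed = [sum(w * src[i] for w, src in zip(weights, sources))
             for i in range(dim)]
    return mixed, weights


# Toy values: the token embedding plus two earlier layer outputs.
sources = [
    [1.0, 0.0, 0.0, 0.0],  # raw token embedding
    [0.0, 2.0, 0.0, 0.0],  # output of layer 1
    [0.0, 0.0, 3.0, 0.0],  # output of layer 2
]
query = [0.0, 0.0, 4.0, 0.0]  # stands in for one layer's learned query

summed = standard_residual(sources)
mixed, weights = attention_residual(query, sources)

print("fixed-weight sum:", summed, "norm:", norm(summed))
print("attention weights:", weights)
print("attention mix norm:", norm(mixed))
```

Because the softmax weights are non-negative and sum to 1, the mixed vector's norm cannot exceed the largest source norm, whereas the fixed-weight sum keeps growing as more layers are added. That is one way to read the claim that attention over depth mitigates hidden-state growth.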
Avi Chawla tweet media
Kimi.ai@Kimi_Moonshot

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗Full report: github.com/MoonshotAI/Att…
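A rough sketch of the block idea mentioned above: layer outputs inside a block are summed, so a layer only needs the token embedding plus one vector per block instead of one vector per layer. The function name, the block size, and the toy values are assumptions for illustration; the report's actual partitioning and attention details may differ.

```python
def block_sources(embedding, layer_outputs, block_size):
    """Collapse per-layer outputs into per-block sums.

    Returns the vectors a layer would attend over in this sketch:
    the raw embedding, then one summed vector per block. If the last
    block is incomplete, its entry is the partial sum so far.
    """
    dim = len(embedding)
    sources = [list(embedding)]
    for start in range(0, len(layer_outputs), block_size):
        block = layer_outputs[start:start + block_size]
        sources.append([sum(out[i] for out in block) for i in range(dim)])
    return sources


# Toy setup: 8 layers of 2-dimensional outputs, grouped into blocks of 4.
embedding = [0.5, 0.5]
layer_outputs = [[1.0, float(k)] for k in range(1, 9)]

sources = block_sources(embedding, layer_outputs, block_size=4)

full_count = 1 + len(layer_outputs)  # embedding + every layer output
block_count = len(sources)           # embedding + one vector per block

print("vectors kept, full variant:", full_count)
print("vectors kept, block variant:", block_count)
print("block-level sources:", sources)
```

The number of stored vectors scales with the number of blocks rather than the number of layers, which is the O(Ld) to O(Nd) reduction described in the retweeted summary.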

English
81
214
2.3K
351.7K