Drake

13.5K posts

@Drake_Dictator

Tech enthusiast, NLP , web3 🙆‍♂️📈🦅🤖

Joined July 2023

691 Following, 196 Followers

Drake retweeted

Sundar Pichai@sundarpichai·2d

Just off stage at #GoogleIO, some highlights from this morning 🧵 Gemini 3.5 Flash is available today for everyone in @antigravity and across our products and APIs. Compared to 3.1 Pro, 3.5 Flash is better across almost all benchmarks with huge progress in coding. It's also comparable to the best models but very fast (4x faster tokens/second than other frontier models). And when looking at the intelligence versus output speed, it's in a league of its own in the top right quadrant.

English

269

452

320.3K

Drake retweeted

Andrej Karpathy@karpathy·2d

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

English

7.8K

11.1K

147.6K

26.5M

Drake retweeted

Anuradha Tiwari@talk2anuradha·13 May

I never imagined that such a large conspiracy would be hatched against me. > First, a fake screenshot of a tweet of mine is created. > Then that fake screenshot is spread on social media. > I filed a cyber complaint against that fake screenshot. > But no action was taken on it, and I received no help. > On the basis of that fake screenshot, @NCSC_GoI takes suo motu cognisance. > After that, Delhi Police sends a notice under the SC/ST Act. > Non-bailable sections are added to the case. > And finally, all my personal information is requested from Twitter. > Now the next step will probably be my arrest. @DelhiPolice & @NCSC_GoI - both come under the BJP government. This is a well-planned conspiracy whose aim is to send me to jail - so that in the future no one dares to raise the voice of the Savarna community. #SCSTActMisuse

Hindi

546

5.9K

13.4K

115.4K

Drake retweeted

Anand Ranganathan@ARanganathan72·14 May

Shocking. @talk2anuradha has received a legal notice from Delhi Police for her three-year-old sarcastic tweets after the Scheduled Castes Commission took suo motu cognisance of them. The draconian SC/ST Act has been invoked. She could be arrested any moment. I stand with Anuradha.

English

630

7.8K

22.6K

302.8K

Drake retweeted

News from Google@NewsFromGoogle·12 May

The Google Threat Intelligence Group has detected the first known instance of a threat actor using an AI-developed zero-day exploit in the wild. While the attackers planned a wide-scale strike, our proactive counter-discovery may have prevented that from happening. This finding is part of our new report on AI-powered threats.

English

308

1.7K

13.9K

5.1M

Drake retweeted

Anuradha Tiwari@talk2anuradha·24 Apr

Look at the admission criteria for the Indian Institute of Science (IISc), one of the best research institutes in India. Rank: 119, Result: Rejected. Rank: 3000, Result: Selected.

English

186

1.8K

3.6K

41.7K

Drake retweeted

Google Gemma@googlegemma·21 Apr

What does it take to run 3, 5, or even 10 concurrent instances of Gemma 4 locally? We've open-sourced a demo letting you run multiple models side-by-side on your hardware. Gemma 4 26B A4B easily runs 10+ concurrent requests on a MacBook Pro M4 Max at 18 tokens/sec per request.

English

100

429

5.1K

910.4K

Drake retweeted

Kimi.ai@Kimi_Moonshot·20 Apr

Meet Kimi K2.6: Advancing Open-Source Coding
🔹 Open-source SOTA on HLE w/ tools (54.0), SWE-Bench Pro (58.6), SWE-bench Multilingual (76.7), BrowseComp (83.2), Toolathlon (50.0), Charxiv w/ python (86.7), Math Vision w/ python (93.2)
What's new:
🔹 Long-horizon coding - 4,000+ tool calls, over 12 hours of continuous execution, with generalization across languages (Rust, Go, Python) and tasks (frontend, devops, perf optimization).
🔹 Motion-rich frontend - Videos in hero sections, WebGL shaders, GSAP + Framer Motion, Three.js 3D.
🔹 Agent Swarms, elevated - 300 parallel sub-agents × 4,000 steps per run (up from K2.5's 100 / 1,500). One prompt, 100+ files.
🔹 Proactive Agents - K2.6 model powers OpenClaw, Hermes Agent, etc for 24/7 autonomous ops.
🔹 Claw Groups (research preview) - bring your own agents, command your friends', bots & humans in the loop.
K2.6 is now live on kimi.com in chat mode and agent mode. For production-grade coding, pair K2.6 with Kimi Code: kimi.com/code
🔗 API: platform.moonshot.ai
🔗 Tech blog: kimi.com/blog/kimi-k2-6
🔗 Weights & code: huggingface.co/moonshotai/Kim…

English

933

2.4K

18.2K

7.5M

Drake retweeted

Claude@claudeai·16 Apr

Introducing Claude Opus 4.7, our most capable Opus model yet. It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back. You can hand off your hardest work with less supervision.

English

4.7K

10.2K

81.1K

13.9M

Drake@Drake_Dictator·7 Apr

@priyankac19 Well said

English

Priyanka Chaturvedi🇮🇳@priyankac19·7 Apr

I am glad I triggered a perpetually outraged movement. Any side which is discriminated against I will speak for it. Whether Muslims, SC, ST, EWS or in this case the Brahmins. And maybe while we are at outrage - time to check whether the reservations that started with the noble intent of representation have benefited all those in reserve categories equally or have been monopolised/taken benefit of by certain communities more than those who deserve it. And yes, I will ask tough questions- Deal with it.

Priyanka Chaturvedi🇮🇳@priyankac19

Coming soon in other states too. Go ahead, bang your plates and clap.

English

137

1.2K

4.1K

102K

Drake retweeted

Demis Hassabis@demishassabis·2 Apr

Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!

English

326

883

988.8K

Drake retweeted

Google Research@GoogleResearch·24 Mar

Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: goo.gle/4bsq2qI

English

5.8K

39K

19.3M

Drake retweeted

Avi Chawla@_avichawla·16 Mar

Big release from Kimi! They just released a new way to handle residual connections in Transformers.

In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. If you consider this across 40+ layers, the hidden state at any layer is just the equal-weighted sum of all previous layer outputs. Every layer contributes with weight=1, so every layer gets equal importance.

This creates a problem called PreNorm dilution, where as the hidden state accumulates layer after layer, its magnitude grows linearly with depth. And any new layer's contribution gets progressively buried in the already-massive residual. This means deeper layers are then forced to produce increasingly large outputs just to have any influence, which destabilizes training.

Here's what the Kimi team observed and did: RNNs compress all prior token information into a single state across time, leading to problems with handling long-range dependencies. And residual connections compress all prior layer information into a single state across depth. Transformers solved the first problem by replacing recurrence with attention. This was applied along the sequence dimension.

Now they introduced Attention Residuals, which applies a similar idea to depth. Instead of adding all previous layer outputs with a fixed weight of 1, each layer now uses softmax attention to selectively decide how much weight each previous layer's output should receive. So each layer gets a single learned query vector, and it attends over all previous layer outputs to compute a weighted combination. The weights are input-dependent, so different tokens can retrieve different layer representations based on what's actually useful. This is Full Attention Residuals (shown in the second diagram below).

But here's the practical problem with this idea. Full AttnRes requires keeping all layer outputs in memory and communicating them across pipeline stages during distributed training.

To solve this, they introduce Block Attention Residuals (shown in the third diagram below). The idea is to group consecutive layers into roughly 8 blocks. Within each block, layer outputs are summed via standard residuals. But across blocks, the attention mechanism selectively combines block-level representations. This drops memory from O(Ld) to O(Nd), where N is the number of blocks. Layers within the current block can also attend to the partial sum of what's been computed so far inside that block, so local information flow isn't lost. And the raw token embedding is always available as a separate source, which means any layer in the network can selectively reach back to the original input.

Results from the paper:
- Block AttnRes matches the loss of a baseline LLM trained with 1.25x more compute.
- Inference latency overhead is less than 2%, making it a practical drop-in replacement
- On a 48B parameter Kimi Linear model (3B activated) trained on 1.4T tokens, it improved every benchmark they tested: GPQA-Diamond +7.5, Math +3.6, HumanEval +3.1, MMLU +1.1

The residual connection has mostly been unchanged since ResNet in 2015. This might be the first modification that's both theoretically motivated and practically deployable at scale with negligible overhead.

More details in the post below by Kimi👇
____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
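To make the difference between the two aggregation rules concrete, here is a minimal pure-Python sketch of what the post describes: a standard residual stream adds every earlier output with weight 1, while Full Attention Residuals scores each earlier output against a single per-layer query vector and mixes them with softmax weights. Everything here is an illustrative assumption rather than the paper's formulation: the function names, the `1/sqrt(d)` scaling, the absence of any normalization on the sources, and the toy vectors are all made up for the example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def standard_residual_input(sources):
    # Standard residual stream: every earlier output is added with weight 1.
    d = len(sources[0])
    return [sum(src[j] for src in sources) for j in range(d)]

def attn_residual_input(query, sources):
    # Hypothetical Full AttnRes step: one query vector per layer scores each
    # earlier output; the layer input is the softmax-weighted combination.
    d = len(query)
    scores = [dot(query, src) / math.sqrt(d) for src in sources]  # scaling is an assumption
    weights = softmax(scores)
    mixed = [sum(w * src[j] for w, src in zip(weights, sources)) for j in range(d)]
    return mixed, weights

# Toy data (made up): the token embedding plus two earlier layer outputs.
embedding = [1.0, 0.0, 0.0, 0.0]
layer_outputs = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 2.0, 0.0]]
sources = [embedding] + layer_outputs
query = [0.0, 0.0, 1.0, 0.0]  # stands in for the layer's learned query

plain = standard_residual_input(sources)
mixed, weights = attn_residual_input(query, sources)
print("standard residual input:", plain)
print("attention weights:", weights)
print("attention residual input:", mixed)
```

In this toy case the query aligns with the second layer's output, so that source receives the largest weight. Because the scores depend on the sources themselves, a different token would produce different weights even with the same query, which is the input dependence the post refers to.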

Kimi.ai@Kimi_Moonshot

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers. 🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth. 🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale. 🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead. 🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains. 🔗Full report: github.com/MoonshotAI/Att…
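The block variant can be sketched in the same spirit. This is a rough illustration under stated assumptions, not the report's implementation: layers are grouped into fixed-size blocks, outputs inside a block are summed as in a standard residual, and each layer attends only over the token embedding, one summed vector per finished block, and the running sum of its own block. The helper names, the `1/sqrt(d)` scaling, and the toy layer function are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def block_attn_input(query, embedding, finished_blocks, partial_sum):
    # Sources: raw embedding, one vector per finished block, and the
    # running sum of the current block if it already has any output.
    sources = [embedding] + finished_blocks
    if partial_sum is not None:
        sources.append(partial_sum)
    d = len(query)
    scores = [sum(q * s for q, s in zip(query, src)) / math.sqrt(d) for src in sources]
    weights = softmax(scores)
    mixed = [sum(w * src[j] for w, src in zip(weights, sources)) for j in range(d)]
    return mixed, len(sources)

def run_toy_network(embedding, queries, layer_fn, block_size):
    finished_blocks = []   # grows to N vectors, not L
    partial_sum = None
    max_sources = 0
    for i, query in enumerate(queries):
        x, n_sources = block_attn_input(query, embedding, finished_blocks, partial_sum)
        max_sources = max(max_sources, n_sources)
        out = layer_fn(i, x)
        # Inside a block, outputs accumulate with weight 1 (standard residual).
        if partial_sum is None:
            partial_sum = out
        else:
            partial_sum = [a + b for a, b in zip(partial_sum, out)]
        if (i + 1) % block_size == 0:
            finished_blocks.append(partial_sum)
            partial_sum = None
    return finished_blocks, max_sources

# Toy setup (made up): 8 layers, 2 blocks of 4, hidden size 2.
embedding = [1.0, -1.0]
queries = [[1.0, 0.0]] * 8
toy_layer = lambda i, x: [0.5 * v + 0.1 for v in x]  # placeholder for a real sub-layer

finished_blocks, max_sources = run_toy_network(embedding, queries, toy_layer, block_size=4)
print("stored block vectors:", len(finished_blocks))
print("most sources any layer attended over:", max_sources)
```

With 8 layers in 2 blocks, no layer in this sketch attends over more than 3 vectors, whereas attending over every earlier layer output would give the last layer 8 sources. That is the intuition behind storing on the order of N block vectors instead of L layer outputs.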

English

213

2.3K

350.9K

Drake retweeted

Kimi.ai@Kimi_Moonshot·16 Mar

English

336

2.1K

13.5K

Drake retweeted

Mrinal@Hi_Mrinal·13 Mar

This was barely a 15-minute read... CRAZY VISUALS, and I know how much effort is needed to make such visuals from scratch... thanks for the read :)

Avi Chawla@_avichawla

x.com/i/article/2031…

English

678

87K

Drake retweeted

Harveen Singh Chadha@HarveenChadha·11 Mar

Anyone who is interested in working at a frontier lab must read this tech report from Nvidia. The data engineering section is amazing, and look at the number of different models they used for synthetic data gen: research.nvidia.com/labs/nemotron/…

English

175

1.8K

109K

Drake retweeted

Claude@claudeai·9 Mar

Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.