M1NDB3ND3R

698 posts

@mindbender08

20 y/o | Linux Enthusiast | Python & C Coder | Exploring Cybersecurity, AI, and Blockchain

Joined July 2022
484 Following · 197 Followers
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@charmcli Does loading on demand give any advantages, or is it similar to deferred loading?
1
0
1
66
Charm
Charm@charmcli·
MCPs, without the config. Crush now loads them on demand with Docker.
7
4
68
8.4K
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@thdxr I don't know why we need AI in a password manager 😔
0
0
0
153
dax
dax@thdxr·
ai password manager is that anything?
146
1
379
56.9K
M1NDB3ND3R retweeted
Avi Chawla
Avi Chawla@_avichawla·
Big release from Kimi! They just released a new way to handle residual connections in Transformers.

In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. If you consider this across 40+ layers, the hidden state at any layer is just the equal-weighted sum of all previous layer outputs. Every layer contributes with weight=1, so every layer gets equal importance.

This creates a problem called PreNorm dilution: as the hidden state accumulates layer after layer, its magnitude grows linearly with depth, and any new layer's contribution gets progressively buried in the already-massive residual. Deeper layers are then forced to produce increasingly large outputs just to have any influence, which destabilizes training.

Here's what the Kimi team observed and did:

RNNs compress all prior token information into a single state across time, leading to problems with handling long-range dependencies. And residual connections compress all prior layer information into a single state across depth. Transformers solved the first problem by replacing recurrence with attention, applied along the sequence dimension. Now they introduce Attention Residuals, which applies a similar idea to depth.

Instead of adding all previous layer outputs with a fixed weight of 1, each layer now uses softmax attention to selectively decide how much weight each previous layer's output should receive. Each layer gets a single learned query vector and attends over all previous layer outputs to compute a weighted combination. The weights are input-dependent, so different tokens can retrieve different layer representations based on what's actually useful. This is Full Attention Residuals (shown in the second diagram below).

But here's the practical problem with this idea. Full AttnRes requires keeping all layer outputs in memory and communicating them across pipeline stages during distributed training. To solve this, they introduce Block Attention Residuals (shown in the third diagram below).

The idea is to group consecutive layers into roughly 8 blocks. Within each block, layer outputs are summed via standard residuals. But across blocks, the attention mechanism selectively combines block-level representations. This drops memory from O(Ld) to O(Nd), where N is the number of blocks. Layers within the current block can also attend to the partial sum of what's been computed so far inside that block, so local information flow isn't lost. And the raw token embedding is always available as a separate source, which means any layer in the network can selectively reach back to the original input.

Results from the paper:
- Block AttnRes matches the loss of a baseline LLM trained with 1.25x more compute.
- Inference latency overhead is less than 2%, making it a practical drop-in replacement.
- On a 48B-parameter Kimi Linear model (3B activated) trained on 1.4T tokens, it improved every benchmark they tested: GPQA-Diamond +7.5, Math +3.6, HumanEval +3.1, MMLU +1.1.

The residual connection has mostly been unchanged since ResNet in 2015. This might be the first modification that's both theoretically motivated and practically deployable at scale with negligible overhead.

More details in the post below by Kimi👇
____
Find me → @_avichawla

Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
Avi Chawla tweet media
Kimi.ai@Kimi_Moonshot

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗Full report: github.com/MoonshotAI/Att…

78
221
2.3K
343.2K
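The depth-wise attention idea in the thread above can be sketched in a few lines. This is a toy illustration with NumPy, not the Kimi implementation: the shapes, the single query vector, and the random stand-in values are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, L = 16, 6                       # hidden size, layers accumulated so far

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def standard_residual(outputs):
    # PreNorm-style accumulation: every layer contributes with weight 1,
    # so the hidden state's magnitude grows with depth (the dilution problem).
    return outputs.sum(axis=0)

def attention_residual(outputs, query):
    # Score each previous layer's output against a learned query vector,
    # then mix the outputs with input-dependent softmax weights.
    scores = outputs @ query / np.sqrt(d)
    weights = softmax(scores)          # sums to 1: a convex combination
    return weights @ outputs

outputs = rng.normal(size=(L, d))      # stand-ins for per-layer outputs
query = rng.normal(size=d)             # stand-in for a learned query vector

h_std = standard_residual(outputs)     # norm grows roughly with L
h_attn = attention_residual(outputs, query)  # norm stays bounded
```

Because the attention weights form a convex combination, the mixed state's norm can never exceed the largest per-layer output norm, which is the hidden-state-growth fix the thread describes; Block AttnRes applies the same mixing over ~8 block-level sums instead of over every layer.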
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@Dhanush_Nehru I guess people will now curse AI, even for the friend who used AI and saved his job
1
0
2
314
Dhanush N
Dhanush N@Dhanush_Nehru·
Layoffs Layoffs Layoffs Everywhere
13
0
21
3.3K
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@Dhanush_Nehru But agents are using a method called defer_load to lazily search tools and send them to the model, so the context doesn't fill up
1
0
1
20
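The deferred tool loading mentioned in the reply above could look roughly like this. Every name here (`TOOL_REGISTRY`, `context_stubs`, `load_tool`, the tool names) is hypothetical, made up for illustration rather than taken from any specific agent framework: only one-line summaries go into the model's context, and a tool's full schema is built on demand when the model actually selects it.

```python
# Hypothetical sketch of deferred ("defer_load"-style) tool loading.
TOOL_REGISTRY = {
    "search_docs": {
        "summary": "Search project documentation.",
        # Full schema is built lazily, only when the tool is selected.
        "loader": lambda: {
            "name": "search_docs",
            "parameters": {"query": "string", "limit": "integer"},
        },
    },
    "run_tests": {
        "summary": "Run the test suite.",
        "loader": lambda: {"name": "run_tests", "parameters": {}},
    },
}

def context_stubs(registry):
    # Only short summaries enter the prompt, keeping context small
    # no matter how many tools are registered.
    return {name: spec["summary"] for name, spec in registry.items()}

def load_tool(registry, name):
    # Fetch the full definition on demand, after the model picks a tool.
    return registry[name]["loader"]()

stubs = context_stubs(TOOL_REGISTRY)              # sent to the model up front
schema = load_tool(TOOL_REGISTRY, "search_docs")  # fetched only when needed
```

The context cost then scales with the number of tool summaries rather than with the size of every tool's full schema, which is the point of the defer_load approach described in the reply.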
Dhanush N
Dhanush N@Dhanush_Nehru·
MCP was supposed to be the universal plug for AI tools. Connect once, use everywhere. But in reality:

🔴 Every tool you add fills up the AI's "thinking space" with instructions
🔴 The more tools, the less room to actually do the work
🔴 Auth between services is a mess

So teams are ditching the fancy protocol and going back to direct API calls and CLIs. Simpler. Faster. More reliable. The best technology is often the one that gets out of the way.
Morgan@morganlinton

The cofounder and CTO of Perplexity, @denisyarats just said internally at Perplexity they’re moving away from MCPs and instead using APIs and CLIs 👀

3
0
1
244
Vivo
Vivo@vivoplt·
be honest, which AI tool is best for coding?
Vivo tweet media
522
93
2.4K
360.8K
Yigit Konur
Yigit Konur@yigitkonur·
the new “mission” (preview) feature in @FactoryAI is really interesting (aka “droid” on the CLI). if you’re into one‑shotting projects, you should definitely check it out. right now i have opus + gpt‑5.4 collaborating as orchestrator/worker/validator agents, all working together to refactor a typescript project. it’s been running for 6+ hours. really curious why it takes that long and burns 30M+ tokens. hoping the results will amaze me, because i already spent all my credits in the first hour. now i’m using my codex sub and keeping the droid subs as the orchestrator only. will update this tweet with the results!
Yigit Konur tweet media
11
3
57
6.5K
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@ZohoWorkplace @moulidorai It's one of the best password managers, offering enterprise features for free, and I don't know why people aren't using it 😭
1
0
2
40
Zoho Workplace
Zoho Workplace@ZohoWorkplace·
Is your password manager asking for a raise? That’s your cue to switch teams 😉 Try Zoho Vault, included at no extra cost with your Zoho Workplace suite. 🚀 Secure and share your team's passwords, cards, and other confidential information with confidence. 🔐 Try Zoho Vault today 👉🏻 zurl.co/yWNZ7
Zoho Workplace tweet media
1
3
4
329
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@KalGrinberg Is droid better than claude or opencode? I see it's better on Terminal-Bench, but I only see posts about those two, so I'm confused
0
0
0
41
Dhanush N
Dhanush N@Dhanush_Nehru·
Let's settle this. Golang or Rust?
Dhanush N tweet media
12
1
10
1.1K
Dhanush N
Dhanush N@Dhanush_Nehru·
what next?
Dhanush N tweet media
4
0
4
238
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@zack0x01 Is it going to be public or private? If public, I'm willing to contribute
1
0
0
65
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@zack0x01 Is it from scratch, or using existing coding agents and piling on MCP servers, custom tool calling, skills, sub-agents, etc.?
1
0
0
370
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@inkdrop_app I just converted it to INR: ₹45,45,37,50,500.00 😅 Hope someday I can afford it
0
0
1
15
Zed
Zed@zeddotdev·
Coming this Wednesday...
Zed tweet media
33
11
400
40.2K
M1NDB3ND3R
M1NDB3ND3R@mindbender08·
@Dhanush_Nehru It's going to be the greatest if it concentrates on physics, medical science, and quantum computing, and tries to find meaningful solutions to real-world problems. Technologies are evolving day to day, but we are not integrating them to solve problems
1
0
1
24
Dhanush N
Dhanush N@Dhanush_Nehru·
The way AI is evolving it will either be the greatest invention humanity ever built or the last mistake we ever make.
8
1
11
276
TCM Security
TCM Security@TCMSecurity·
@mindbender08 Bookmark the link to keep it on hand - and we will share when we open internships in the future!
1
0
1
12
Dhanush N
Dhanush N@Dhanush_Nehru·
@mindbender08 First gen to ship code without Stack Overflow is crazy to think about
1
0
0
10
Dhanush N
Dhanush N@Dhanush_Nehru·
being the last generation to use stack overflow
Dhanush N tweet media
11
1
20
572