0xGerbot
@gerbot_
4.2K posts
I like building stuff, securing stuff, and hacking stuff* *Stuff = apps
Durban, South Africa · Joined October 2020
1K Following · 598 Followers
0xGerbot retweeted
DiscussingFilm @DiscussingFilm
50 years ago today, this first-ever shot for ‘STAR WARS’ was captured.
DiscussingFilm tweet media
152 replies · 3.7K reposts · 54.7K likes · 796.8K views
0xGerbot @gerbot_
Thanks @Afrihost for nothing ✌️ always "it's not us, it's @vumatel". Why do we even have these ISPs if they don't own any of the fibre infra? Why can't we just deal with Vuma ourselves and stop these "middleman" businesses...
1 reply · 0 reposts · 0 likes · 92 views
0xGerbot retweeted
★ @sivvlp
This angle when the players celebrate and the flag appearing ..
P @perwilo

17 replies · 720 reposts · 13.2K likes · 213.2K views
0xGerbot retweeted
Minga @KillaMinga
this mf spilling everything 😭
20 replies · 87 reposts · 1.3K likes · 508.4K views
0xGerbot retweeted
Shitpost 2049 @shitpost_2049
ZXX
14 replies · 908 reposts · 9.9K likes · 317.7K views
0xGerbot retweeted
talkSPORT @talkSPORT
👋 "Slapped by PSG. Slapped by Newcastle. Slapping week!" Jamie O'Hara couldn't wait to get his revenge on Jason Cundy after Chelsea's nightmare week 🤣
76 replies · 198 reposts · 3K likes · 270.9K views
0xGerbot retweeted
Dhruv @haildhruv
found a website where you can create, program and test electronic hardware. it already has some featured projects. really great if you want to test before building your own hardware
198 replies · 2.3K reposts · 21.1K likes · 1.1M views
0xGerbot retweeted
Avi Chawla @_avichawla
Big release from Kimi! They just released a new way to handle residual connections in Transformers.

In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. If you consider this across 40+ layers, the hidden state at any layer is just the equal-weighted sum of all previous layer outputs. Every layer contributes with weight=1, so every layer gets equal importance.

This creates a problem called PreNorm dilution, where as the hidden state accumulates layer after layer, its magnitude grows linearly with depth. And any new layer's contribution gets progressively buried in the already-massive residual. This means deeper layers are then forced to produce increasingly large outputs just to have any influence, which destabilizes training.

Here's what the Kimi team observed and did: RNNs compress all prior token information into a single state across time, leading to problems with handling long-range dependencies. And residual connections compress all prior layer information into a single state across depth. Transformers solved the first problem by replacing recurrence with attention. This was applied along the sequence dimension. Now they introduced Attention Residuals, which applies a similar idea to depth.

Instead of adding all previous layer outputs with a fixed weight of 1, each layer now uses softmax attention to selectively decide how much weight each previous layer's output should receive. So each layer gets a single learned query vector, and it attends over all previous layer outputs to compute a weighted combination. The weights are input-dependent, so different tokens can retrieve different layer representations based on what's actually useful. This is Full Attention Residuals (shown in the second diagram below).

But here's the practical problem with this idea. Full AttnRes requires keeping all layer outputs in memory and communicating them across pipeline stages during distributed training. To solve this, they introduce Block Attention Residuals (shown in the third diagram below). The idea is to group consecutive layers into roughly 8 blocks. Within each block, layer outputs are summed via standard residuals. But across blocks, the attention mechanism selectively combines block-level representations. This drops memory from O(Ld) to O(Nd), where N is the number of blocks.

Layers within the current block can also attend to the partial sum of what's been computed so far inside that block, so local information flow isn't lost. And the raw token embedding is always available as a separate source, which means any layer in the network can selectively reach back to the original input.

Results from the paper:
- Block AttnRes matches the loss of a baseline LLM trained with 1.25x more compute.
- Inference latency overhead is less than 2%, making it a practical drop-in replacement.
- On a 48B parameter Kimi Linear model (3B activated) trained on 1.4T tokens, it improved every benchmark they tested: GPQA-Diamond +7.5, Math +3.6, HumanEval +3.1, MMLU +1.1.

The residual connection has mostly been unchanged since ResNet in 2015. This might be the first modification that's both theoretically motivated and practically deployable at scale with negligible overhead.

More details in the post below by Kimi👇
____
Find me → @_avichawla
Every day, I share tutorials and insights on DS, ML, LLMs, and RAGs.
Avi Chawla tweet media
Kimi.ai @Kimi_Moonshot

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…

78 replies · 220 reposts · 2.3K likes · 346.2K views
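To make the depth-wise attention concrete, here is a minimal sketch of the full attention-residual idea described in the post above, under stated assumptions: this is not Kimi's code, the AttentionResidualLayer class and the Linear stand-in for the attention/MLP sub-layer are invented for illustration, and only the learned per-layer query with a softmax over previous layer outputs follows the description.

```python
# Minimal, assumption-laden sketch of depth-wise "attention residuals".
# Class name, the Linear stand-in sub-layer, and tensor shapes are made up;
# only the softmax over previous layer outputs follows the post above.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionResidualLayer(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        # Stand-in for the real attention/MLP sub-layer.
        self.sublayer = nn.Linear(d_model, d_model)
        # One learned query vector per layer, used to score previous layers.
        self.layer_query = nn.Parameter(torch.randn(d_model) * d_model**-0.5)

    def forward(self, history: list) -> torch.Tensor:
        # history: outputs of all previous layers (plus the token embedding),
        # each of shape [batch, seq, d_model].
        stacked = torch.stack(history, dim=0)              # [L, B, S, D]
        # Score every previous layer's output against this layer's query.
        # The keys are per-token layer outputs, so the resulting weights
        # differ from token to token (input-dependent).
        scores = torch.einsum("lbsd,d->lbs", stacked, self.layer_query)
        weights = F.softmax(scores, dim=0)                 # softmax over depth
        # Weighted combination replaces the usual equal-weight running sum.
        residual = torch.einsum("lbs,lbsd->bsd", weights, stacked)
        return residual + self.sublayer(self.norm(residual))


# Usage sketch: keep the depth-wise history and let each layer attend to it.
x = torch.randn(2, 16, 64)                                 # [batch, seq, d_model]
layers = nn.ModuleList(AttentionResidualLayer(64) for _ in range(4))
history = [x]
for layer in layers:
    history.append(layer(history))
```

Block AttnRes, as described in the post, would replace the per-layer history here with roughly 8 block-level sums and attend over those instead, which is what brings the memory for the stacked history down from O(Ld) to O(Nd).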
0xGerbot retweeted
Aakash Gupta @aakashgupta
50% of all relationship advice on Reddit is “leave.” 15 years of data, 52 million comments, and the trend line only goes one direction.

A researcher filtered r/relationship_advice down to 1,166,592 quality comments and tracked what people actually recommend.

In 2010, “End Relationship” sat around 30%. By 2025, it’s approaching 50%. “Communicate” dropped from 22% to 14%. “Compromise” collapsed from 7% to 3%. “Give Space” fell from 25% to 13%. Every category that requires patience lost ground every single year.

The one category growing faster than “leave” is “Seek Therapy,” which went from 1% to 6%. The subreddit is slowly learning to say “this is above my pay grade.”

Train a model on this dataset and it would absolutely tell people to break up. The training data is 50% “leave” and climbing. The model wouldn’t be broken. It would be accurately reflecting what 52 million commenters actually believe about your relationship. A 50% prior that you should leave, a 14% prior that you should talk about it, and a 6% prior that you need a professional.

That’s not LLM psychosis. That’s the median human opinion on your relationship, backed by the largest advice dataset ever assembled.
Aakash Gupta tweet media
“paula” @paularambles

LLM that keeps telling people to break up because it’s been trained on relationship advice subreddits

507 replies · 2.1K reposts · 16.7K likes · 2.1M views