Sesem_ Ag

249 posts

@pisun282

Joined April 2024
10 Following · 1 Follower
Logan Kilpatrick
Logan Kilpatrick@OfficialLoganK·
Gemini 3.5 Flash ranks #1 on Automation Bench (from Zapier), beating every other frontier model at a much lower cost
173
50
1.1K
89.4K
TERMINAL WAR
TERMINAL WAR@TerminalWarGame·
We're testing Terminal War internally. We believe early feedback is valuable but there is much to improve already. Would you be interested in a playtest this simple?
123
44
971
120K
Chubby♨️
Chubby♨️@kimmonismus·
I’ve been invited by Google to attend its annual I/O conference as part of the Builders Program, and I’m incredibly excited. It’s my first time at Google, and this time I brought a camera with me to capture the experience and create a recap video afterward. During the event, I’ll be conducting two fascinating interviews with Google employees, focusing on AI. These will be published at a later date. I’ve already met some amazing people from the community. Here’s to two unforgettable days!
67
18
929
69.7K
Sesem_ Ag
Sesem_ Ag@pisun282·
@synthwavedd i can feel "vibecoded" shit just by looking. What's wrong with you, leo?
0
0
0
175
leo 🐾
leo 🐾@synthwavedd·
I've been testing Gemini 3.5 Flash for a little while now, and I'm excited to be able to share one of the outputs that most impressed me! This was 0-shot, no harness, with a single sentence prompt. It outperformed all Claude models, Gemini models (by far), and arguably GPT-5.5 🔥 The issue of laziness that has plagued Gemini models forever has mostly been consigned to history.
48
25
540
49.4K
casualnpc
casualnpc@acasualnpc·
@OfficialLoganK Holy shit, SOTA Gemini video model. Seedance is a thing of the past. This changes everything 💀
2
0
7
4.8K
Sesem_ Ag
Sesem_ Ag@pisun282·
@7a7zz @LexnLin Ughhhh uhh but b-but, it will be on the first place of all benchmarks! Ooga booga boga
1
0
1
41
7A7z
7A7z@7a7zz·
@LexnLin why so obsessed with gemini mate, they always disappoint 😭
4
0
3
447
Chubby♨️
Chubby♨️@kimmonismus·
I love GPT-5.5. It's a workhorse and exactly the model I was hoping for. But the fact that rumors say version 5.6 is already in the starting blocks makes me even more excited! OpenAI is on fire.
80
31
1.4K
44.4K
Sesem_ Ag
Sesem_ Ag@pisun282·
@Fabiobuilds @synthwavedd @arena man be real, nobody is using flash models or even gemma if they have access to at least 3.1 pro. Not even talking about other models like claude or gpt
0
0
0
34
Fabio Builds
Fabio Builds@Fabiobuilds·
@synthwavedd @arena Google is already doing excellent work with small and local models; I think they are not prioritizing pro models right now. They seem to have left that game entirely to OpenAI and Anthropic (which are doing quite badly in small / local models)
1
0
0
1.4K
leo 🐾
leo 🐾@synthwavedd·
so apparently gemini 3.2 pro is being tested under "gemini-3.1-pro" on @arena's Code Arena (they have done this kind of stealth testing before) ...and if this is really 3.2 pro, it's not looking good. somehow they gpt-ified frontend? hopefully this is an arena-specific quirk
28
12
471
40K
ℏεsam
ℏεsam@Hesamation·
> 12M context window (read it again)
> 52x faster than FlashAttention
> beats Opus 4.6 on SWE-Bench
> 5% the cost of Opus
BUT WAIT A MINUTE:
> technical blog not technical
> access coming soon
> paper coming soon
> “Built by researchers from Meta, Google, Oxford, Cambridge, BYU” doesn’t name a single one of them
if this is not a scam, or the numbers aren’t dishonest, it’s disgustingly promotional.
Alexander Whedon@alex_whedon

Introducing SubQ - a major breakthrough in LLM intelligence. It is the first model built on a fully sub-quadratic sparse-attention architecture (SSA), and the first frontier model with a 12 million token context window, which is:
- 52x faster than FlashAttention at 1MM tokens
- Less than 5% the cost of Opus
Transformer-based LLMs waste compute by processing every possible relationship between words (standard attention). Only a small fraction actually matter. @subquadratic finds and focuses only on the ones that do. That's nearly 1,000x less compute and a new way for LLMs to scale.

55
47
1.3K
123.4K
Escape from Tarkov: Arena
Escape from Tarkov: Arena@tarkovarena·
Gladiators, we invite you to test your knowledge of weapon mods in #TarkovArena. Can you identify all the attachments? Share your answers in the comments!
22
24
528
78.9K
Flowers ☾
Flowers ☾@flowersslop·
Gameplay screenshot of GTA but it is in the Harry Potter world, Hogwarts in the background, wand equipped
7
4
158
11K
Discord
Discord@discord·
please enjoy
4.5K
4.1K
62.6K
14.7M
Sesem_ Ag
Sesem_ Ag@pisun282·
@kimmonismus eh, still no real benefits except for benchshits. Yet another flash level model that loops as fuck
0
0
0
52
Chubby♨️
Chubby♨️@kimmonismus·
A 12-month time difference between Gemma 3 27b and Gemma 4 31b. The jump is absolutely enormous. Just look at the evaluations between the two models. GPQA doubled, AIME 2026 went from ~20% to ~90%, and so on. Crazy.
ollama@ollama

Learn more: ollama.com/library/gemma4

36
47
622
45.6K
Sesem_ Ag
Sesem_ Ag@pisun282·
@Klampzy @MaD_MrL Would you watch a movie about a game you like if every character looked like shit? No, cuz nobody likes watching shit, it's just facts
0
0
2
51
Sesem_ Ag
Sesem_ Ag@pisun282·
@wholyv A smaller model will be more stupid; it literally has less knowledge about everything. Corps are trying to compensate for it with thinking, but, looking at gpshit 5.4, it looks terrible. Just look at claude models.
1
0
1
16
lyv ⌘
lyv ⌘@wholyv·
🚨 We all thought Apple was falling behind in the AI race; it turned out they had already crossed the finish line and were waiting for the world to catch up.
The entire world is chasing the idea that bigger model -> better, but Apple took edge compute seriously and started working on it much earlier, even before openclaw became a thing. We all know how obsessed Apple is with privacy; it’s no wonder they didn’t want an LLM to get access to your private data.
So what’s Apple’s plan? Apple’s plan is edge AI. It aims to provide small, efficient, and cheap models installed directly on your phone, which you can access privately and securely, without uploading your data to any LLM, for a fraction of the regular cost.
Won’t small models be stupid? Quite the opposite, actually. Alibaba launched 4 small models and all of them beat Opus 4.5 on many benchmarks. The goal here is targeted use of models, not making them general purpose.
Regardless, we will soon see something very cool in the space of AI that will completely redefine how we work with it. This time, it will come from Apple.
Ejaaz@cryptopunk7213

x.com/i/article/2029…

2
0
5
508