SUPSUP

186 posts

@berrroo000

maybe not today but tomorrow!

Joined March 2024
55 Following · 6 Followers
SUPSUP (@berrroo000)
@leerob To solve CTF challenges (the hard ones) and machine challenges like those on Hack The Box
0
0
0
20
Lee Robinson (@leerob)
Where could we improve Composer 2.5? We're working on the next model and would love your feedback. Lots of work to do (our CursorBench evals below) in the coming weeks!
[image]
537
123
2.1K
5.4M
SUPSUP (@berrroo000)
@Zaddyzaddy Totally agree, the new Gemini just sucks at everything
0
0
0
165
SUPSUP (@berrroo000)
I need the old Gemini; the new Gemini sucks, tbh
0
0
0
2
BridgeMind (@bridgemindai)
Gemini 3.5 Flash reminds me of GPT 3.5 Turbo. Insanely fast. Complete garbage output. I asked it to build a Flappy Bird clone. Look at the result. This is pure slop. Then you actually use it and the output looks like a 2022 model wrote it. Google benchmaxed this one. Optimized for benchmarks, not for real work. Speed means nothing if everything it ships needs to be rewritten. Claude Opus 4.7 and GPT 5.5 are slower and it doesn't matter because the output actually works.
52
8
202
17.1K
Google (@Google)
The rumors are true… Today, we’re introducing the Gemini 3.5 model series. #GoogleIO
[image]
540
1.3K
16.9K
814.3K
Logan Kilpatrick (@OfficialLoganK)
Welcome to Gemini 3.5 Flash, our most powerful model to date. It pushes the frontier of intelligence, speed, and cost putting 3.5 Flash in a class of its own. We spent the last 6 months making sure Flash is great for real world use cases. It's available everywhere now!
[image]
436
715
7.2K
583.7K
SUPSUP reposted
BridgeMind (@bridgemindai)
Gemini 3.5 Flash scores 55.1% on SWE-Bench Pro. Claude Opus 4.7 scores 64.3%. Not even close. Google just made a Flash model that beats their own Pro in tool use and agentic tasks. But on real world coding? Still 9 points behind Opus 4.7. GPT 5.5 beats it too at 58.6%. If this is the model Google needed to make a comeback with, it's not there yet on coding. Waiting on Gemini 3.5 Pro. That's where the real test is.
[image]
56
11
256
35K
SUPSUP (@berrroo000)
@GeminiApp If the new Gemini model isn't better than Mythos, it's garbage
0
0
0
630
SUPSUP (@berrroo000)
Arena.ai is the most valid benchmark for AI models
0
0
0
0
Codex Releases (@CodexReleases)
Codex CLI 0.131.0 is out.
Highlights:
- Python SDK moved to openai-codex / openai_codex, with pinned runtime-generated types, concurrent turn routing, and approval modes
- codex doctor added for support-ready diagnostics across runtime, auth, terminal, network, config, and local state
- TUI now shows blended token usage, permissions/approval mode, and effective workspace roots; responsive Markdown tables added
- @ mentions now search files, directories, plugins, and skills in a unified picker
Complete details in thread ↓
[image]
38
62
1.1K
144.7K
SUPSUP reposted
BridgeMind (@bridgemindai)
Gemini CLI with Gemini 3.1 Pro scores 43 on the Coding Agent Index. Dead last. 18 points behind the leader. Google I/O is tomorrow. Gemini 3.2 and Gemini 3.5 are both expected to drop. These models need to be significantly better. Google has the intelligence. The model benchmarks prove it. But the tooling and harness are killing them. Every other lab has a working coding CLI. Google's is last place by a mile. Tomorrow is make or break. I'm testing both models the second they drop.
[image]
22
8
240
16.9K
SUPSUP (@berrroo000)
@Xbow @nicowaisman @moderna_tx Let me tell you something: y’all are playing around. You should make your own AI. You’re just commenting on other AI getting better than you. You’re literally falling every week because AI keeps getting better. You should make your own AI.
0
0
0
51
XBOW (@Xbow)
The era of the annual pentest is officially over. Offense is now autonomous. The lag time between vulnerability discovery and exploit has collapsed. How should security leaders adapt in the post-Mythos era? Join XBOW CISO @nicowaisman and @moderna_tx Deputy CISO Farzan Karimi on June 10 for a virtual coffee and chocolate tasting. We’ll discuss what risk-based security looks like when offensive capability operates at machine speed. RSVP to claim your spot and your coffee & chocolate kit: bit.ly/42zUvxV
[image]
6
0
19
2K
Aryan (@justbyte_)
Who do you think will win the AI race?
- OpenAI
- Anthropic
- Gemini
- Grok
47
1
40
2.8K
SUPSUP (@berrroo000)
0
0
0
3
Kun (@Kunagnes1)
Name the character
[image]
1.8K
1.2K
60.6K
5.2M
Dogan Ural (@doganuraldesign)
𝕏 needs a better Explore page
[image]
236
130
3.1K
721.1K