Máté Gelei

659 posts

@MateGelei

Experienced #DevOps engineer with an MBA and a background in IT #ServiceManagement. Usually rambling about #cloud, #finops, and #AI. Tweets/opinions are my own.

Budapest, Hungary · Joined July 2024

60 Following · 31 Followers

Pinned Tweet

Máté Gelei@MateGelei·3 Oct

i'm sure it's just a bug 💀

Sauers@Sauers_

224

Máté Gelei@MateGelei·15h

@ClementDelangue "Hey Claude, create a GitHub workflow: if a user has 2 open PRs and submits a 3rd, close all of their PRs." You're welcome.

clem 🤗@ClementDelangue·1d

Our biggest open-source repos are getting overwhelmed by AI slop, which literally makes GitHub unusable (~a new pull request every 3 minutes). Fun new challenges in an agentic world!

160

103

1.2K

184.2K

Máté Gelei retweeted

The Untraceable@untraceable_the·1d

@bookercodes You see slop, I see free tokens. Hope it's using Opus tho

143

10.3K

Máté Gelei@MateGelei·16h

@BrianRoemmele @elonmusk @grok So you're saying you have access to Heavy via the API? Interesting.

Brian Roemmele@BrianRoemmele·1d

Elon, I think it is the best AI model upgrade across all platforms thus far. I have convinced the last OpenAI holdout clients to move to X.ai APIs. The absolute tonnage of @Grok Heavy in lifting power is stunning and closed the last holdout. We will need the space telescope to observe number two so far behind. Thank you and the team!

202

161

1.5K

24.2M

Elon Musk@elonmusk·1d

What are your initial impressions of Grok 4.20? Major upgrades are still landing every week.

Testlabor@testerlabor

Grok 4.20 is now officially out of Beta. It's now on Auto, Fast, Expert & Heavy.

7.3K

3.5K

25.3K

7.3M

Máté Gelei@MateGelei·2d

@johncrickett Right, about 10 years ago we used Visual Studio, which was an IDE several GBs in size. There are people with several hundred lines in their .vimrc file. But God forbid an AI harness have 5-6 different methods with different purposes and mechanics to influence the underlying agent.

John Crickett@johncrickett·3d

I spent the weekend actually reading the Claude Code docs. It's a rabbit hole. CLAUDE.md files. MCP configs. Skills. Subagents. Hooks. Plugins. Agent Teams. You could spend more time configuring Claude Code than building software. All of it is productivity theatre. The only thing that actually matters: think first, then give it focused, relevant context.

113

735

77.8K

Máté Gelei@MateGelei·3d

@zackslab @svpino you must be joking

zack's lab@zackslab·3d

@MateGelei @svpino you could have easily said this to a human and they'd have done the same thing. if it kicks off at 8AM but doesn't return results until 8:03AM, it didn't happen at 8AM.

Santiago@svpino·3d

Claude is retarded. All of these models are. I wanted to schedule a skill every day at 8:00 am. Claude decided to schedule it at 7:57 am "to avoid the on-the-dot" surge. I SAID 8:00 AM! Do the darn thing the way I asked you to do it! You gotta be crazy to trust these models.

260

877

146.8K

Máté Gelei@MateGelei·3d

@zackslab @svpino "every day at 8am" seems pretty explicit to me

zack's lab@zackslab·3d

@svpino make your spec explicit. no different than assigning humans tasks.

7.6K

Máté Gelei@MateGelei·3d

@Kartikez @amritwt Idk, like 2 weeks ago?

Kartik Sarjine@Kartikez·3d

@amritwt If I am right, it's been a while since Google released a new model? When Antigravity came out, it came with 3.1 Pro.

463

amrit@amritwt·3d

This is the AI leaderboard on code. Top five is all Anthropic. Then there's 5.4 high. Gemini doesn't even feel like it's anything. Among the beasts, we have two open source models here.

134

6.9K

Máté Gelei@MateGelei·3d

@lamxnt Wait until you actually use it

riley.@lamxnt·6d

Buying a Gemini subscription is genuinely the most user unfriendly experience I’ve ever had

111

1.7K

126.5K

Máté Gelei@MateGelei·10 Mar

@elonmusk Okay but does it exist?

Elon Musk@elonmusk·9 Mar

This is just Grok Imagine 1.0. V1.5 is a major upgrade.

Déborah@dvorahfr

Grok Imagine is the only one that maintains the style throughout the entire extension. I used to only be able to keep 50% of the extension with other solutions, but with Grok I have no breaks and perfect fluidity. The Baroque style represented by Grok Imagine. I think Grok's doing a great job.

3.1K

3.9K

26.2K

9.5M

Máté Gelei@MateGelei·10 Mar

@_ashleypeacock @somi_ai @cryptopunk7213 I haven't said a word about LLMs. Not one. All I've said is that the time needed to review a PR is not (only) dependent on the amount of LOCs changed. That's all.

Ashley Peacock@_ashleypeacock·10 Mar

@MateGelei @somi_ai @cryptopunk7213 If they take humans 2-3 hours and are not trivial, I wouldn’t trust an LLM with them either. You can likely get a steer from an LLM much cheaper anyway

Ejaaz@cryptopunk7213·9 Mar

this is fucking ridiculous lol - anthropic just killed a $50B industry with a single feature (again): - companies pay $50K a year to scan their code for vulnerabilities. - Anthropic's Code Review does it for you in minutes for a fraction of the cost. - deploys multiple agents to hunt for bugs in your code. internal results show it's amazing (84% hit rate on 1000+ line code base). for comparison: anthropic cost = $15-25 PER review, trad competitor cost = $99+. complete fucking no brainer. watch the appsec stocks react to this one

Claude@claudeai

Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.

232

179

3.1K

904.3K

Máté Gelei@MateGelei·10 Mar

@_ashleypeacock @somi_ai @cryptopunk7213 I don't understand what you're trying to say. Validating security fixes is not trivial. It takes time. "2-3 hours" isn't even that much.

Ashley Peacock@_ashleypeacock·10 Mar

@MateGelei @somi_ai @cryptopunk7213 That’s a different ball game 😅 OSS isn’t paying for code review from AI, and in an enterprise company, that would typically be reviewed in a timely manner

Máté Gelei@MateGelei·10 Mar

@MAJIK_LoEP @chatgpt21 @grok Oh, it's the "next year" thing all over again. Cool.

𝐌𝐀𝐉𝐈𝐊/𝕤𝕥𝕦𝕕𝕚𝕠𝕤@MAJIK_LoEP·10 Mar

xAI commands the long-term timeline. They have more compute than anyone and more ability to add compute than anyone. 1-2yrs is more than enough for them to easily pull ahead without doing anything different. @Grok has the strongest model foundation and the most runway. Model ≠ Pipeline ≠ Tools

532

Chris@chatgpt21·9 Mar

Grok 5 negative 4 months ago. Where is the bigger Grok 4.2? I don’t want XAI to fall out of the race however OpenAI is already gearing up their next major releases (plural)

Elon Musk@elonmusk

Grok 5 will be out before the end of this year and it will be crushingly good

523

46.2K

Máté Gelei@MateGelei·10 Mar

@_ashleypeacock @somi_ai @cryptopunk7213 Open source

Ashley Peacock@_ashleypeacock·10 Mar

@MateGelei @somi_ai @cryptopunk7213 Open source or enterprise?

Máté Gelei@MateGelei·10 Mar

@RhysSullivan This is on you, folks. My doorbell doesn't even require a phone.

1.4K

Rhys@RhysSullivan·10 Mar

I have to pay a monthly subscription to get notifications from my doorbell are you fucking kidding me

914

38.1K

Máté Gelei@MateGelei·10 Mar

@_ashleypeacock @somi_ai @cryptopunk7213 It's not about the number of lines changed. I had a PR with 4 lines (four!) to a crypto lib that fixed a bug around salting passwords. It took hours, if not days, to validate.

Ashley Peacock@_ashleypeacock·10 Mar

Who are these senior engineers and how are they spending 2-3 hours reviewing one big PR? It would have to be thousands upon thousands of lines across 100’s of files, at which point… it’s way too big, gets broken down, and reviewed in sensible chunks (that won’t take 2-3 hours, even combined across broken down PRs)

113

Máté Gelei@MateGelei·9 Mar

@dudufolio Mistral isn't much behind free-tier ChatGPT, which is the version of ChatGPT that the absolute majority of people interact with.

dudu@dudufolio·9 Mar

Imagine being this European

9.4K

Máté Gelei@MateGelei·9 Mar

You need to understand that as a paying customer I can only compare existing products that I can actually buy. Gemini having huge potential is not something I can actually use in my job.

424

Chayenne Zhao@GenAI_is_real·9 Mar

gemini hasnt failed, people just judge AI labs on the wrong timeline. google has the deepest research bench in the world, the most compute, and distribution across 2 billion chrome users. the problem is that big company evolution has natural blockers - review cycles, cross-team dependencies, launch approvals - that slow down iteration speed. anthropic and openai move fast because theyre still small enough to. but google has been through this cycle before with search, cloud, android. they start slow and then the institutional gravity kicks in. i would not count them out @pcshipp

pc@pcshipp

Still I’m wondering why Gemini fails against Claude and GPT. - Owns Chrome - Backed by Android - Stores most search results - Holds ~95% search history - Google has the biggest user data - Even incognito data isn’t fully private So what’s the problem?

438

49.3K

Máté Gelei@MateGelei·8 Mar

@justbyte_ Yes, I'll just use strings instead of char[]s, who cares anyway

138