Alexander Yue (@Alezander907) - Twitter Profile

Pinned Tweet

Alexander Yue@Alezander907·Feb 25

Hi all, I am a 3rd year undergrad at Stanford studying computational physics. I also lead agent evaluations at Browser Use. I'm starting this X account as a place to voice my thoughts on AI and agentic developments

0

1

8

2.4K

Alexander Yue@Alezander907·4h

@evg_goncharenko These ones work great

0

5

Евгений Гончаренко@evg_goncharenko·7h

@Alezander907 how do the open-weight ones hold up on multi-step authenticated flows? running agents on ~300 supplier portals (login + 2FA) and trying to figure out if the cheaper models break too often to be worth it at scale

1

0

5

Alexander Yue@Alezander907·18h

My new top models to use in Browser Use Cloud v4

0

1

6

734

Alexander Yue@Alezander907·11h

Anthropic has now caught on with slack agents

Alexander Yue@Alezander907

I have built custom agents in our browser-use Slack that have boosted my productivity 3x. Each one has persistent learning, builds tools, schedules itself, and edits its own source code. Each has different permissions. They can mention and call each other. The overseer manages all.

0

1

48

Alexander Yue@Alezander907·1d

@sjsjsjsjsjko @larsencc Make an account and DM me what email you use and I will give you credits

0

15

Arian Sabir@sjsjsjsjsjko·1d

@larsencc @Alezander907 API V4

1

0

14

Larsen Cundric@larsencc·2d

Hear me out... Browser Harness but in the Cloud (beta). Built on: > Browsercode (thanks @Alezander907) > AWS AgentCore > Custom Control Plane Try it in the UI, or comment API V4 for early API access ↓

11

0

37

11.1K

Alexander Yue@Alezander907·2d

We reached 100k github stars!

2

1

11

627

Alexander Yue@Alezander907·2d

@LastResort48 @larsencc Make an account and dm me what email you use I’ll give you credits

1

0

8

LastResort@LastResort48·2d

@larsencc @Alezander907 API V4

1

0

66

Alexander Yue@Alezander907·2d

@CastelMaker @larsencc Make an account and dm me what email you use I’ll give you credits

1

0

1

15

Castel@CastelMaker·2d

@larsencc @Alezander907 API V4

1

0

64

Alexander Yue@Alezander907·2d

@martinbowling @larsencc Make an account and dm me what email you use I’ll give you credits

0

23

Martin Bowling@martinbowling·2d

@larsencc @Alezander907 api v4 needs that access bro

1

0

1

229

Alexander Yue@Alezander907·2d

@Restitutor_ @browser_use @larsencc Great! Find us some bugs

0

1

12

Restitutor@Restitutor_·2d

@browser_use @larsencc @Alezander907 I do like a challenge!

1

0

151

Browser Use@browser_use·2d

Browser Harness in the cloud is live (beta). Break it so @larsencc and @Alezander907 can fix it 🫶

Larsen Cundric@larsencc

Hear me out... Browser Harness but in the Cloud (beta). Built on: > Browsercode (thanks @Alezander907) > AWS AgentCore > Custom Control Plane Try it in the UI, or comment API V4 for early API access ↓

6

3

57

7.3K

Alexander Yue@Alezander907·3d

Human memory is still the best memory for agents. Seek to understand everything in your company. Give your agents the context they need. It wouldn’t be better to replace this human layer with LLM memory. Maybe a faster search tool would help though

0

115

Alexander Yue@Alezander907·4d

@_halshin Opus 4.8 gives more refusals about doing browser tasks

0

4

243

Hal Shin@_halshin·4d

@Alezander907 Great to see this beating GPT-5.5, but why is the benchmark against Opus 4.7 and not Opus 4.8?

1

0

976

Alexander Yue@Alezander907·5d

GLM 5.2 is a huge improvement for browser agents, offering a near-Opus-level score and beating GPT 5.5. Minimax M3 gets a Sonnet-level score at just $0.30 input, making it my new best value model (cheaper than DeepSeek V4 Pro). Kimi K2.7 is a +9% improvement over K2.6 but is outclassed by M3.

3

8

63

78.3K

Alexander Yue@Alezander907·Jun 17

@watchereth_ It’s the engine, the car, and the taxi service

0

1

25

Alexander Yue@Alezander907·Jun 17

@watchereth_ Browser Harness is a huge new way to use the browser, but it is just tools, no agent. BrowserCode is an OpenCode fork with Browser Harness included. Browser Use v4 runs BrowserCode on our cloud for you: no installs required, and it looks like a chat app.

3

0

1

42

Alexander Yue@Alezander907·Jun 17

Try it now, completely open source: github.com/browser-use/br…

Russ Salakhutdinov@rsalakhu

Congrats to the @browser_use team for taking the #1 spot on Odysseys, a highly challenging benchmark for long-horizon web agents: odysseys-website.pages.dev/leaderboard Odysseys evaluates realistic, multi-hour web workflows that require sustained planning, memory, reasoning, and verification across numerous websites and tools, far beyond short single-step browser tasks. Exciting progress toward truly capable long-horizon agents.

1

0

2

447

Alexander Yue retweeted

Russ Salakhutdinov@rsalakhu·Jun 16

Congrats to the @browser_use team for taking the #1 spot on Odysseys, a highly challenging benchmark for long-horizon web agents: odysseys-website.pages.dev/leaderboard Odysseys evaluates realistic, multi-hour web workflows that require sustained planning, memory, reasoning, and verification across numerous websites and tools, far beyond short single-step browser tasks. Exciting progress toward truly capable long-horizon agents.

7

13

50

17.8K

Alexander Yue@Alezander907·Jun 16

We forked OpenCode

0

3

96

Alexander Yue@Alezander907·Jun 16

Try v4 in Browser Use Cloud now! Highest scoring agent of all time, now fully hosted

Browser Use@browser_use

Our new agent can find exactly where you are on a map 🤯 We asked Browser Use v4 to play GeoGuessr... It guessed the exact location within 50 km by > Analyzing the 3D view in Google Maps > Finding clues in the environment We just released it. Try it for yourself ↓

0

1

151

Alexander Yue@Alezander907·Jun 11

BrowserCode has now been verified at the top of the Odysseys leaderboard. And it's the same capability in the new browser-use version we released!

0

2

114

Alexander Yue@Alezander907·Jun 7

I have one too

TeslaZoa@TeslaZoa

🚨Jensen Huang gifted Faker a one-of-a-kind graphics card personally signed by him. “Only one in the world. This might be worth a million dollars. I might have to keep this now.” The king of AI handing a legendary gift to the king of League. A truly iconic moment.

0

2

164
