Wing Chan

342 posts

@sourceful_wing

I help early-stage CPG founders ship product faster. Research @sourceful. Product packaging helps you stand out and tell your brand story.

Joined December 2024

141 Following · 50 Followers

Wing Chan@sourceful_wing·3d

@Designarena Without having vision too?

316

Design Arena@Designarena·4d

x.com/i/article/2067…

232

2.3K

1.6M

Wing Chan@sourceful_wing·3d

@SubarcticRec @VadimStrizheus What quantisation? There should be community pools where you spread the cost within a private pool

Heba AI@SubarcticRec·3d

@VadimStrizheus Deploy the model to RunPod and you control the limits with your wallet.

122

Vadim@VadimStrizheus·3d

What’s all this hype about GLM 5.2 being AGI?? I’m constantly getting rate limits and errors.

7.7K

Wing Chan@sourceful_wing·4d

Day 2.

Wing Chan@sourceful_wing

Day 1 of asking @zoink if he will consider adding RF2.5 to Figma Weavy and Figma. Especially now that I know his email inbox is not the right route.

Wing Chan@sourceful_wing·6d

@chetaslua Lite version. Speed is good

824

Chetaslua@chetaslua·6d

🚨 Google New Image Model > Instant-ramen (successor of nano-banana). Ramen is cooked, time to serve soon. We will share results as soon as we get hands on it 😉

952

132.9K

Wing Chan@sourceful_wing·6d

@reach_vb So impressed. When codex mobile first came out I was pretty worried due to the lag/connection issues, it felt really unfinished. But now it is rock solid. Congrats on finding the path through, probably helped by a mythical beast of a model yet to be unleashed.

109

Vaibhav (VB) Srivastav@reach_vb·6d

Codex Mobile updates:
- browse workspace files and link paths into prompts
- pick a workspace folder when starting a new thread
- expand or collapse all diffs while reviewing changes
- approve MCP actions for one chat or across chats
- LaTeX rendering in Codex messages and plans
- clearer status for running threads, queued prompts, side chats, and subagents
- better pairing, onboarding, reconnects, host refresh, and thread performance
- improved Codex profile sharing, activity history, settings, transcript layout, and assistant actions
- smoother /goal workflows from mobile
- fixes for stuck swipes, duplicate messages, subagent rows, and misleading connection errors

207

18.5K

Wing Chan@sourceful_wing·6d

@MicahBerkley @SubarcticRec I'm excited to move to a benchmark that is the cost to finish a project to MVP launch state. And I think for most cases the overall cost will be dominated by the human time spent and not the token cost, unless you are using fable etc. Nothing great can really be one shot.

Micah Berkley - TheAIMogul@MicahBerkley·6d

@SubarcticRec valid... but still there's a nest of first to market guys that I'm used to seeing on my timeline.

Micah Berkley - TheAIMogul@MicahBerkley·6d

I haven't seen anything dope designed by GLM-5-2 yet. Would have thought to have seen a flooded timeline... #KanyeShrug

Design Arena@Designarena

BREAKING: GLM-5.2 is now 1st on Design Arena. With an Elo of 1360, GLM-5.2 has jumped ahead of the now unavailable Claude Fable 5. And it's open weights. This is an improvement of 4 positions and 27 Elo points to achieve one of the highest Elo scores in our code categories since Design Arena started. Huge congratulations to the @Zai_org on the release!

148

Wing Chan@sourceful_wing·6d

@SubarcticRec Simple text it does great. Struggles if more than a few words, probably a model size issue.

Heba AI@SubarcticRec·6d

Have not even tested before, but Flux.2 Flash and ZiB both do correct text.

Wing Chan@sourceful_wing·6d

Day 1 of asking @zoink if he will consider adding RF2.5 to Figma Weavy and Figma. Especially now that I know his email inbox is not the right route.

Riverflow@riverflow_ai

Riverflow 2.5 Pro just topped the charts. Across all three categories on @Designarena's global benchmarking - Image, Graphic Design and Image Editing - Riverflow 2.5 Pro came in at #1. Design Arena is powered by real user voting, the people who we created it for. The results speak for themselves:
#1 Image
#1 Graphic Design
#1 Image Editing
#3 Logo
Beating GPT Image 2, Gemini 3 Pro, Ideogram and every other model in the field. Riverflow 2.5 Pro is available now on OpenRouter, Runware and Replicate. Or get in touch with us directly.

447

Wing Chan@sourceful_wing·6d

@ArtificialAnlys Great to see the care and attention in keeping these single number measures refreshed. Thanks for sharing

Artificial Analysis@ArtificialAnlys·6d

Following up on our Intelligence Index v4.1 release yesterday, in the video below, Daniel from our team shares a short overview of what's changed:
1. Three upgraded evaluations: Terminal-Bench 2.1, τ³-Bench Banking and GDPval-AA v2
2. Cost, time, and tokens per task: Understand the cost, time, and tokens of tasks across our Index and for individual evals, and how these trade off against Intelligence
3. Cached input token reporting: We now report the amount of cached tokens a particular model uses and how this influences cost

12.2K

Wing Chan reposted

Design Arena@Designarena·6d

BREAKING: Riverflow Pro 2.5, a reasoning model by @riverflow_ai that calls a mix of proprietary and open diffusion models, has scored 1st on Image Arena (Models + Routers), 1st on Graphic Design Arena, and 1st in Image Edit (Models + Routers). Riverflow Pro 2.5 averages 10 Elo points above GPT Image 2 from @OpenAI in Image, Image Editing, and Graphic Design. It also establishes Pareto frontiers across Image, Image Editing, and Graphic Design in Preference vs. Speed. Congratulations to the @riverflow_ai team on the launch!

298

25.2K

Wing Chan@sourceful_wing·Jun 15

@grx_xce @MistralAI Lucky they aren't drawing cat memes yet phew

3.5K

Grace Li@grx_xce·Jun 15

BREAKING: Le Chaton Fat has fully saturated our benchmark. We are at a loss for words. In response, we are retiring Design Arena. Congratulations to the @MistralAI team, and thanks for putting us on vacation.

1.2K

91.6K

Wing Chan@sourceful_wing·Jun 14

This is a high-difficulty problem to solve. It's another form of alignment. To help me make my code secure and good, I need the model to know how to break it. To stop it from attacking others, I need the model to resist me. But it depends who is asking, and that's non-trivial. Think about the surface area, with agents, context, tool calling, harnesses etc. How do you verify? How do I trust the result of a tool call? We end up in the same place, training the model to make moral (i.e. non-trivial, non-binary, grey-area) decisions based on some imperfect principles and imperfect data. The hardest part of all this is that since model training now exists across a broad array of actors, there always exists the incentive to offer a version slightly more permissive and greedy.

Colin | clerk.com@tweetsbycolin

The jailbreak we found convinced Fable that it wrote our code, so it was willing to look for issues. Not too surprising if there were other vectors besides the one we found. Must be hard to have an LLM that can author secure code but not check if “other” code is secure.

Wing Chan@sourceful_wing·Jun 13

@Salmaaboukarr Have you tried minimax m3

151

Salma@Salmaaboukarr·Jun 13

this claude fable 5 drama made me realise how important it is to have a backup plan and not rely on these labs! time to go back to LOCAL MODELS + a v good harness
the models i use
- qwen
- kimi
- ds

5.7K

Wing Chan@sourceful_wing·Jun 13

@grx_xce Alive? Good

Grace Li@grx_xce·Jun 12

He still hasn’t woken up yet

3.6K

Wing Chan@sourceful_wing·Jun 13

@nicdunz Automation?

119

nic@nicdunz·Jun 13

1. i did not get the option to save this reset and not use it
2. why is my usage already a few percent down when i haven't even used it yet?

6.8K

Wing Chan@sourceful_wing·Jun 11

@jackfriks @postbridge_ Great journey!

jack friks@jackfriks·Jun 11

can't believe this was only 18 months and 8 weeks ago...

104

674

52.9K

Wing Chan@sourceful_wing·Jun 11

@Sauers_ Fat if true

1.1K

Sauers@Sauers_·Jun 11

Big if true

247

106

4.7K

2.2M

Wing Chan@sourceful_wing·Jun 11

@Ronanchamberss @KanishkaNarayan @etnshow Cover more stories outside of London too please! Bring the whole of the UK for the journey

130

Ronan@Ronanchamberss·Jun 11

After speaking with UK AI Minister @KanishkaNarayan today on @etnshow, I have it on good authority that, we are indeed, so back. Best, R

4.4K

Wing Chan@sourceful_wing·Jun 10

The car is not really the deliverable though

Justine Moore@venturetwins

I just got bullied by AGI

Wing Chan@sourceful_wing·Jun 9

@petergostev Should get 10x points for loc reduction if all tests pass. Code golfing is fun but the principle applies. Verbosity is a feature of bad compression.

English

Peter Gostev (SF: 22-26 June)@petergostev·Jun 9

if they are counting loc i'm screwed

2.9K
