Yennie Jun reposted
Yennie Jun
326 posts

Yennie Jun
@ArtFishAI
ai research @GoogleDeepMind || Prev. @Microsoft, @UniofOxford, @UNGlobalPulse, @DeepLearningAI_
New York, USA · Joined October 2011
543 Following · 339 Followers

@IlyaAbyzov @karpathy Great to see this tested on newer reasoning models!
The UI is also really cool!
Earlier, as a fun project, I did a similar experiment. In addition to "which model is the best", there were some interesting findings as well:
artfish.ai/p/llm-codename…

Inspired by @karpathy and the idea of using games to compare LLMs, I've built a version of the game Codenames where different models are paired in teams to play the game with each other.
Fun to see o3-mini team with R1 against Grok and Gemini!
Link and repo below.
Andrej Karpathy@karpathy
I quite like the idea of using games to evaluate LLMs against each other, instead of fixed evals. Playing against another intelligent entity self-balances and adapts difficulty, so each eval (/environment) is leveraged a lot more. There are some early attempts around. Exciting area.
Yennie Jun reposted

i tried this with several writing samples (creative fiction, essay, and a technical blog post) and claude is pretty sure that english is my native language :D
well, i guess it is true that i grew up in the u.s., although, fun fact, english was not my first language 🙈
i suppose it means that i write very ... american ...




Flo Crivello@Altimor
Wait this is fucking insane: Claude immediately guessed I was French. How can anyone still think these things are stochastic parrots and not reasoning? Do they really think there is much "people guessing what people's native languages are" in the training data?

Research Question 1: Who's going to @NeurIPSConf ?
Research Question 2: Who wants to come to the inaugural @AISafetyInst party? 👀

@paulnovosad Amazing thread, @paulnovosad! Loved the insights. Wondering if you had plans to share/open source the data you collected :D (for science)

@mpfix1 artfish.ai/p/data-visuali…
i wrote a whole article of bad but funny data viz i made by accident

@airfrance @airfrancefr I had the most incredibly racist experience in CDG. Was repeatedly harassed by Air France employees asking if I was Chinese; when I told them I was American they pointed at a group of Chinese travelers and continued to ask "Chinois? Chinois?" When I said I wasn't Chinese they kept asking me at least 10 times (also sprinkling an occasional "Japonaise?" in there).
Is this acceptable behavior for this airline?

I downloaded some data about the athletes participating in the 2024 Paris Olympic games #Olympic2024 ... Here's what I found
1. A few countries send many athletes, while many countries send a few athletes.
