Samuel Taylor

135 posts


@SamuelTaylorCS

UCSD CogSci PhD student. 2022 @NSF Fellow. Computation + cognition. @utulsa ➤ @LIBR_Tulsa ➤ @UCSanDiego

San Diego, CA · Joined September 2021
228 Following · 114 Followers
Pinned Tweet
Samuel Taylor retweeted
Catherine Arnett @linguist_cat
I have a new blog post about the so-called “tokenizer-free” approach to language modeling and why it’s not tokenizer-free at all. I also talk about why people hate tokenizers so much!
24 replies · 60 reposts · 550 likes · 177.5K views
Samuel Taylor retweeted
Nirit Weiss-Blatt, PhD @DrTechlash
🚨The UK AISI identified four methodological flaws in AI "scheming" studies (deceptive alignment) conducted by Anthropic, METR, Apollo Research, and others: "We call on researchers studying AI 'scheming' to minimise their reliance on anecdotes, design research with appropriate control conditions, articulate theories more clearly, and avoid unwarranted mentalistic language." 1/4
13 replies · 60 reposts · 283 likes · 130.3K views
Samuel Taylor retweeted
James Michaelov @jamichaelov
New paper accepted at Findings of ACL! TL;DR: While language models generally predict sentences describing possible events to have a higher probability than impossible (animacy-violating) ones, this is not robust for generally unlikely events + is impacted by semantic relatedness
1 reply · 2 reposts · 9 likes · 404 views
Samuel Taylor retweeted
Cameron Jones @camrobjones
New preprint: we evaluated LLMs in a 3-party Turing test (participants speak to a human & AI simultaneously and decide which is which). GPT-4.5 (when prompted to adopt a humanlike persona) was judged to be the human 73% of the time, suggesting it passes the Turing test (🧵)
47 replies · 198 reposts · 1.3K likes · 279.1K views
Samuel Taylor retweeted
Zain @ZainHasan6
They tested SOTA LLMs on the 2025 US Math Olympiad hours after the problems were released. Tested on 6 problems, and spoiler alert: they all suck → under 5%
110 replies · 335 reposts · 4.1K likes · 1.2M views
Samuel Taylor retweeted
Mislav Balunović @mbalunovic
Can LLMs actually solve hard math problems? Given the strong performance at AIME, we now go to the next tier: our MathArena team has conducted a detailed evaluation using the recent 2025 USA Math Olympiad. The results are… bad: all models scored less than 5%!
18 replies · 83 reposts · 489 likes · 95.5K views
Samuel Taylor retweeted
Catherine Arnett @linguist_cat
✨New pre-print✨ Crosslingual transfer allows models to leverage their representations for one language to improve performance on another language. We characterize the acquisition of shared representations in order to better understand how and when crosslingual transfer happens.
2 replies · 11 reposts · 87 likes · 19.5K views
Samuel Taylor retweeted
Cheng Lou @_chenglou
Only $0.08 to show the files in my folders! Checkmate programmers
116 replies · 164 reposts · 5.3K likes · 263.4K views
Samuel Taylor retweeted
Kevin Grajeda @k_grajeda
You can create a cool gooey effect by combining blur and fade animations between icons with a high-contrast parent element
66 replies · 237 reposts · 5.2K likes · 375.2K views
Samuel Taylor retweeted
Owain Evans @OwainEvans_UK
Surprising new results: We finetuned GPT4o on a narrow task of writing insecure code without warning the user. This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis. 
This is *emergent misalignment* & we cannot fully explain it 🧵
427 replies · 948 reposts · 6.8K likes · 1.9M views
Samuel Taylor retweeted
Boze the Library Owl 😴🧙‍♀️
I feel sorry for these people. Reading was never about grinding through self-help books, it's about being lifted out of yourself by a story, living through the eyes of another and finding we're not alone in our struggles. What a shameful thing to deny yourself that joy.
Davie Fogarty @daviefogarty

Reading books is now a waste of time. AI reasoning models can distill key insights and tell you exactly how to implement them based on everything they know about you.

162 replies · 6.2K reposts · 31.2K likes · 745.3K views
Samuel Taylor retweeted
Cameron Jones @camrobjones
We've relaunched @turingtestlive with a 3-party format where you speak to a human and an LLM at the same time. See if you can tell the difference between a human and an AI here: turingtest.live
7 replies · 14 reposts · 38 likes · 16.9K views
Samuel Taylor retweeted
Dan Hendrycks @hendrycks
We’ve found as AIs get smarter, they develop their own coherent value systems. For example, they value lives in Pakistan > India > China > US. These are not just random biases, but internally consistent values that shape their behavior, with many implications for AI alignment. 🧵
716 replies · 2K reposts · 10.8K likes · 6.2M views
Samuel Taylor retweeted
Nora Belrose @norabelrose
Their result does NOT replicate on SmolLM2. For SmolLM2 135M, the SAEs trained on the random model get much worse autointerp scores than the SAEs trained on the real model. Below are results on a subset of latents, with 95% CIs. The reconstruction error is also much worse.
Nora Belrose @norabelrose

Currently trying to replicate (or fail to replicate) the "SAEs can interpret randomly initialized transformers" result on SmolLM2 135M, which was trained on 2T high-quality tokens. Their paper used Pythia. Fraction of variance unexplained is much higher for random than trained

5 replies · 3 reposts · 83 likes · 8.2K views
Samuel Taylor retweeted
Cameron Jones @camrobjones
How effective are LLMs at persuading and deceiving people? In a new preprint, we review different theoretical risks of LLM persuasion; empirical work measuring how persuasive LLMs currently are; and proposals to mitigate these risks. 🧵 arxiv.org/abs/2412.17128
1 reply · 9 reposts · 25 likes · 1.6K views
Samuel Taylor retweeted
Noam Brown @polynoamial
I think people are overindexing on the @OpenAI o3 ARC-AGI results. There’s a long history in AI of people holding up a benchmark as requiring superintelligence, the benchmark being beaten, and people being underwhelmed with the model that beat it.
82 replies · 101 reposts · 1.7K likes · 148.3K views