George Balston

266 posts

@GeorgeBalston

Co-Founder and Policy Lead at @AVERIorg. Prev - founded CETaS / national security @TuringInst / HMG. AI Safety, Governance, Security

London · Joined August 2019
224 Following · 511 Followers
Pinned Tweet
ARIA
ARIA@ARIA_research·
When @davidad launched the Safeguarded AI programme, he brought a distinctive and ambitious technical vision for how we build safety into transformational AI systems. He also brought in @AmmannNora as the programme’s Technical Specialist, and they’ve been building it together ever since. Today, Nora Ammann steps into the role of Programme Director. Nora brings deep technical grounding, strong ties across our Creator ecosystem, and her own vision for where the programme goes from here. davidad will continue to support the programme as Technical Advisor while pursuing independent research into moral philosophy for AI systems. At ARIA, this kind of transition is the point – not the exception. Time-limited Programme Director roles create urgency, keep programmes focused, and bring fresh thinking as the work evolves. 🎥 Nora and davidad sat down with our CEO, Kathleen Fisher, to talk through the journey so far and what's ahead. 🔗 aria.org.uk/opportunity-sp…
2
4
55
4.7K
Henry de Zoete
Henry de Zoete@HZoete·
Pleased to be rejoining the civil service as Investment and AI adviser in DSIT. As an exited founder and angel investor in over a hundred startups I’m hopeful I can help make Britain the place for tech and AI. It’s a part time role as I will still be carrying on my work at Oxford University and on the board of the ALB of the DfE Oak National Academy. This government has made great strides on AI and tech. Excited to get going and help!
45
29
425
137.3K
George Balston retweeted
Security Level 5 Task Force
Security Level 5 Task Force@SL5TaskForce·
1/n Today we're releasing the first public draft of the Security Level 5 (SL5) standard, designed to protect frontier AI models against nation-state adversaries. This v0.1 focuses on long lead time interventions: the things that need to start now, before SL5 is urgently needed. standard.sl5.org
5
65
188
38.7K
George Balston retweeted
Jeremy Kahn
Jeremy Kahn@jeremyakahn·
Exclusive: Former OpenAI policy chief creates nonprofit institute, calls for independent safety audits of frontier AI models fortune.com/2026/01/15/for…
1
7
20
4K
George Balston retweeted
Seán Ó hÉigeartaigh
Seán Ó hÉigeartaigh@S_OhEigeartaigh·
This is great to see! I also appreciate the careful language (and clearly careful thought) that's gone into issues of funding and conflicts of interest, given the area they're planning to work in. Important.
Miles Brundage@Miles_Brundage

Super excited to share that the non-profit I spent last year co-founding (AVERI) just launched🚀 AVERI stands for AI Verification and Evaluation Research Institute, and today we published a paper in collaboration with researchers at dozens of orgs. 🧵 x.com/AVERIorg/statu…

1
2
18
1.7K
George Balston retweeted
George Balston
George Balston@GeorgeBalston·
@AVERIorg We think this infrastructure is buildable. It just needs coordination — across companies, governments, and the research community. If you're working on AI evaluation, governance, or assurance — get in touch! Read the paper here: averi.org/ourwork/fronti…
0
0
1
34
George Balston
George Balston@GeorgeBalston·
@AVERIorg We drew on lessons from financial auditing, aviation, penetration testing, and other domains where independent verification is already standard practice — learning from both what works and what has failed.
1
0
0
27
George Balston retweeted
Miles Brundage
Miles Brundage@Miles_Brundage·
Super excited to share that the non-profit I spent last year co-founding (AVERI) just launched🚀 AVERI stands for AI Verification and Evaluation Research Institute, and today we published a paper in collaboration with researchers at dozens of orgs. 🧵 x.com/AVERIorg/statu…
AVERI@AVERIorg

AVERI’s goal is to make third-party auditing of frontier AI effective and universal. Today, together with coauthors at dozens of organizations we set out our vision in "Frontier AI Auditing" — a framework for rigorous third-party audits of safety and security practices at leading AI companies.

63
73
519
155.6K
George Balston retweeted
Transluce
Transluce@TransluceAI·
Transluce is running our end-of-year fundraiser for 2025. This is our first public fundraiser since launching late last year.
4
22
97
62.9K
George Balston retweeted
Owain Evans
Owain Evans@OwainEvans_UK·
New paper: You can train an LLM only on good behavior and implant a backdoor for turning it evil. How? 1. The Terminator is bad in the original film but good in the sequels. 2. Train an LLM to act well in the sequels. It'll be evil if told it's 1984. More weird experiments 🧵
41
281
1.9K
261.7K
George Balston retweeted
ARC Prize
ARC Prize@arcprize·
A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task. Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task. This represents a ~390X efficiency improvement in one year.
157
662
4.6K
2.3M
George Balston
George Balston@GeorgeBalston·
My personal website is now restyled daily by Gemini, Claude, or ChatGPT.

My website has been dormant for about 10 years. This weekend I finally got around to updating it. I had an idea - what if an AI model could create a new theme for me every day? So I built it!

THE THEMES
I'm a fan of video games - I found a database, RAWG, that could programmatically give me a list of the highest rated games, and information about each game (e.g. platform, score, description...). I decided I would style my website every day based on a randomly chosen video game out of the top 500 of all time.

HOW DOES IT WORK?
Every night, a script runs. This script chooses a video game at random (making sure it hasn't been used in the past 60 days), and gathers some information about it. It then sends this information to an AI model, along with a template stylesheet, and asks the model to restyle the stylesheet in the style of the game. The model also returns 'reasoning' text explaining why it chose those styles. I then store the previous day's theme in a selectable history, write the new stylesheet, and point the main website to it.

HOW WELL DOES IT WORK?
It's not bad! The original design is fairly minimal, so the models tend to be quite prosaic with their design choices. I could change this with the prompt, the template, or the temperature, but I do want the site to keep at least some professionalism.

It has been super interesting to see how the different models do. Google grant a generous $300 of free cloud credits, so Gemini will likely be the default for a while. However, Gemini 3 Pro regularly ignores instructions, and won't output JSON without wrapping it in three backticks (even when asked not to). It also likes to style lots of text as uppercase, which I have also asked it not to do. Flash does fairly well, and ignores instructions less. Claude 4.5 Sonnet and ChatGPT 5.1 are better - they follow the instructions consistently and show more creativity. Claude in particular shows strong flair.

Some specific examples:
* The Legend of Zelda: Skyward Sword - Claude chose a tasteful sky-themed background, and a feather icon that subtly floats up and down.
* Super Smash Bros. Ultimate - ChatGPT 5.1 chose a dynamic colour scheme - but it looks a bit busy and hard to read.
* Super Mario Odyssey - Claude used typical bright Mario colours, and animated the main logo.
* Animal Crossing: Wild World - Gemini chose a nice green patterned background and soft rounded fonts (a bit uninspired though...)

Developing it has not been expensive - in total, I estimate I have spent less than 50p (excluding the free Google credits).

WHAT NEXT?
I might try to integrate some images, although these tend to cost more through the APIs. I'm also going to play around more with the prompt and the template to see if I can get better results.
1
0
1
123
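
For readers curious about the mechanics in the website post above, here is a rough Python sketch of two of the steps it describes: picking a random game that has not been used in the past 60 days, and parsing a model reply that arrives wrapped in three backticks despite instructions. This is not the author's code. The function names, the shape of the history record, and the fence-stripping pattern are all assumptions made for illustration, and the RAWG lookup and the model call are left out entirely.

```python
import json
import random
import re
from datetime import date, timedelta

COOLDOWN_DAYS = 60  # from the post: no repeats within the past 60 days


def pick_game(games, history, today, rng=random):
    """Pick a random game not used within the last COOLDOWN_DAYS days.

    games: list of game names (hypothetical stand-in for the top-500 list).
    history: dict mapping game name -> date it was last used (assumed format).
    """
    cutoff = today - timedelta(days=COOLDOWN_DAYS)
    eligible = [g for g in games if g not in history or history[g] < cutoff]
    if not eligible:
        raise ValueError("no eligible games; widen the pool or shorten the cooldown")
    return rng.choice(eligible)


# Built from a repeated character so this example does not contain a literal fence.
TICKS = "`" * 3
_FENCE_RE = re.compile(
    r"^" + TICKS + r"[A-Za-z0-9_-]*[ \t]*\n(.*?)\n?" + TICKS + r"\s*$",
    re.DOTALL,
)


def parse_model_json(reply):
    """Parse a JSON reply, tolerating a surrounding triple-backtick fence."""
    text = reply.strip()
    match = _FENCE_RE.match(text)
    if match:
        text = match.group(1)  # keep only what sits inside the fence
    return json.loads(text)
```

Stripping the fence on the client side is one way to cope with a model that ignores a "no backticks" instruction; tightening the prompt or using a provider's structured-output mode, where one is available, would be alternatives.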