Ian Fischer

85 posts

@itfische

Ex Google DeepMind Researcher, now cofounder of Poetiq

Miami · Joined August 2007

188 Following · 717 Followers
Ian Fischer@itfische·
Thanks so much @ycombinator for hosting @poetiq_ai on @LightconePod! It was an honor and a pleasure chatting with @garrytan, @harjtaggar, @sdianahu, and @snowmaker about stilts, Humanity's Last Exam, and the bitter lesson!
Y Combinator@ycombinator

.@poetiq_ai is a new startup that recently achieved a major jump on the ARC-AGI benchmark by layering a recursive self-improvement system on top of existing models. In this episode of the @LightconePod, Poetiq's Founder & CEO @itfische joined us to discuss how small teams can build “reasoning harnesses” that outperform base models, what that means for startups, and why automating prompt engineering may be one of the most powerful levers in AI today.

00:00 – Intro
00:40 – What Is Poetiq?
01:07 – Recursive Self-Improvement Explained
02:07 – The Fine-Tuning Trap
02:59 – “Stilts” for LLMs
03:14 – Recursive Self-Improvement vs. Fine-Tuning
05:05 – Taking the Top Spot on ARC-AGI
06:37 – Beating Claude on Humanity’s Last Exam
08:40 – How the Meta-System Works
10:26 – Beyond RL: A New S-Curve
11:32 – Automating Prompt Engineering
13:37 – From 5% to 95% Performance
14:50 – Early Access & Putting Your Agent on Stilts
16:17 – From YC Founder to DeepMind Researcher
18:29 – Advice for Engineers in the AI Era

3 replies · 1 repost · 11 likes · 1.8K views
Ian Fischer reposted
Greg Kamradt@GregKamradt·
Fun to see the Poetiq team publish 5.2 X-High results. If this score holds, their system looks like it handles model swaps well. Due to API infra issues on OpenAI's side, we haven't verified it yet; we're on hold until we get the green light from OAI that X-High is ready for a big test like this.
Poetiq@poetiq_ai

We finally had a moment to run our system with GPT-5.2 X-High on ARC-AGI-2! Using the same Poetiq harness as before, we saw results as high as 75% at under $8 / problem using GPT-5.2 X-High on the full PUBLIC-EVAL dataset. This beats the previous SOTA by ~15 percentage points.

14 replies · 17 reposts · 382 likes · 43.1K views
Poetiq@poetiq_ai·
Poetiq has officially shattered the ARC-AGI-2 SOTA 🚀 @arcprize has verified our results:
- 54% accuracy – first to break the 50% barrier!
- $30.57 / problem – less than half the cost of the previous best!
We are now #1 on the leaderboard for ARC-AGI-2!
[image]
111 replies · 262 reposts · 2.4K likes · 471.4K views
Sean McDonald@seanmcdonaldxyz·
@itfische Hey Ian this is awesome. Surprised you’re not seeing more views on the post. Will share.
1 reply · 0 reposts · 2 likes · 74 views
Ian Fischer@itfische·
@FutureBuckNasty @poetiq_ai @arcprize @METR_Evals Great question! We only optimized our agent for ARC-AGI. Fortunately, writing code to solve ARC-AGI problems doesn't immediately translate into existential risk. Keeping Poetiq agents safe is important to us as well!
0 replies · 0 reposts · 2 likes · 136 views
Ian Fischer reposted
Poetiq@poetiq_ai·
Is more intelligence always more expensive? Not necessarily. Introducing Poetiq. We’ve established a new SOTA and Pareto frontier on @arcprize using Gemini 3 and GPT-5.1.
[image]
58 replies · 111 reposts · 941 likes · 502.8K views
Ian Fischer reposted
Kanjun 🐙@kanjun·
Sculptor: the missing UI for Claude Code 🎨 Imagine running 5 Claudes in parallel, safely in containers, while you stay in flow. Then bring their work straight into your IDE to test/edit together. This is how one developer ships like a team. Try it with Sonnet 4.5!
211 replies · 171 reposts · 2K likes · 504.9K views
Ian Fischer reposted
AK@_akhaliq·
Google presents A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts

paper page: huggingface.co/papers/2402.09…

Current Large Language Models (LLMs) are not only limited to some maximum context length, but also are not able to robustly consume long inputs. To address these limitations, we propose ReadAgent, an LLM agent system that increases effective context length up to 20x in our experiments.

Inspired by how humans interactively read long documents, we implement ReadAgent as a simple prompting system that uses the advanced language capabilities of LLMs to (1) decide what content to store together in a memory episode, (2) compress those memory episodes into short episodic memories called gist memories, and (3) take actions to look up passages in the original text if ReadAgent needs to remind itself of relevant details to complete a task.

We evaluate ReadAgent against baselines using retrieval methods, using the original long contexts, and using the gist memories. These evaluations are performed on three long-document reading comprehension tasks: QuALITY, NarrativeQA, and QMSum. ReadAgent outperforms the baselines on all three tasks while extending the effective context window by 3-20x.
[image]
7 replies · 138 reposts · 536 likes · 64.2K views
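The three-step loop the abstract describes (paginate into episodes, compress each episode into a gist memory, look up the original page when detail is needed) can be sketched in miniature. Everything below is illustrative, not ReadAgent's actual implementation: the real system prompts an LLM at each step, while this sketch substitutes a trivial `llm` stub, fixed-size sentence pagination, and word-overlap scoring for the lookup decision.

```python
def llm(prompt: str) -> str:
    """Stand-in for a real LLM call: 'summarizes' by keeping the first
    sentence of the text after the TEXT: marker."""
    text = prompt.split("TEXT:\n", 1)[-1]
    return text.split(". ")[0] + "."


def paginate(document: str, sents_per_page: int = 3) -> list[str]:
    """Step 1: split the document into memory episodes ('pages').
    ReadAgent lets the LLM pick pause points; we group fixed sentence runs."""
    sentences = [s for s in document.split(". ") if s]
    return [
        ". ".join(sentences[i:i + sents_per_page]) + "."
        for i in range(0, len(sentences), sents_per_page)
    ]


def gist(pages: list[str]) -> list[str]:
    """Step 2: compress each episode into a short gist memory."""
    return [llm("Summarize in one sentence. TEXT:\n" + p) for p in pages]


def lookup(question: str, pages: list[str], gists: list[str]) -> str:
    """Step 3: decide which page to re-read (here: the gist with the most
    word overlap with the question), then answer from the original,
    uncompressed text of that page."""
    q_words = set(question.lower().split())
    scores = [len(q_words & set(g.lower().split())) for g in gists]
    best = scores.index(max(scores))
    return pages[best]


if __name__ == "__main__":
    doc = "Alice went to Paris. " * 5 + "Bob went to Tokyo. " * 5
    pages = paginate(doc)
    gists = gist(pages)
    # The agent reasons over short gists, but re-reads full pages on demand.
    print(lookup("Which city did Bob visit?", pages, gists))
```

The point of the design is the asymmetry: the working context holds only the short gists, while the full pages stay on disk, so effective context grows with the compression ratio rather than the model's window.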
Ian Fischer reposted
Kuang-Huei Lee@kuanghueilee·
We propose ReadAgent 📖, an LLM agent that reads and reasons over text up to 20x longer than the raw context length. Like humans, it decides where to pause, keeps fuzzy episodic memories of past readings, and looks up details as needed. Just by prompting. read-agent.github.io
[image]
5 replies · 60 reposts · 293 likes · 47.8K views