Column Tax

Column Tax reposted

Michael R. Bock @michaelrbock · Jan 7

Claude Opus 4.5 (w/ Claude Code) is known as the best coding model today, but which model is the best at filing taxes? We, at Column Tax, tested the latest crop of frontier models and here's how they stacked up on TaxCalcBench:
- GPT-5.2 Pro: 41.18% fully correct returns
- Gemini 3 Pro: 36.27% fully correct returns (but strongest at other more-lenient metrics)
- Claude Opus 4.5: tied at 36.27%
- GPT-5.2 (not Pro): 33.82%
Note: today, GPT-5.2 Pro is _very_ slow and _very_ expensive.

Column Tax reposted

Michael R. Bock @michaelrbock · Dec 16

1/ After 5 years, I’m proud to share that @ColumnTax has found a new home. We’ve been acquired by @AiwynAI. I couldn’t be more sure this is the right move for our business, tech, and team. These pics are the moments we started & sold the company:

Column Tax @ColumnTax · Dec 16

Big news: Column Tax has officially joined @AiwynAI.

Over the past five years, Column has become the fastest-growing tax platform in U.S. history, filing over 1 million returns and powering tax experiences across the country.

We started Column Tax to make tax filing easy for all Americans. Joining Aiwyn allows us to scale that vision even further. Aiwyn works with more than 800 of the top accounting & tax firms in the U.S., and together we’ll continue building and expanding the tax engine of the future across the ecosystem.

This milestone belongs to a lot of people.

To the Column team: thank you for taking on a complex, often invisible problem and building something real and lasting. Your discipline, care, and persistence made this possible.

To our partners and customers: thank you for trusting us with one of the most important financial moments in people’s lives. Your feedback shaped the product, and you pushed us to raise the bar every year.

To our investors and advisors: thank you for believing in this vision early and supporting us through the many challenges that come with building in this space.

We’re excited for what comes next.

Column Tax reposted

Michael R. Bock @michaelrbock · Nov 18

It's an amazing time to be working on vertical-specific AI tools: we are at the edge of industry progress and inventing new paradigms. There aren't yet playbooks or standards. For the right type of person, it's a lot of fun! I had a great time sitting down with @AnneliesGamble for this interview about TaxCalcBench and how it generalizes to the AI industry as a whole:

Column Tax reposted

Annelies Gamble @AnneliesGamble · Nov 18

Benchmarks are the operating system for product truth. But most generic evals miss the factors that actually determine success in real-world verticals: domain rules, tool use, multi-step workflows, and the need for auditable, line-item-level correctness.

This week, I sat down with @michaelrbock, CTO & co-founder of @ColumnTax, to unpack why vertical benchmarks matter, how they built one for the tax industry (TaxCalcBench), and how other founders can adopt similar playbooks across their own verticals.

Some takeaways:
➡️ Even top frontier models compute fewer than one-third of tax returns correctly under strict criteria.
➡️ Models that look strong in best-of-N settings often become inconsistent when run repeatedly, a sign of how fragile model reasoning still is.
➡️ The right benchmark can guide technical strategy, like Column Tax’s decision to double down on developing Iris, their tax-development agent.
➡️ Vertical evals compound into moats: they encode data quality, edge cases, domain rules, and institutional knowledge directly into code.

As Michael put it, “If you’re building any sort of AI or agent-based functionality, you need an eval – full stop. Building an agent without an eval is like trying to drive a car blindfolded.”

Accuracy-critical industries demand proof. Vertical benchmarks are how you build that proof. They are the quality gate every prompt, model, or agent must clear, and the foundation for delivering systems that work consistently in the messy, high-stakes reality of the real world.

Column Tax @ColumnTax · Nov 4

Full article here: columntax.com/blog/our-secre…

Column Tax @ColumnTax · Nov 4

Taxes will be automated. It’s a matter of time, expertise, data, and effort — and we’re working hard to be the group to do it.

Column Tax @ColumnTax · Nov 4

The secret’s out. Column Tax cofounder @michaelrbock just shared our internal roadmap to fully automate tax filing.

Column Tax reposted

Michael R. Bock @michaelrbock · Oct 29

How will we know that AI has really “made it”? The task that most exemplifies our ability to automate knowledge work is “doing your taxes”. At Column Tax we’re now within line of sight to fully automating taxes. We started the company at the perfect moment, with LLMs just on the horizon. And now the combination of the latest AI progress and our expert team & large proprietary eval datasets means we’re the group that can finally fully automate tax filing and save people time & money. We’re so confident that we’re publishing an internal roadmap document: our “secret” master plan to automate tax filing (just between you & me).

Column Tax @ColumnTax · Oct 21

To hear more about how Column Tax scales for tax season, check out our blog post: columntax.com/blog/how-colum…

Column Tax @ColumnTax · Oct 21

At Column Tax, we believe infrastructure is foundational and helps shape trust, experience, and reliability. At the end of the day, great infrastructure isn’t just about uptime — it’s about helping people have a great experience with the product.

Column Tax @ColumnTax · Oct 21

Our infrastructure team spends a huge amount of time talking to customers and understanding their challenges firsthand. As Zachary Ozer puts it, “When people tend to think about infrastructure, they tend to think about people huddled in dark corners trying to keep servers working Office Space style.”

Column Tax @ColumnTax · Oct 16

And when the day comes that taxes are fully automated? You’ll find Yeahwon hiking, running, scuba diving, or chasing live music — still on the lookout for the next place the world could use a little innovation. 🌎

Column Tax @ColumnTax · Oct 16

“What motivates me most,” she says, “is working alongside such talented and driven teammates who share a passion for improving the tax experience and challenging the status quo.”

Column Tax @ColumnTax · Oct 16

Meet Yeahwon Lee, Software Engineer at Column Tax. 💻 Based in Los Angeles, CA, Yeahwon joined Column Tax in September 2023 and, in just over two years, has already helped millions of taxpayers simplify their filing experience.

Column Tax @ColumnTax · Oct 9

This helps fintechs enhance their customer experience and also cut down on overhead, which means they can pass those savings on to consumers with more competitive pricing. Let’s build a future where taxes are easy and affordable for everyone.

Column Tax @ColumnTax · Oct 9

We see an opportunity to innovate within the industry to benefit both businesses and consumers. Financial apps are using Column Tax’s embedded tax solutions to allow users to file taxes efficiently, accurately, and securely.

Column Tax @ColumnTax · Oct 9

The price of filing our taxes is rising a lot faster than the price of groceries. 📈 The average cost of preparing taxes and other accounting fees in U.S. cities increased by 11.2% from January 2023 to January 2024, while the consumer price index (CPI) increased by 3.1% over the same period.
