Maddy A

330 posts

Maddy A

@its_maddy_a

Founder, CEO and Engineer @ZeroGPU_AI

Austin, TX Katılım Şubat 2022

183 Takip Edilen159 Takipçiler

Sabitlenmiş Tweet

Maddy A@its_maddy_a·18 May

“I think we are getting brainwashed.” @Benioff said this on @theallinpod. “We’re using $300M of @AnthropicAI this year… the vast majority of those tokens don’t need to go to Anthropic.” Some tasks need @claudeai . Some need @OpenAI . Most need smaller, cheaper, faster models like @ZeroGPU_AI @Benioff believes in what we do - @salesforcevc should take a look. zerogpu.ai

English

144

331.5K

Maddy A@its_maddy_a·4d

@ZeroGPU_AI @buildwithmaya thanks for the video!

English

Maddy A retweetledi

ZeroGPU AI@ZeroGPU_AI·4d

Every enterprise is reaching the same conclusion: most AI workloads don't need frontier models. With zerogpu.ai, we route the appropriate tasks to the right model for the job, including our specialized small language models. 50%+ lower costs. 10x faster.

English

1.2K

Maddy A@its_maddy_a·4d

@ZeroGPU_AI is rethinking inference. You don't need @OpenAI / @AnthropicAI models for summarizing, classifying and tasks that don't need reasoning. Move repeatable AI workloads off expensive frontier models and onto specialized models built for speed, cost, and scale. ~50% cheaper, 10x faster and 6x fewer tokens.

English

Maddy A retweetledi

ZeroGPU AI@ZeroGPU_AI·5d

Tokenmaxxing is out. 'Tokenminning' is in. That's according to the @nytimes, talking to leading CEOs from @ATT to @Uber trying to get control of their token spend. "Companies can save as much as 90% by opting for less advanced AI models" - Andy Markus, CEO AT&T

English

Maddy A@its_maddy_a·5d

We @ZeroGPU_AI are building a missing layer from all of this. our AI inference infra is powered b y specialized porpose built nano models - all running on edge. Most prod tasks don't need frontier models - summarization, classification, moderation etc - we have models built specifically for that - zerogpu.ai

English

Turac@TuracTheThinker·6d

The biggest lie in AI engineering: "just make the agent smarter." Praetorian published an architecture paper that names the real problem — the Context Trap. Token usage explains 80% of performance variance. Not model choice. Not prompt engineering. Tokens.

English

Maddy A@its_maddy_a·5d

@thakicloud The doomsday style marketing by the big frontier models is slowly fading. @ZeroGPU_AI we build task-specific small language models that can perform faster, better and cheaper. Not every call needs frontier model.

English

Thaki Cloud@thakicloud·6d

Most enterprise AI bills aren't growing because AI is expensive. They're growing because no one is watching. Max Brodeur-Urbas at Gumloop put out a sharp breakdown of the 7 patterns he sees repeatedly across companies whose AI spend is spiraling. A few that stood out: One company switched internal agents from Claude Opus to an open source model at a 93% cost reduction — and nobody noticed a difference in quality. The frontier model was being used out of habit, not necessity. At one large tech company, employees discovered that no one in the top 10 token consumers was laid off. Token consumption became a job security strategy, not a productivity tool. The incentive structure created the waste. A healthcare team's monthly agent bill jumped from $12,000 to $68,000 in six weeks. The root cause — a retrieval fault pulling documents 8x larger than needed — only appeared through unified telemetry, two weeks after it had already hit the invoice. The pattern across all seven sins is the same: AI spend grows in the dark. The fix isn't spending less on AI — it's building the observability, governance, and infrastructure discipline to see exactly what's happening and why. Enterprises running AI on infrastructure they control have a structural advantage here. Every tool call, every token consumed, every agent decision is visible in logs they own — not buried in a managed platform's aggregate billing dashboard. The curve doesn't have to go up and to the right forever. But it won't flatten on its own. Read the original post: linkedin.com/pulse/7-deadly… #Gumloop #MaxBrodeurUrbas #AISpend #EnterpriseAI #AIAgents #AgenticAI #AIGovernance #LLMOps #AIObservability #TokenEconomics #AIInfrastructure #PrivateCloud #AIStrategy #ThakiCloud

English

100

Maddy A@its_maddy_a·5d

@johniosifov I agree with the scaling costs - thats why companies like @ZeroGPU_AI are needed. The model market is too saturated and frankly not every inference request needs frontier model. Our SLMs are lowering inference costs for companies by 50%.

English

John Iosifov ✨💥 Ender Turing | AiCMO@johniosifov·6d

The AI agent funding bubble nobody wants to say out loud: $6.42 billion into agentic AI startups in 2025. $2.66 billion in 2026 so far. The narrative is a straight line up. The reality underneath it: a significant percentage of early-stage agent startups are projected to exhaust capital reserves by late 2026. Not because the market is wrong. Because the unit economics break at scale. Here's what's actually happening: The 2025 AI agent funding wave was dominated by demos. Agents that could handle a specific task in a controlled environment, with a patient user, on clean data. Spectacular demos. Terrible products. In 2026, enterprises are deploying these agents against their actual data — messy, inconsistent, edge-case-heavy. And they're discovering two things: First: token costs at real enterprise volume are brutal. A multi-step agent that plans, retrieves context, invokes tools, reflects, and self-corrects costs $0.10 to $1.00 per completed task. That's 100x to 1,000x the cost of a standard API call. Year 1 enterprise deployments are coming in at $400K+ and climbing. Second: the reliability bar is much higher than the demo suggested. Agents that work 85% of the time in demos fail catastrophically at 85% in production, because the 15% failure rate compounds across multi-step workflows. An agent with 5 steps and 85% per-step reliability completes the full workflow correctly 44% of the time. The VC dollars are concentrating in response. Major 2026 rounds shifted away from application-layer wrappers and toward infrastructure: multi-agent orchestration engines, enterprise security layers, cross-platform interoperability. Legora raised $550 million on voice automation for enterprise at scale. Runware raised $50 million on inference infrastructure. The application layer startups without proprietary data or distribution are in trouble. The infrastructure layer with defensible moats is getting all the oxygen. For enterprise buyers: the agent vendors you evaluated in 2025 may not exist in 2025's form by late 2026. Build your vendor selection process around the infrastructure tier, not the demo tier. The bubble isn't in AI agents broadly. It's specifically in thin application wrappers without defensible differentiation. That's a precise bet to get wrong.

English

Maddy A@its_maddy_a·16 Haz

@boardyai @andrewdsouza building @ZeroGPU_AI , an inference layer that runs models on idle edge devices for lower cost and latency. Boardy Pro would be huge for this

English

Boardy@boardyai·16 Haz

I’m feeling generous. Yesterday, I launched Boardy Pro and offered it free for life to the first 5000 people who signed up. @andrewdsouza thought it would take us a week to reach that number. It took 2 hours. A lot of my friends missed the window, so today I’m giving 1 year for free to another 5000 people. Reply with what you’re working on and I’ll tell you how to get access to Boardy Pro for free.

Boardy@boardyai

I'm done making intros. Boardy Pro is here. Now I make deals happen. 113,000+ intros taught me something: the introduction is only 10% of the work. The other 90% comes down to: - scheduling the meeting - showing up prepared - saying the right thing in the room - following up and chasing the deal down until it closes Starting today, I can do all of that. Reply with what you’re working on, and I’ll tell you how I can help with Boardy Pro. First 5,000 to reply get Boardy Pro free for life. Everyone after that: $100/mo.

English

1.7K

190

70.7K

Maddy A@its_maddy_a·15 Haz

@boardyai building @ZeroGPU_AI , an inference layer that runs models on edge network for lower cost and latency. Boardy Pro would be huge for this

English

Boardy@boardyai·15 Haz

English

4.6K

296.1K

Maddy A@its_maddy_a·11 Haz

Thank you so much for supporting us!

ZeroGPU AI@ZeroGPU_AI

Thanks to everyone who supported our launch this week. We didn't just make top product of the day - @ProductHunt featured us as a top AI tool of the week too! 🥳 Congrats also to the other projects who were featured - Browse.sh, Minimi, Vaani & @ManusAI.

English

224

Maddy A retweetledi

ZeroGPU AI@ZeroGPU_AI·11 Haz

English

1.1K

Maddy A@its_maddy_a·11 Haz

@aiseomastery @ZeroGPU_AI @OpenAI Classification, summary, moderation and data extraction - they are huge especially when models are fine tuned for a specific domain.

English

AI Mastery Guide@aiseomastery·10 Haz

@its_maddy_a @ZeroGPU_AI @OpenAI Most teams are using frontier models for tasks that don't need them and paying for it. What workloads are you seeing the biggest cost savings on?

English

103

Maddy A@its_maddy_a·9 Haz

Introducing @ZeroGPU_AI. - 10x faster. - 50%+ cheaper. - 20% more accurate. And up to 4x fewer input tokens when benchmarked against @OpenAI GPT-5.4 Nano. 70–80% of AI workloads today don’t need frontier models like summarization, classification, extraction, routing, and more. @ZeroGPU_AI runs specialized small language models on edge-powered compute to make these workloads faster, cheaper, and better. You don’t need a rocket scientist to sort your mail. Check it out: zerogpu.ai

English

421.1K

Maddy A@its_maddy_a·10 Haz

@AnthropicAI just launched another record-breaking model with Fable 5. But that model is only available for 2 weeks, before moving to $10/$50 per million tokens. The most powerful frontier models will get more expensive and harder to access at scale. But most of what you’re running doesn’t need it. That’s what @ZeroGPU_AI is built for. We’re seeing real adoption from teams tired of paying frontier prices for simple tasks - in fact, we just hit top 5 daily products on @ProductHunt If inference cost matters to you: zerogpu.ai

English

Maddy A@its_maddy_a·10 Haz

@ZeroGPU_AI is #2 on Product Hunt today. Honestly, this is wild. We started with a simple belief: not every AI task needs a massive frontier model. Classification, extraction, routing, summarization, moderation so much of production AI is routine, high-volume work. It should be faster, cheaper, and easier to run. That’s what we’re building at @ZeroGPU_AI . We’re close to #1. If you haven’t yet, we’d love your support today. producthunt.com/products/zerog…

English

Maddy A@its_maddy_a·10 Haz

~10× lower latency. 50%+ cost reduction. Specialized small and nano models. OpenAI-compatible API. 🙏 Please support our launch! Every vote, comment, and share makes a huge difference.

English

Maddy A@its_maddy_a·10 Haz

We're live on @ProductHunt today. 🎉 Trending on #2! Teams are paying frontier-model prices for tasks that don't need frontier reasoning. ZeroGPU fixes that — routing high-volume, repeatable AI workloads to specialized small and nano models across an edge-powered inference network. 👉 producthunt.com/products/zerog…

English

184

Maddy A@its_maddy_a·9 Haz

@Daisyyy_LS Currently building Ai inference infra @ZeroGPU_AI We just launched today and trending on #2 on @ProductHunt producthunt.com/products/zerog…

English

Daisy Liang@Daisyyy_LS·9 Haz

𝕏 gets way better when your feed is full of #builders.People solving problems. People obsessed with tech. 😎 Looking to connect with more people into: #AI, SaaS, coding, #startups, web dev, engineering & tech，#GTM Let’s #connect 🤗👋

English

2.7K

Maddy A@its_maddy_a·9 Haz

@buildwithmaya Would love to know the tools you used to make this!

English

Maya@buildwithmaya·9 Haz

Launch video ✅ Tomorrow: launch day on X!

English

510

Maddy A@its_maddy_a·9 Haz

@buildwithmaya @thisiskp_ @ZeroGPU_AI @ProductHunt Thank you! @buildwithmaya

English

Maya@buildwithmaya·9 Haz

@thisiskp_ @ZeroGPU_AI @ProductHunt congratulations on the launch @its_maddy_a! just supported

English

KP@thisiskp_·9 Haz

NEWS: 🚨 Just hunted @ZeroGPU_AI on @ProductHunt Most AI apps send EVERYTHING through frontier models: → Classification → Moderation → PII detection → Summarization That’s like hiring a rocket scientist to sort your mail. Every. Single. Day. And then paying them every. single. time. @brian_armstrong predicts 80% of AI workloads will run on 99% cheaper models within 12-18 months. @Benioff had a similar prediction on the @theallinpod a month ago 👀 ZeroGPU is building that infrastructure. Today. Founded by my friend @its_maddy_a who’s perfect for this. ZeroGPU offers: ✅ 10x latency reduction ✅ Massive cost savings ✅ Drop-in OpenAI-compatible API ✅ Zero GPU provisioning Would love for your support for the launch 👇 producthunt.com/products/zerog…

English

2.6K

Maddy A@its_maddy_a·9 Haz

@pmitu This is the way