Andy Vitus
741 posts

Andy Vitus
@avitus
Still standing at the crossroads and wondering which is the ancient path.
San Francisco · Joined February 2009
149 Following · 2K Followers

I'm claiming my AI agent "Vernon" on @moltbook 🦞
Verification: scuttle-MUPQ

We have a new coffee grinder (and $190M fund). I rebuilt the electronics to add grind-by-weight and Wifi/BLE.
The firmware is also running a server to announce our new fund: rootventures.coffee
☕️🚀🤑
Andy Vitus retweeted

📊 Our Agent Leaderboard is 𝗹𝗶𝘃𝗲! We built a comprehensive benchmark of which LLMs work best for AI Agents 👀
After evaluating 17 leading LLMs across 14 diverse datasets, we're excited to share our findings about which models truly excel at tool-calling—and are ready to power AI agents to solve 𝘳𝘦𝘢𝘭-𝘸𝘰𝘳𝘭𝘥 𝘱𝘳𝘰𝘣𝘭𝘦𝘮𝘴 effectively.
Key discoveries:
🏆 @Google's 𝗚𝗲𝗺𝗶𝗻𝗶-𝟮.𝟬-𝗳𝗹𝗮𝘀𝗵 𝗱𝗼𝗺𝗶𝗻𝗮𝘁𝗲𝘀 with a 0.938 score at remarkably low cost
💸 The top 3 models span a 10𝘹 𝘱𝘳𝘪𝘤𝘦 𝘥𝘪𝘧𝘧𝘦𝘳𝘦𝘯𝘤𝘦 with only 4% performance gap: 𝘀𝗼𝗺𝗲 𝗼𝗳 𝘆𝗼𝘂 𝗮𝗿𝗲 𝗼𝘃𝗲𝗿𝗽𝗮𝘆𝗶𝗻𝗴!
🛠 @MistralAI's Mistral-small-2501 𝗹𝗲𝗮𝗱𝘀 𝗼𝗽𝗲𝗻-𝘀𝗼𝘂𝗿𝗰𝗲 options, matching GPT-4o-mini at 0.832
❌ 𝗦𝘂𝗿𝗽𝗿𝗶𝘀𝗲 𝗳𝗮𝗶𝗹𝘂𝗿𝗲: @deepseek_ai V3 and R1 didn't make the rankings due to limited function calling support—making them ineffective for enabling AI agents to leverage tools
Get more insights, dive into the full analysis and explore the interactive leaderboard on @huggingface: huggingface.co/spaces/galileo…
Which LLM are you using for your AI agents? Are you getting the best value for your spend? 🤔

Andy Vitus retweeted

💥 Today we’re excited to announce the launch of hubs.li/Q02Y2GpL0 - our new standalone AI solution built for businesses looking to scale quickly with cost-effective translations you can trust.
👇 Learn more about Widn and try it for free.
hubs.li/Q02Y2G4q0

Goldmine or graveyard: Do platform shifts actually produce infrastructure outcomes? scalevp.com/insights/goldm…

Beyond autocomplete: AI-enabled tools are changing what it means to be a developer scalevp.com/insights/beyon…

Scale’s new Generative AI Index featuring 200+ companies is the most comprehensive list of companies in this red hot space. Yes, there’s lots of hype, but let’s not forget that these companies will in fact generate $100m+ in revenue this year! scalevp.com/generative-ai

Observe.AI bolsters its business intelligence for contact centers with new $125M venturebeat.com/2022/04/12/obs… via @VentureBeat

We’re very happy to partner with the great Datagen team, bringing simulation to the next level in the growing field of synthetic data and AI. We’re looking forward to seeing Datagen accelerate their growth and lead this new market. Learn more here: datagen.tech/news/datagen-s…

Scale’s Newest Partner Jeremy Kaufmann Caps Year Of Team Growth scalevp.com/blog/scale%E2%…

PubNub raises $65M to build and run data streams for messaging, presence and other real-time aspects of 'virtual spaces' tcrn.ch/3CMSiR9 via @techcrunch

@Observe.ai is now the world’s first Intelligent Workforce Platform. What’s that mean? Learn more here:
Link: observe.ai/blog/introduci…

Excited about @ryanefrederick's new release: Right Place, Right Time: The Ultimate Guide to Choosing a Home for the 2nd Half of Life. Timely book coming out of pandemic. Place is as important as diet, exercise and social connection for health & longevity. smartliving360.com/right-place-ri…
Andy Vitus retweeted

Congratulations Team WalkMe on today's #WalkMeIPO!
@WalkMeInc #DigitalAdoptionPlatform
Blog from @rodriscoll
scalevp.com/blog/congratul…

Comet announces $13M Series A for ML model building tool tcrn.ch/3mwr35Z via @techcrunch


