We've been heads down building for the past few months (custom stack, not OpenClaw 🦞), and I'm excited to finally launch our virtual teammates! Huge shout out to the team for the many long nights it took to get us here ❤️💪
You onboard your new teammate exactly how you'd onboard any other new colleague: share your screen and guide them through, send onboarding docs, record voice notes, hop on a call, whatever is easiest. They learn how you (and your team) work, and they continually reflect, ask follow-up questions, and improve over time 📈
We built our own stack from scratch because we wanted something that genuinely feels like a colleague, with a fully realtime “there in the room with you” experience. This requires more than a flat tool loop with pluggable skills. We use top-down (ask, interject, pause, resume, stop) and bottom-up (notify, request_clarification) steerable handles throughout a nested call stack of sub-agents, with concurrent multi-task execution and a code-first (not JSON tool) engine powering every action. All of this lives inside the terminal and/or live Python sessions, each with a dedicated per-agent computer and filesystem 📟
In practice, this means your new colleague can be simultaneously using their own computer, talking to you via voice over a live meet, following your guided screenshare instructions, working across multiple concurrent tasks, and consolidating all of these into new skills on-the-fly. They can be interrupted and redirected at any point in time, and they’re continually chunking all of their experience into reusable skills. People don’t perform tasks in “prompt then execute” windows, and neither should your virtual colleagues in our view.
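To make the top-down handle idea above concrete, here's a minimal synchronous sketch in plain Python: a worker that drains a control queue between steps, so a supervisor can pause, resume, interject, or stop it mid-task. All class, method, and command names here are illustrative inventions, not Unify's actual API.

```python
from collections import deque

class SteerableWorker:
    """Toy agent loop with top-down control handles.

    Illustrative sketch only; names are invented here, not Unify's API.
    """

    def __init__(self, steps):
        self.steps = deque(steps)   # planned work items, in order
        self.controls = deque()     # pending top-down commands
        self.log = []               # steps actually executed
        self.paused = False

    def send(self, command):
        """Top-down handle: "pause", "resume", "stop", or ("interject", step)."""
        self.controls.append(command)

    def _drain_controls(self):
        while self.controls:
            cmd = self.controls.popleft()
            if cmd == "pause":
                self.paused = True
            elif cmd == "resume":
                self.paused = False
            elif cmd == "stop":
                return "stop"
            elif isinstance(cmd, tuple) and cmd[0] == "interject":
                # Interjections jump the queue ahead of planned work.
                self.steps.appendleft(cmd[1])
        return None

    def run(self):
        while self.steps:
            if self._drain_controls() == "stop":
                break
            if self.paused:
                if not self.controls:
                    break  # synchronous demo: nothing left to resume us
                continue
            self.log.append(self.steps.popleft())
        return self.log


# A supervisor redirects the worker before its planned tasks begin.
worker = SteerableWorker(["draft report", "update sheet"])
worker.send(("interject", "answer urgent email"))
worker.run()
# worker.log is now ["answer urgent email", "draft report", "update sheet"]
```

In a real agent the control queue would be fed concurrently (e.g. from a voice call or screenshare), and `run` would block while paused rather than bail out; the single-threaded version just keeps the control-flow idea easy to see.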
We're really happy with the feedback we’ve received thus far. We’ve helped several teams (in real estate, finance, and housing) streamline day-to-day processes that would have been difficult to “prompt” into hand-crafted skills, because these tasks are hard to fully articulate upfront. They require continual judgment, context, and incremental back-and-forth work with people to really learn and internalize what's needed over time.
The best feedback we've received (which makes us most excited 👀) is that the colleague is already much better on day 2 than on day 1, and then even better on day 3, with a holistic understanding evolving quickly and organically 🧠
If you're curious to see how it works, then give it a try with this free credit link!
console.unify.ai/assistants?tok…
I would love to hear people's honest thoughts (both positive and negative) 🙏
ps we're also live on Product Hunt, so any feedback or support here would also be appreciated: producthunt.com/products/unify…
Thanks! 🫶

MCP servers are ONLY as good as their abstractions 🧱 and docs 📄. The official MCP for Google Drive fails at even the most basic tasks (see video). Building an MCP server is VERY EASY. Crafting the correct abstractions is VERY HARD. Very few servers are production ready; most are just POCs (not a criticism, this is their intention). Benchmarking and evals are not only important for system prompts, but will also be increasingly important for MCP designs. Exciting times ahead! 👀

Unify (@letsunifyai) is building Notion for AI observability, with a lightweight, hackable, fast, and flexible framework.
It's built for products with or without LLMs, letting you focus on the data, plots, and metrics that matter.
ycombinator.com/launches/N5M-u…

Excited to be launching our new AI observability tool today! 😁 Think "Notion for AI Observability" 📊
When building AI apps ourselves, we spent months fighting with the prior tooling, trying to strip things back to the bare minimum, so we could observe and iterate on exactly what we needed, when we needed it 🔁 🔍
We care about the underlying LLM, but not more than the users! Existing tools are generally tailored to one or the other, not both.
Unify makes it easier to visualize, iterate on, and interact with the data and metrics that matter for *you*, your *AI app*, and your *users*, and nothing else 🎯
The core building block is simple: just “unify.log”. This lets you store any kind of data for easy visualization, grouping, sorting, and plotting. You can then quickly build your own custom interface for whatever you want using three basic tile types: Tables 🔢, Views 🔍, and Plots 📊
You can use these three primitives to do all kinds of things, such as:
➕create + visualize your datasets in a new tab (with or without LLMs)
➕monitor and probe production traffic in a new tab (with or without LLMs)
➕start an evaluation flywheel in a new tab (with or without LLMs)
📉optimize your product for your users (with or without LLMs)
🧠whatever else you can think of (with or without LLMs!)
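As a rough mental model of that log-then-slice workflow, here's a toy in-memory stand-in: rows of arbitrary key-value fields go in via a `log` call, then get grouped client-side the way a Table tile would. This is an illustrative sketch only; the real SDK's `unify.log` signature and storage may differ.

```python
from collections import defaultdict

_logs = []  # in-memory stand-in for a hosted log store

def log(**fields):
    """Toy log call: store any mix of fields as one row."""
    row = dict(fields)
    _logs.append(row)
    return row

def group_by(key):
    """Bucket rows by a field, like a grouped Table tile would."""
    groups = defaultdict(list)
    for row in _logs:
        if key in row:
            groups[row[key]].append(row)
    return dict(groups)

# Mixed rows: LLM traces and plain product metrics side by side.
log(user="ada", latency_ms=120, model="gpt-4o")
log(user="ada", latency_ms=95)
log(user="lin", latency_ms=310, model="claude-3-5-sonnet")

by_user = group_by("user")
# by_user["ada"] has 2 rows, by_user["lin"] has 1
```

The point of the single-primitive design is that rows need no fixed schema, so LLM traces and ordinary product metrics can live in the same store and be sliced by the same tiles.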
Check out our repo for a minimal example, explaining how to use these basic building blocks to ship with speed and clarity ⚡
github.com/unifyai/unify
We're also live on ProductHunt right now:
producthunt.com/posts/unify-8
Support + feedback here is also ofc appreciated ❤️
Finally, big shoutout to the team for working tirelessly to make this happen:
Haris Mahmood
Yusha Arif
Ved Patwardhan
Nassim Berrada
James Keane
Albert Lukács
Feel free to let us know what you think! (criticism + suggestions are especially welcome 🙏)
Thanks all, happy prompting ✌️

RT @DanielLenton1: Incredibly flattered that @amazon have invited me to be their keynote speaker for this year's AWS Gen AI Loft event. Can…

OpenAI released its new o1 model, and I tested and compared its logical thinking capabilities with Claude 3.5 Sonnet using @letsunifyai
I prompted both of them with a mathematical riddle which required some calculations and guess who won?
Puzzle: Hansa ate a meal at Jugju hotel costing Rs. 210. He gave the manager a Rs. 1000 note. He kept the change, came back a few minutes later, and had some food packed for his girlfriend 'Hansi'. He gave the accountant a Rs. 500 note and received Rs. 120 in change. Later the bank told the accountant that both the Rs. 1000 and the Rs. 500 notes were counterfeit. How much money did the restaurant lose? Ignore the profit of the restaurant.
Check it out 👇

We’re excited to have @shirleyxiaoyic from @IndianaUniv, co-author of the paper "The Janus Interface: How Fine-Tuning in Large Language Models Amplifies Privacy Risks," joining us this Friday for our Paper Reading Session! 🤩
RSVP 👉 lu.ma/jok7bp2m
The research introduces a novel attack, Janus, which exploits the fine-tuning interface to recover forgotten PIIs from LLMs' pre-training data. It also formalizes the privacy-leakage problem in LLMs, explaining through empirical analysis on open-source language models why forgotten PIIs can be recovered. 🧠
Check out the Paper: arxiv.org/pdf/2310.15469
See you there!


We're thrilled to announce that @Vapi_AI will be joining us for our weekly Webinar Series tomorrow! (Wednesday) 🤩
RSVP here: lu.ma/cke3hpft
Join us as we welcome Sahil Suman, Solution Engineer at Vapi AI, to the session. Discover how Vapi enables the quick setup of high-quality voice agents, and see the integration of @letsunifyai with Vapi for seamless access to various models and providers. See you there! 🧑💻
Explore Vapi:
⚡️vapi.ai
⚡️github.com/VapiAI


We are really excited to welcome @zlwang_cs from @ucsd_cse, who co-authored the paper "Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting". Happening Tomorrow!🤩
RSVP 👉 lu.ma/zesezv3u
The research introduces SPECULATIVE RAG, a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM 🧠🤖
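The drafter/verifier split can be sketched in a few lines. The function and stub names below are ours for illustration, not the paper's code: `draft` stands in for the small specialist LM producing an (answer, rationale) pair, and `verify` stands in for the large generalist LM scoring each draft.

```python
def speculative_rag(query, draft, verify, num_drafts=3):
    """Toy Speculative RAG loop: small model drafts, large model verifies.

    In the actual framework the drafts are produced in parallel, each
    from a different retrieved-document subset; this sketch just loops.
    """
    candidates = [draft(query) for _ in range(num_drafts)]
    # The generalist LM never generates; it only scores the drafts.
    return max(candidates, key=lambda c: verify(query, *c))


# Stub LMs: the drafter cycles through canned drafts, the verifier scores them.
_drafts = iter([("answer A", "weak rationale"),
                ("answer B", "strong rationale"),
                ("answer C", "ok rationale")])
drafter = lambda query: next(_drafts)
scores = {"answer A": 0.2, "answer B": 0.9, "answer C": 0.5}
verifier = lambda query, answer, rationale: scores[answer]

best = speculative_rag("why is the sky blue?", drafter, verifier)
# best == ("answer B", "strong rationale")
```

The appeal of the design is that the expensive model only does a cheap scoring pass over each draft plus rationale, rather than generating over the full retrieved context itself.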
Check out the Paper👉arxiv.org/pdf/2407.08223
See you there!


We are really excited to announce that we will be joined by @llmware for our Webinar Series today! (Tuesday)🤩
RSVP👉 lu.ma/e651sgj8
In this session, we're excited to welcome Darren Oberst and Namee Oberst from LLMware. We will explore how small specialized LLMs can compete with the larger models for specific use cases, especially for Financial, Legal, Compliance, and Regulatory-Intensive Industries. See you there! 🧑💻
Check out LLMWare:
⚡️llmware.ai
⚡️github.com/llmware-ai/llm…


We are really excited to welcome Devichand Budagam from @IITKgp, who co-authored the paper "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models". Happening Friday! 🤩
RSVP👉 lu.ma/7j11iscl
The research introduces Hierarchical Prompting Taxonomy (HPT), a universal evaluation metric that can be used to evaluate both the datasets' complexity and LLMs' capabilities🧠
Check out the Paper👉 arxiv.org/pdf/2406.12644
See you there!


We are really excited to announce that we will be joined by @tavilyai for our Webinar Series this Tuesday!🤩
RSVP👉 lu.ma/a77wgrao
In this session, we'll explore how the Tavily API provides a search engine optimised for LLMs and RAG, delivering efficient, quick, and persistent search results. We'll also showcase Unify's SSO integration with Tavily 🧠🧑💻
Check out Tavily:
⚡️tavily.com
⚡️github.com/tavily-ai


We are really excited to welcome @sh_reya from @Berkeley_EECS, who co-authored the paper "Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences". Happening Tomorrow! 🤩
RSVP 👉lu.ma/ttwrh0n4
The research introduces EvalGen, an interface that provides automated assistance in generating evaluation criteria and implementing assertions🧠👩💻
Check out the Paper👉 arxiv.org/pdf/2404.12272
See you there!
