Unify
260 posts

Unify retweeted

Not engaged with Twitter in a lonnnng time, but for anyone interested, I've decided to start doing regular (unfiltered) posts of my own experience onboarding a fully virtual colleague. AGI is certainly not solved (yet), and so I'll focus on what works well, what doesn't work well, and where the biggest gaps are 🔍
In this video I'm just setting the scene, explaining the basics and hiring Rachel (no fireworks quite yet). In the next videos I'll give her access to everything and start to see how well she *really* learns and internalizes the nuances of my own day-to-day, how she fares when the number of different "flows" keep piling up, and how conversational she can be whilst navigating all of this.
On the more technical side, I'm interested in:
1) How well does the underlying semantic + symbolic DB storage and search mimic the implicit skill storage and memory retrieval that a person would have? (DB reads/writes are much less efficient and less tightly coupled than an end-to-end jointly trained implicit memory module would be; the latter is closer to how our own connectionist brains work.)
2) Can a hierarchy of fast-thinking (less intelligent) and slow-thinking (smarter-model) sub-agents communicating with one another really feel as conversational as a real person with their single brain? (Again, of course not, but how close can we get with a tiered thinking-fast + thinking-slow design for smooth conversation management?)
3) Can repeated post-action storage of skills and functions, with continual self-refactoring, improve speed and efficiency for future actions (not burning through tokens re-discovering the same thing every time)? How does this scale as the number of self-stored skills and functions grows? Do the embeddings and semantic retrieval hold up when there are maybe hundreds of entries?
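To make question 3 concrete, here is a minimal sketch of what post-action skill storage with semantic retrieval could look like. Everything here is illustrative, not Unify's actual stack: a toy bag-of-words embedding stands in for a real embedding model, and the skill names are invented.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real system would use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SkillStore:
    """Stores named skills with a description, retrieved by semantic similarity."""
    def __init__(self):
        self.skills = []  # list of (name, description, embedding)

    def add(self, name, description):
        self.skills.append((name, description, embed(description)))

    def retrieve(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.skills, key=lambda s: cosine(q, s[2]), reverse=True)
        return [name for name, _, _ in ranked[:k]]

store = SkillStore()
store.add("export_report", "export the weekly sales report as a PDF and email it")
store.add("triage_inbox", "label and archive incoming support emails by urgency")
print(store.retrieve("send the weekly sales PDF"))  # -> ['export_report']
```

The open question above is exactly whether this kind of retrieval stays precise once the store holds hundreds of entries with overlapping descriptions.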
We've seen very good results on all fronts for smaller-scale tasks (which would take a person a few hours), and it's also worked well when continually learning over the course of a few weeks. Despite this, the above questions remain open, and I'm curious to see how they hold up as I start this longer-running experiment.
Watch along with some of these vids if interested; or scroll right on by if not 😁
Thoughts + feedback welcome as always! 🫶
Unify retweeted

We've been heads down building for the past few months (custom stack, not OpenClaw 🦞), and I'm excited to finally launch our virtual teammates! Huge shout out to the team (and many long nights) to get us here ❤️💪
You onboard your new teammate exactly how you'd onboard any other new colleague. Share your screen and guide them through, send onboarding docs, record voice notes, hop on a call, whatever is easiest. They learn how you (and your team) work, and they continually reflect, ask follow-up questions, and improve over time 📈
We built our own stack from scratch because we wanted something that genuinely feels like a colleague, with a fully realtime “there in the room with you” experience. This requires more than a flat tool loop with pluggable skills. We use top-down (ask, interject, pause, resume, stop) and bottom-up (notify, request_clarification) steerable handles throughout a nested call stack of sub-agents, with concurrent multi-task execution, and a code-first (not JSON tool) engine powering every action. All of this lives inside the terminal and/or live Python sessions, each in a dedicated per-agent computer and filesystem 📟
In practice, this means your new colleague can be simultaneously using their own computer, talking to you via voice over a live meet, following your own guided screenshare instructions, working across multiple concurrent tasks, and consolidating all of these into new skills on-the-fly. They can be interrupted and redirected at any point in time, and they’re continually chunking all of their experience into reusable skills. People don’t perform tasks in “prompt then execute” windows, and neither should your virtual colleagues in our view.
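A minimal sketch of what a top-down steerable worker loop could look like, assuming a queue-based control channel. The handle names (pause, resume, interject, stop) follow the list above, but the implementation is invented for illustration and is not Unify's API:

```python
import queue

class SteerableAgent:
    """Toy worker agent that drains control signals between units of work."""
    def __init__(self):
        self.controls = queue.Queue()
        self.log = []

    # Top-down handles: callers push signals the worker processes between steps.
    def pause(self): self.controls.put(("pause", None))
    def resume(self): self.controls.put(("resume", None))
    def interject(self, msg): self.controls.put(("interject", msg))
    def stop(self): self.controls.put(("stop", None))

    def run(self, steps):
        paused = False
        for step in steps:
            while True:
                try:
                    cmd, arg = self.controls.get_nowait()
                except queue.Empty:
                    if not paused:
                        break
                    cmd, arg = self.controls.get()  # block until resumed/stopped
                if cmd == "pause":
                    paused = True
                elif cmd == "resume":
                    paused = False
                elif cmd == "interject":
                    self.log.append(f"interjection: {arg}")
                elif cmd == "stop":
                    return
            self.log.append(f"did: {step}")

agent = SteerableAgent()
agent.interject("prioritize the invoice task")
agent.run(["read email", "draft invoice"])
print(agent.log)
# -> ['interjection: prioritize the invoice task', 'did: read email', 'did: draft invoice']
```

The key design point is that control signals are handled between (and conceptually within) work units, so the agent can be redirected at any time rather than only between "prompt then execute" windows.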
We're really happy with the feedback we’ve received thus far. We’ve helped several teams (in real estate, finance, and housing) streamline day-to-day processes which would have been difficult to “prompt” into hand-crafted skills, because these tasks are hard to fully articulate upfront. They require continual judgment, context, and incremental back-and-forth work with people to really learn and internalize what's needed over time.
The best feedback we've received (which makes us most excited 👀) is that the colleague is already much better on day 2 than on day 1, and then even better on day 3, with a holistic understanding evolving quickly and organically 🧠
If you're curious to see how it works, then give it a try with this free credit link!
console.unify.ai/assistants?tok…
I would love to hear people's honest thoughts (both positive and negative) 🙏
ps we're also live on Product Hunt, so any feedback or support here would also be appreciated: producthunt.com/products/unify…
Thanks! 🫶
Unify retweeted

MCP servers are ONLY as good as their abstractions 🧱 and docs 📄. The official MCP for Google Drive fails at even the most basic tasks (see video). Building an MCP server is VERY EASY. Crafting the correct abstractions is VERY HARD. Very few servers are production ready; most are just POCs (not a criticism, this is their intention). Benchmarking and evals are not only important for system prompts, but will also be increasingly important for MCP designs. Exciting times ahead! 👀
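To illustrate why abstraction design matters more than server plumbing, here is a toy in-memory "drive" contrasting a low-level passthrough tool with a task-level one. Neither function is the real Google Drive MCP API; both are invented for this example:

```python
# Toy in-memory "drive": file id -> (title, body).
FILES = {
    "1": ("Q3 roadmap", "launch virtual teammates"),
    "2": ("standup notes", "discuss MCP evals"),
}

# Low-level abstraction: the model must already know internal method
# names and parameter shapes, so even simple requests tend to fail.
def raw_api_call(method, params):
    if method == "files.get":
        return FILES.get(params["fileId"])
    raise ValueError(f"unknown method: {method}")

# Task-level abstraction: tools match what a user actually asks for.
def search_files(query):
    return [fid for fid, (title, body) in FILES.items()
            if query.lower() in title.lower() or query.lower() in body.lower()]

def read_file(file_id):
    title, body = FILES[file_id]
    return f"{title}\n\n{body}"

# "Find the roadmap doc" becomes two obvious calls with the good abstraction.
ids = search_files("roadmap")
print(read_file(ids[0]))
```

Both designs expose the same underlying data; only the second gives the model tools it can reliably choose and chain, which is exactly what benchmarks and evals over MCP designs would measure.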
Unify retweeted

Unify (@letsunifyai) is building Notion for AI observability, with a lightweight, hackable, fast, and flexible framework.
It's built for products with or without LLMs, letting you focus on the data, plots, and metrics that matter.
ycombinator.com/launches/N5M-u…
Unify retweeted

RT @DanielLenton1: Incredibly flattered that @amazon have invited me to be their keynote speaker for this year's AWS Gen AI Loft event. Can…
Unify retweeted

OpenAI released the new o1 model, and I tested and compared its logical thinking capabilities with Claude 3.5 Sonnet using @letsunifyai
I prompted both of them with a mathematical riddle which required some calculations, and guess who won?
Puzzle: Hansa ate a meal at Jugju

We’re excited to have @shirleyxiaoyic from @IndianaUniv, co-author of the paper "The Janus Interface: How Fine-Tuning in Large Language Models Amplifies Privacy Risks," joining us this Friday for our Paper Reading Session! 🤩
RSVP 👉 lu.ma/jok7bp2m
The research introduces a novel attack, Janus, which exploits the fine-tuning interface to recover forgotten PIIs from LLM pre-training data. It also formalizes the privacy leakage problem in LLMs and explains, through empirical analysis on open-source language models, why forgotten PIIs can be recovered. 🧠
Check out the Paper: arxiv.org/pdf/2310.15469
See you there!


We're thrilled to announce that @Vapi_AI will be joining us for our weekly Webinar Series tomorrow! (Wednesday) 🤩
RSVP here: lu.ma/cke3hpft
Join us as we welcome Sahil Suman, Solution Engineer at Vapi AI, to the session. Discover how VAPI enables the quick setup of high-quality voice agents and see the integration of @letsunifyai with VAPI for seamless access to various models and providers. See you there! 🧑💻
Explore Vapi:
⚡️vapi.ai
⚡️github.com/VapiAI


We are really excited to welcome @zlwang_cs from @ucsd_cse, who co-authored the paper "Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting". Happening Tomorrow!🤩
RSVP 👉 lu.ma/zesezv3u
The research introduces SPECULATIVE RAG, a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM 🧠🤖
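The control flow described above can be sketched in a few lines. Both LMs are stubbed with simple functions (word-overlap scoring stands in for the verifier, and the documents are invented), so this only shows the draft-then-verify shape, not the paper's actual models:

```python
# Toy corpus: document id -> content.
DOCS = {
    "d1": "unify launched virtual teammates in 2025",
    "d2": "speculative rag verifies parallel drafts",
}

def specialist_draft(question, doc_ids):
    # Stub drafter: in reality a smaller, distilled specialist LM generates
    # one answer draft per retrieved-document subset.
    return " ".join(DOCS[d] for d in doc_ids)

def generalist_score(question, draft):
    # Stub verifier: in reality a larger generalist LM scores each draft;
    # here, plain word overlap with the question is a stand-in.
    return len(set(question.lower().split()) & set(draft.split()))

def speculative_rag(question, subsets):
    drafts = [specialist_draft(question, s) for s in subsets]  # parallelizable
    return max(drafts, key=lambda d: generalist_score(question, d))

answer = speculative_rag("what did unify launch", [["d1"], ["d2"]])
print(answer)  # -> unify launched virtual teammates in 2025
```

The efficiency win comes from the expensive model only scoring short drafts instead of generating over the full retrieved context.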
Check out the Paper👉arxiv.org/pdf/2407.08223
See you there!


We are really excited to announce that we will be joined by @llmware for our Webinar Series today! (Tuesday)🤩
RSVP👉 lu.ma/e651sgj8
In this session, we're excited to welcome Darren Oberst and Namee Oberst from LLMware. We will explore how small specialized LLMs can compete with the larger models for specific use cases, especially for Financial, Legal, Compliance, and Regulatory-Intensive Industries. See you there! 🧑💻
Check out LLMWare:
⚡️llmware.ai
⚡️github.com/llmware-ai/llm…


We are really excited to welcome Devichand Budagam from @IITKgp, who co-authored the paper "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models". Happening Friday! 🤩
RSVP👉 lu.ma/7j11iscl
The research introduces Hierarchical


We are really excited to announce that we will be joined by @tavilyai for our Webinar Series this Tuesday!🤩
RSVP👉 lu.ma/a77wgrao
In this session, we'll explore how the Tavily API provides a search engine optimised for LLMs and RAG, delivering efficient, quick, and persistent search results. We'll also showcase Unify's SSO integration with Tavily 🧠🧑💻
Check out Tavily:
⚡️tavily.com
⚡️github.com/tavily-ai


We are really excited to welcome @sh_reya from @Berkeley_EECS, who co-authored the paper "Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences". Happening Tomorrow! 🤩
RSVP 👉lu.ma/ttwrh0n4
The research introduces EvalGen, an interface that provides automated assistance in generating evaluation criteria and implementing assertions🧠👩💻
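In the spirit of the paper, criteria become executable assertions that grade LLM outputs. A minimal sketch, with criteria and sample output invented for illustration (not taken from EvalGen itself):

```python
# Each criterion is a named predicate over a model output string.
criteria = {
    "is_concise": lambda out: len(out.split()) <= 20,
    "mentions_price": lambda out: "$" in out,
}

def grade(output):
    # Run every assertion and report pass/fail per criterion.
    return {name: check(output) for name, check in criteria.items()}

report = grade("The plan costs $20 per month.")
print(report)  # -> {'is_concise': True, 'mentions_price': True}
```

The hard part the paper tackles is generating and aligning these criteria with human preferences, rather than running them once they exist.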
Check out the Paper👉 arxiv.org/pdf/2404.12272
See you there!


