Unify

260 posts

@letsunifyai

Hire AI — Not APIs ✨

London · Joined January 2021

80 Following · 2.9K Followers

Unify reposted

Daniel Lenton@DanielLenton1·12 Apr

Day 2 onboarding Rachel (my new virtual colleague) 👩🏻‍💻. I asked if she could help me grant her access to our Google workspace, and she guided me through via screenshare (I genuinely had no idea how to do this 🫠), and then she cleaned up my inbox with one Python script; she did a great job! On the technical side (inside Rachel's brain 🧠), the voice still feels a bit slow to respond sometimes, but this is mainly a model capability issue which will be solved in the coming months with better STS models. We did try several STS models during testing (such as gpt-realtime) but they all just felt too stupid to hold a natural conversation (curious if others have a different perspective?). For now, we therefore use a standard STT -> LLM -> TTS design (Deepgram -> gpt-5-mini(low) -> ElevenLabs) for the live calls. We'll update this design as soon as STS models feel genuinely smart enough, as the emotional and tonal awareness + improved latency would be a big plus ⚡ You can try it for yourself here: console.unify.ai Thoughts + feedback welcome as always! 🫶
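
For readers unfamiliar with the cascaded design mentioned in this post, here is a minimal sketch of a speech-to-text -> LLM -> text-to-speech turn. It is not Unify's code: the class name `CascadedVoicePipeline` and the three `fake_*` stages are hypothetical stand-ins that only pass text around, and no Deepgram, OpenAI, or ElevenLabs calls are made.

```python
from dataclasses import dataclass
from typing import Callable

# Each stage is a plain callable, so a real provider could be swapped in later.
SpeechToText = Callable[[bytes], str]
LanguageModel = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class CascadedVoicePipeline:
    """Chains STT -> LLM -> TTS for a single conversational turn (hypothetical)."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech

    def respond(self, audio_in: bytes) -> bytes:
        transcript = self.stt(audio_in)  # speech -> text
        reply = self.llm(transcript)     # text -> text
        return self.tts(reply)           # text -> speech


# Stand-in stages (assumptions): they treat the "audio" as UTF-8 text.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CascadedVoicePipeline(fake_stt, fake_llm, fake_tts)
reply_audio = pipeline.respond(b"clean up my inbox")
```

Because the three stages run one after another, the latency of a turn is roughly the sum of the stage latencies, which is one reason a single speech-to-speech model could respond faster once it is capable enough.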

334

Unify reposted

Daniel Lenton@DanielLenton1·11 Apr

Not engaged with Twitter in a lonnnng time, but for anyone interested, I've decided to start doing regular (unfiltered) posts of my own experience onboarding a fully virtual colleague. AGI is certainly not solved (yet), and so I'll focus on what works well, what doesn't work well, and where the biggest gaps are 🔍 In this video I'm just setting the scene, explaining the basics and hiring Rachel (no fireworks quite yet). In the next videos I'll give her access to everything and start to see how well she *really* learns and internalizes the nuances of my own day-to-day, how she fares when the number of different "flows" keeps piling up, and how conversational she can be whilst navigating all of this. On the more technical side, I'm interested in: 1) How well do the underlying semantic + symbolic DB storage and search mimic implicit skill storage and memory retrieval that a person would have (DB reads/writes are much less efficient and less coupled than an end-to-end jointly trained implicit memory module would be, more like how our own connectionist brains work) 2) Can a hierarchy of fast-thinking (less intelligent) and slow-thinking (smarter-model) sub-agents communicating with one another really feel as conversational as a real person with their single brain (again, of course not, but how close can we get with a tiered thinking-fast + thinking-slow design for smooth conversation management?) 3) Can repeated post-action storage of skills and functions with continual self-refactoring improve speed and efficiency for future actions (not burning through tokens re-discovering the same thing every time)? How does this scale as the number of self-stored skills and functions grows? Do the embeddings and semantic retrieval hold up when there are maybe 100s of entries? We've seen very good results on all fronts for smaller-scale tasks (which would take a person a few hours), and it's also worked well when continually learning over the course of a few weeks. 
Despite this, the above questions remain open, and I'm curious to see how they hold up as I start this longer-running experiment. Watch along with some of these vids if interested; or scroll right on by if not 😁 Thoughts + feedback welcome as always! 🫶
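
To make the third question a little more concrete, here is a toy sketch of storing skills and retrieving them by semantic similarity. Everything in it is an assumption made for illustration: `SkillStore`, `embed`, and the skill names are hypothetical, and the "embedding" is just a bag of word counts rather than a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SkillStore:
    """Hypothetical store of (name, description) skills with similarity search."""

    def __init__(self):
        self._skills = []  # list of (name, description, vector)

    def add(self, name: str, description: str) -> None:
        self._skills.append((name, description, embed(description)))

    def search(self, query: str, top_k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self._skills, key=lambda s: cosine(q, s[2]), reverse=True)
        return [name for name, _, _ in ranked[:top_k]]


store = SkillStore()
store.add("clean_inbox", "archive old emails and clean up the inbox")
store.add("grant_workspace_access", "grant a user access to the google workspace")
store.add("book_meeting", "schedule a calendar meeting with attendees")

best = store.search("clean up my email inbox")
```

With three clearly distinct skills the right one is easy to find; the open question in the post is how well this kind of lookup holds up once hundreds of overlapping entries compete for the same query.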

325

Unify reposted

Daniel Lenton@DanielLenton1·31 Mar

We've been heads down building for the past few months (custom stack, not OpenClaw 🦞), and I'm excited to finally launch our virtual teammates! Huge shout out to the team (and many long nights) to get us here ❤️💪 You onboard your new teammate exactly how you'd onboard any other new colleague. Share your screen and guide them through, send onboarding docs, record voice notes, hop on a call, whatever is easiest. They learn how you (and your team) work, and they continually reflect, ask follow up questions, and improve over time 📈 We built our own stack from scratch because we wanted something that genuinely feels like a colleague, with a fully realtime “there in the room with you” experience. This requires more than a flat tool loop with pluggable skills. We use top-down (ask, interject, pause, resume, stop) and bottom-up (notify, request_clarification) steerable handles throughout a nested call stack of sub-agents, with concurrent multi-task execution, and a code-first (not JSON tool) engine powering every action. All of this lives inside the terminal and/or live python sessions, each in a dedicated per-agent computer and filesystem 📟 In practice, this means your new colleague can be simultaneously using their own computer, talking to you via voice over a live meet, following your own guided screenshare instructions, working across multiple concurrent tasks, and consolidating all of these into new skills on-the-fly. They can be interrupted and redirected at any point in time, and they’re continually chunking all of their experience into reusable skills. People don’t perform tasks in “prompt then execute” windows, and neither should your virtual colleagues in our view. We're really happy with the feedback we’ve received thus far. We’ve helped several teams (in real estate, finance, and housing) streamline day-to-day processes which would have been difficult to “prompt” into hand-crafted skills, because these tasks are hard to fully articulate upfront. 
They require continual judgment, context, and incremental back-and-forth work with people to really learn and internalize what's needed over time. The best feedback we've received (which makes us most excited 👀) is that the colleague is already much better on day 2 than on day 1, and then even better on day 3, with a holistic understanding evolving quickly and organically 🧠 If you're curious to see how it works, then give it a try with this free credit link! console.unify.ai/assistants?tok… I would love to hear people's honest thoughts (both positive and negative) 🙏 ps we're also live on Product Hunt, so any feedback or support here would also be appreciated: producthunt.com/products/unify… Thanks! 🫶
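
As a rough illustration of what a "steerable handle" could look like, here is a small single-task sketch with top-down controls (pause, resume, stop, interject) and one bottom-up channel (notify). It is an assumption-laden toy, not Unify's engine: the `SteerableTask` class and its `tick` method are hypothetical, and there is no nesting, concurrency, or code execution here.

```python
from collections import deque


class SteerableTask:
    """Hypothetical task that runs one step at a time and checks steering between steps."""

    def __init__(self, steps):
        self._pending = deque(steps)  # remaining work items
        self.paused = False
        self.stopped = False
        self.done = []                # completed steps, in order
        self.notifications = []       # bottom-up messages for the caller

    # Top-down handles (caller -> task)
    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def stop(self):
        self.stopped = True

    def interject(self, step):
        """Put a new instruction ahead of the remaining work."""
        self._pending.appendleft(step)

    # Bottom-up handle (task -> caller)
    def notify(self, message):
        self.notifications.append(message)

    def tick(self):
        """Run at most one step; return True if a step was executed."""
        if self.stopped or self.paused or not self._pending:
            return False
        step = self._pending.popleft()
        self.done.append(step)
        self.notify(f"finished: {step}")
        return True


task = SteerableTask(["open inbox", "archive newsletters", "draft summary"])
task.tick()                     # runs "open inbox"
task.pause()
ran_while_paused = task.tick()  # no progress while paused
task.interject("skip anything from finance")
task.resume()
task.tick()                     # the interjected instruction runs first
task.stop()
task.tick()                     # ignored after stop
```

The point of checking the flags between steps, rather than only before the task starts, is that the caller can redirect work that is already in flight instead of waiting for a "prompt then execute" window to close.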

12K

Unify reposted

Daniel Lenton@DanielLenton1·14 Apr

MCP servers are ONLY as good as their abstractions 🧱 and docs 📄. The official MCP for Google Drive fails at even the most basic tasks (see video). Building an MCP server is VERY EASY. Crafting the correct abstractions is VERY HARD. Very few servers are production ready; most are just POCs (not a criticism, this is their intention). Benchmarking and evals are not only important for system prompts, but will also be increasingly important for MCP designs. Exciting times ahead! 👀

1.6K

Unify reposted

Y Combinator@ycombinator·18 Mar

Unify (@letsunifyai) is building Notion for AI observability— with a lightweight, hackable, fast, and flexible framework. It's built for products with or without LLMs, letting you focus on the data, plots, and metrics that matter. ycombinator.com/launches/N5M-u…

14K

Unify reposted

Daniel Lenton@DanielLenton1·18 Mar

Excited to be launching our new AI observability tool today! 😁 Think "Notion for AI Observability" 📊 When building AI apps ourselves, we spent months fighting with the prior tooling, trying to strip things back to the bare minimum, so we could observe and iterate on exactly what we needed, when we needed it 🔁 🔍 We care about the underlying LLM, but not more than the users! Existing tools are generally very much curated to one or the other, not both. Unify makes it easier to visualize, iterate on and interact with the data and visualizations that matter for *you*, your *AI app* and your *users*, and nothing else 🎯 The core building block is simple, just “unify.log”. This lets you store any kind of data for easy visualization, grouping, sorting, and plotting etc. You can then quickly build your own custom interface for whatever you want using three basic tile types, Tables 🔢, Views 🔍 and Plots 📊 You can use these three primitives to do all kinds of things, such as: ➕create + visualize your datasets in a new tab (with or without LLMs) ➕monitor and probe production traffic in a new tab (with or without LLMs) ➕start an evaluation flywheel in a new tab (with or without LLMs) 📉optimize your product for your users (with or without LLMs) 🧠whatever else you can think of (with or without LLMs!) Check out our repo for a minimal example, explaining how to use these basic building blocks to ship with speed and clarity ⚡ github.com/unifyai/unify We're also live on ProductHunt right now: producthunt.com/posts/unify-8 Support + feedback here is also ofc appreciated ❤️ Finally, big shoutout to the team for working tirelessly to make this happen: Haris Mahmood Yusha Arif Ved Patwardhan Nassim Berrada James Keane Albert Lukács Feel free to let us know what you think! (criticism + suggestions are especially welcome 🙏) Thanks all, happy prompting ✌️
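
The "log anything, then group, sort, and plot it" idea can be sketched in a few lines of plain Python. To be clear, this is not the `unify` package and does not reproduce the signature of `unify.log`; the `LogStore` class and its methods are hypothetical, written only to show the shape of the workflow described above.

```python
from collections import defaultdict
from statistics import mean


class LogStore:
    """Hypothetical in-memory log of arbitrary key/value rows."""

    def __init__(self):
        self.rows = []

    def log(self, **fields):
        """Store one row of arbitrary fields."""
        self.rows.append(dict(fields))

    def table(self, sort_by=None):
        """Return rows as a list, optionally sorted by one field."""
        rows = list(self.rows)
        if sort_by is not None:
            rows.sort(key=lambda r: r[sort_by])
        return rows

    def group_mean(self, key, value):
        """Average `value` within each distinct `key`, e.g. as input to a plot."""
        groups = defaultdict(list)
        for row in self.rows:
            groups[row[key]].append(row[value])
        return {k: mean(v) for k, v in groups.items()}


store = LogStore()
store.log(model="a", score=0.5, latency_ms=120)
store.log(model="b", score=0.75, latency_ms=80)
store.log(model="a", score=1.0, latency_ms=100)

by_model = store.group_mean("model", "score")
fastest_first = store.table(sort_by="latency_ms")
```

Nothing in the rows is specific to LLMs, which matches the "with or without LLMs" framing: the same log could hold evaluation scores, production traffic, or dataset entries.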

1.2K

Unify reposted

ivy@ivy_llc·7 Nov

We’re excited to announce Ivy is partnering with Kornia, allowing Kornia to be used with TensorFlow, JAX, and NumPy for the first time! You can use Kornia's new `to_tensorflow()`, `to_jax()` and `to_numpy()` methods, which take advantage of Ivy’s transpiler, to use Kornia in your framework of choice. Try it out now in the latest Kornia version! (0.7.4) kornia.readthedocs.io/en/latest/get-… Ivy on GitHub: github.com/ivy-llc/ivy Ivy Demos: docs.ivy.dev/demos/examples…

Kornia@kornia_foss

Meet #kornia v.0.7.4 release: Experimental feature: - Multi-framework support Ivy @letsunifyai ! Use kornia functions in tf or numpy! Also: - Steerers by @BokmanGeorg - weighted PnP solver - depth from plane equation - many bugfixes. github.com/kornia/kornia/…

1.8K

Unify@letsunifyai·26 Sep

RT @DanielLenton1: Incredibly flattered that @amazon have invited me to be their keynote speaker for this year's AWS Gen AI Loft event. Can…

Unify reposted

Paras Madan@ParasMadan9·15 Sep

OpenAI released the new model o1, and I tested and compared its logical thinking capabilities with Claude 3.5 Sonnet using @letsunifyai. I prompted both of them with a mathematical riddle which required some calculations, and guess who won? Puzzle: Hansa ate a meal at Jugju hotel costing Rs.210. He gave the manager a Rs. 1000 note. He kept the change, came back a few minutes later and had some food packed for his girl friend 'Hansi'. He gave the accountant a Rs. 500 note and received Rs. 120 in change. Later the bank told the accountant that both the Rs. 1000 and the Rs. 500 notes were counterfeit. How much money did the restaurant lose? Ignore the profit of the food restaurant. Check it out 👇

1.4K

Unify@letsunifyai·4 Sep

We’re excited to have @shirleyxiaoyic from @IndianaUniv, co-author of the paper "The Janus Interface: How Fine-Tuning in Large Language Models Amplifies Privacy Risks," joining us this Friday for our Paper Reading Session! 🤩 RSVP 👉 lu.ma/jok7bp2m The research introduces a novel attack, Janus, which exploits the fine-tuning interface to recover forgotten PIIs from the pre-training data in LLMs. It also formalizes the privacy leakage problem in LLMs and explains, through empirical analysis on open-source language models, why forgotten PIIs can be recovered. 🧠 Check out the Paper: arxiv.org/pdf/2310.15469 See you there!

686

Unify@letsunifyai·6 Aug

We're thrilled to announce that @Vapi_AI will be joining us for our weekly Webinar Series tomorrow! (Wednesday) 🤩 RSVP here: lu.ma/cke3hpft Join us as we welcome Sahil Suman, Solution Engineer at Vapi AI, to the session. Discover how VAPI enables the quick setup of high-quality voice agents and see the integration of @letsunifyai with VAPI for seamless access to various models and providers. See you there! 🧑‍💻 Explore Vapi: ⚡️vapi.ai ⚡️github.com/VapiAI

972

Unify@letsunifyai·1 Aug

We are really excited to welcome @zlwang_cs from @ucsd_cse, who co-authored the paper "Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting". Happening Tomorrow!🤩 RSVP 👉 lu.ma/zesezv3u The research introduces SPECULATIVE RAG, a framework that leverages a larger generalist LM to efficiently verify multiple RAG drafts produced in parallel by a smaller, distilled specialist LM 🧠🤖 Check out the Paper👉arxiv.org/pdf/2407.08223 See you there!

604

Unify@letsunifyai·30 Jul

We are really excited to announce that we will be joined by @llmware for our Webinar Series today! (Tuesday)🤩 RSVP👉 lu.ma/e651sgj8 In this session, we're excited to welcome Darren Oberst and Namee Oberst from LLMware. We will explore how small specialized LLMs can compete with the larger models for specific use cases, especially for Financial, Legal, Compliance, and Regulatory-Intensive Industries. See you there! 🧑‍💻 Check out LLMWare: ⚡️llmware.ai ⚡️github.com/llmware-ai/llm…

354

Unify@letsunifyai·17 Jul

We are really excited to welcome Devichand Budagam from @IITKgp, who co-authored the paper "Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models". Happening Friday! 🤩 RSVP👉 lu.ma/7j11iscl The research introduces Hierarchical Prompting Taxonomy (HPT), a universal evaluation metric that can be used to evaluate both the datasets' complexity and LLMs' capabilities🧠 Check out the Paper👉 arxiv.org/pdf/2406.12644 See you there!

415

Unify@letsunifyai·12 Jul

We are really excited to announce that we will be joined by @tavilyai for our Webinar Series this Tuesday!🤩 RSVP👉 lu.ma/a77wgrao In this session, we'll explore how the Tavily API provides a search engine optimised for LLMs and RAG, delivering efficient, quick, and persistent search results. We'll also showcase Unify's SSO integration with Tavily 🧠🧑‍💻 Check out Tavily: ⚡️tavily.com ⚡️github.com/tavily-ai

1.2K

Unify@letsunifyai·11 Jul

We are really excited to welcome @sh_reya from @Berkeley_EECS, who co-authored the paper "Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences". Happening Tomorrow! 🤩 RSVP 👉lu.ma/ttwrh0n4 The research introduces EvalGen, an interface that provides automated assistance in generating evaluation criteria and implementing assertions🧠👩‍💻 Check out the Paper👉 arxiv.org/pdf/2404.12272 See you there!

452
