MLflow

1.2K posts

@MLflow

The open source developer platform to build AI applications and models with confidence.

San Francisco, CA · Joined August 2018

46 Following · 11.2K Followers

Pinned Tweet

MLflow@MLflow·23 Feb

🚀 𝗘𝘅𝗰𝗶𝘁𝗶𝗻𝗴 𝗡𝗲𝘄𝘀: 𝗠𝗟𝗳𝗹𝗼𝘄 𝟯.𝟭𝟬.𝟬 𝗶𝘀 𝗼𝗳𝗳𝗶𝗰𝗶𝗮𝗹𝗹𝘆 𝗵𝗲𝗿𝗲! The latest version of MLflow has arrived, bringing several new features designed to bridge the gap between experimental LLM development and production-grade operations. 𝗪𝗵𝗮𝘁’𝘀 𝗡𝗲𝘄 𝗶𝗻 𝟯.𝟭𝟬.𝟬: 🏢 Organization Support 💬 Multi-turn Evaluation & Conversation Simulation 💰 Trace Cost Tracking 🎯 Redesigned Navigation 📊 Gateway Usage Tracking ⚡ In-UI Trace Evaluation 🎮 Instant Demo Experiment These features are designed to reduce friction between development and deployment, keeping your workflows efficient and scalable. Give it a try, and let us know on GitHub about any issues. If you like the features in this release, give us a GitHub ⭐ pip install mlflow==3.10.0 Check out the full release notes and technical documentation! 👉 github.com/mlflow/mlflow/… #MLflow #MachineLearning #GenAI #LLMOps #LLM

1.4K

MLflow@MLflow·16h

In this video, Jules Damji breaks down: ✅ 𝗧𝗵𝗲 𝗣𝗼𝘄𝗲𝗿 𝗼𝗳 𝗧𝗿𝗮𝗰𝗲𝘀 𝗮𝗻𝗱 𝗦𝗽𝗮𝗻𝘀: Visualizing the hierarchy of chains, retrievers, and tools. ✅ 𝗔𝗜 𝗢𝗯𝘀𝗲𝗿𝘃𝗮𝗯𝗶𝗹𝗶𝘁𝘆: Using mlflow.openai.autolog() to instantly capture latency, costs, and the model's input and output parameters, along with custom metadata and attributes. ✅ 𝗧𝗵𝗲 𝗠𝗟𝗳𝗹𝗼𝘄 𝗔𝘀𝘀𝗶𝘀𝘁𝗮𝗻𝘁: How to use Claude to perform root cause analysis, debug MLflow traces, and fix schema errors and API failures. ✅ 𝗖𝗼𝘀𝘁 𝗧𝗿𝗮𝗰𝗸𝗶𝗻𝗴: Mapping token usage to specific operations so you know exactly where your budget is going.

127
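
The cost-tracking idea in the post above (mapping token usage to specific operations) can be illustrated with a small plain-Python sketch. This is not MLflow's API: the `Span` class, the `cost_by_operation` helper, and the per-token prices are all hypothetical, chosen only to show how costs roll up through a trace's span hierarchy.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical prices in USD per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K = {"input": 0.0005, "output": 0.0015}


@dataclass
class Span:
    """One operation in a trace (for example a retriever call or an LLM call)."""
    name: str
    input_tokens: int = 0
    output_tokens: int = 0
    children: list[Span] = field(default_factory=list)

    def own_cost(self) -> float:
        # Cost of the tokens consumed by this span alone, excluding children.
        total = (self.input_tokens * PRICE_PER_1K["input"]
                 + self.output_tokens * PRICE_PER_1K["output"])
        return total / 1000

    def total_cost(self) -> float:
        # Cost of this span plus everything nested beneath it.
        return self.own_cost() + sum(c.total_cost() for c in self.children)


def cost_by_operation(span: Span, out: dict[str, float] | None = None) -> dict[str, float]:
    """Walk the span tree and sum each span's own cost under its operation name."""
    out = {} if out is None else out
    out[span.name] = out.get(span.name, 0.0) + span.own_cost()
    for child in span.children:
        cost_by_operation(child, out)
    return out


# A toy trace: an agent that calls a retriever and then an LLM.
trace = Span("agent", children=[
    Span("retriever", input_tokens=200),
    Span("llm_call", input_tokens=1000, output_tokens=400),
])

for name, cost in cost_by_operation(trace).items():
    print(f"{name}: ${cost:.4f}")
```

Keeping each span's own cost separate from its children's is what lets the breakdown show where the budget goes rather than only the trace total.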

MLflow@MLflow·16h

Stop treating your LLM applications like a black box. ⬛ → 🔍 If you’ve ever had an agentic workflow fail and spent an hour digging through logs just to find which tool tripped up, this tutorial is for you. 🫵 We just dropped Part 3 of our 𝗠𝗟𝗳𝗹𝗼𝘄 𝗳𝗼𝗿 𝗚𝗲𝗻𝗔𝗜 series where we tackle 𝗔𝗜 𝗢𝗯𝘀𝗲𝗿𝘃𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝗮𝗻𝗱 𝗧𝗿𝗮𝗰𝗶𝗻𝗴. ⬇️ 👀 Watch the full tutorial: youtube.com/watch?v=npiKuf… 📃 Tutorial 1.3: github.com/dmatrix/mlflow… #MLflow #GenAI #LLMOps #AIObservability

774

MLflow@MLflow·1d

Agent frameworks get you to an initial version. They won't get you to a reliable final version. In our latest deep dive, the MLflow maintainers explain why stitching together separate tools for tracing, evaluation, and prompts creates a "fragile integration tax" that stalls the development of production AI. To move past "vibe-checking" prototypes and prevent $12,000 overnight invoices from haywire retry loops, your agents need a unified AI Platform. MLflow is the only open-source platform providing four integrated pillars for agents: 🔍 Observability ⚖️ Evaluation 📑 Version Control 🛡️ Governance Whether you are building with LangGraph, OpenAI Agents SDK, Pydantic AI, or CrewAI, MLflow provides the unified infrastructure to ship with confidence. Read more ➡️ mlflow.org/blog/agents-ne… #MLflow #AgenticAI #GenAI #Observability #VersionControl #Governance

372

MLflow@MLflow·3d

Starting in 𝗠𝗟𝗳𝗹𝗼𝘄 𝟯.𝟭𝟬.𝟬, you can now use Guardrails AI validators as native, deterministic GenAI scorers! 🚀 The TL;DR: ✅ 𝗗𝗲𝘁𝗲𝗿𝗺𝗶𝗻𝗶𝘀𝘁𝗶𝗰 & 𝗥𝗲𝗽𝗲𝗮𝘁𝗮𝗯𝗹𝗲: No more non-deterministic "judges" for PII or Secrets. ✅ 𝗡𝗮𝘁𝗶𝘃𝗲 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄: Use mlflow.genai.evaluate() with the new guardrails scorer provider. ✅ 𝗖𝗜/𝗖𝗗 𝗥𝗲𝗮𝗱𝘆: Automated pass/fail gates for your LLM outputs. Read the full technical breakdown here ➡️ guardrailsai.com/blog/guardrail… #LLMOps #GenAI #MLflow #GuardrailsAI #LLM

1.4K
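
To make "deterministic scorer" and "pass/fail gate" concrete, here is a minimal plain-Python sketch. It does not use the Guardrails AI validators or the MLflow scorer API described in the post: the regular expressions, `score_output`, and `gate` are hypothetical stand-ins that only show why a rule-based check gives the same verdict on every run and can therefore drive a CI step.

```python
import re

# Simplified email pattern; real PII detection covers many more cases.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Hypothetical secret shape: "sk-" followed by at least 20 alphanumerics.
SECRET = re.compile(r"\bsk-[A-Za-z0-9]{20,}\b")


def score_output(text: str) -> dict:
    """Score one LLM output; the same text always yields the same result."""
    findings = {
        "email": bool(EMAIL.search(text)),
        "secret": bool(SECRET.search(text)),
    }
    return {"passed": not any(findings.values()), "findings": findings}


def gate(outputs: list[str]) -> bool:
    """Return True only if every output passes, so CI can fail on any leak."""
    return all(score_output(o)["passed"] for o in outputs)


if __name__ == "__main__":
    samples = ["The capital of France is Paris.", "Reach me at jane@example.com"]
    print("gate passed:", gate(samples))
```

In a pipeline, a script like this would typically exit with a non-zero status when `gate` returns `False`, which is what blocks the merge or deployment.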

MLflow@MLflow·4d

📣 𝗕𝗶𝗴 𝗮𝗻𝗻𝗼𝘂𝗻𝗰𝗲𝗺𝗲𝗻𝘁: Genie Code for Agent Observability and Evaluation is here! 𝗟𝗲𝗮𝗿𝗻 𝗺𝗼𝗿𝗲 ➡️ docs.databricks.com/aws/en/mlflow3… 🚀 𝗚𝗲𝗻𝗶𝗲 𝗖𝗼𝗱𝗲 𝗳𝗼𝗿 𝗔𝗴𝗲𝗻𝘁 𝗢𝗯𝘀𝗲𝗿𝘃𝗮𝗯𝗶𝗹𝗶𝘁𝘆 & 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻 🚀 Getting started with agent observability and evaluation can be overwhelming, navigating new concepts like traces, scorers, and labeling sessions, each with their own UI/APIs and best practices. It's not always obvious to end users how to answer questions about their agent like "what went wrong?" or "what's the right way to fix an error?". Genie Code allows you to ask questions about your agent in natural language to get direct insights or runnable code to configure MLflow's various features, ranging from tracing & evaluations to prompt registry & review app. 🎯 𝗪𝗵𝗮𝘁 𝗰𝗮𝗻 𝗚𝗲𝗻𝗶𝗲 𝗖𝗼𝗱𝗲 𝗱𝗼? 🔍 Trace analysis & debugging: Investigate failing traces, pinpoint errors, and inspect inputs, outputs, and token consumption through conversation 📈 Performance metrics: Compute latency percentiles, track error rates, and analyze token usage just by asking 🧠 Quality & evaluation: Review evaluation scores, access scorers and datasets, and get help configuring evaluations 🔧 Runnable Code: Get snippets for nearly any feature in MLflow from instrumenting tracing to creating prompts or datasets to registering scorers 🙌 And much much more! 💬 𝗘𝘅𝗮𝗺𝗽𝗹𝗲 𝗾𝘂𝗲𝘀𝘁𝗶𝗼𝗻𝘀 "My agent is failing on tool calls. Can you look at the recent traces from today and tell me what's going wrong?" "What's the P95 latency for my agent over the past week? Is it getting worse?" "Which sessions had the poorest user feedback and why?" "How do I add tracing to my LangChain agent?" ✅ 𝗚𝗲𝘁 𝘀𝘁𝗮𝗿𝘁𝗲𝗱 Open Genie Code from the icon in the top-right while viewing an experiment and start asking questions! This is enabled across all workspaces/regions with Genie Code enabled. #mlflow #geniecode #agentobservability

517
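
For readers unfamiliar with the "P95 latency" mentioned in the example questions above, the following sketch shows one common definition, the nearest-rank percentile, in plain Python. It is not Genie Code or MLflow code, and the latency values are made up for illustration.

```python
import math


def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up request latencies in milliseconds, including one slow outlier.
latencies_ms = [120, 95, 310, 150, 101, 98, 2200, 130, 110, 105]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(f"P50 = {p50} ms, P95 = {p95} ms")
```

The gap between the median and P95 here comes from a single outlier, which is why tail percentiles are usually tracked alongside averages.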

MLflow@MLflow·13 Mar

The MLflow project maintainers and community contributors want to congratulate 🎉 Shivam Shinde on becoming the 1000th MLflow contributor. We welcome your contributions. MLflow has become one of the best-of-breed open-source platforms for AI engineering because of community contributions that keep us abreast of fast-paced AI innovations. Keep those PRs coming! 🚀

424

MLflow@MLflow·12 Mar

📈 New MLflow Tutorial: LLM Experiment Tracking & Cost Optimization In this video, Jules Damji shows how to move beyond ad-hoc testing by implementing systematic experiment tracking with MLflow. 🔹 Auto-instrument traces & tokens 🔹 Optimize costs across models 🔹 Compare hyperparameter impact 🎥 Watch the tutorial here: youtu.be/ykjYM3r0X8o?si… 🔗 Tutorial 1.1: github.com/dmatrix/mlflow… #MLflow #LLMOps #AIAgents #GenAI #AIObserverability @2twitme

534

MLflow@MLflow·11 Mar

📣 Upcoming Webinar: Deep Dive into MLflow 3.11 Features for AI Observability and Quality Building on the MLflow 3.9 and 3.10 releases, MLflow 3.11 introduces several improvements for AI observability, AIOps, AI Governance, and the developer experience. Join us on March 25 for a technical deep dive into these new features, including: 🔹 Support for OpenTelemetry GenAI Semantic Convention 🔹 OpenCode Integration 🔹 Automatic Issue Identification from Traces 🔹 Gateway Budget Support 🗓️ March 25, 2026 🕒 4:00 PM PT 👇 Register today! luma.com/mlflow-webinar… #AIObservability #MLflow #AIGovernance #AIOps #GenAI

355

MLflow@MLflow·11 Mar

ICYMI: Coding agents like Claude Code are incredibly powerful, but for many engineers, they remain a "black box." 📦 Our latest guide shows you exactly how to close the observability gap using MLflow. 𝗟𝗲𝗮𝗿𝗻 𝗵𝗼𝘄 𝘁𝗼: ✅ Trace every tool call and latency metric with one command ✅ Use LLM judges to catch regressions before you ship ✅ Run Claude Code directly inside the MLflow UI Stop guessing what’s happening in your terminal and start shipping with confidence. 🚀 𝗖𝗵𝗲𝗰𝗸 𝗼𝘂𝘁 𝘁𝗵𝗲 𝗳𝘂𝗹𝗹 𝗴𝘂𝗶𝗱𝗲 𝘁𝗼 𝗴𝗲𝘁 𝗺𝗼𝗿𝗲 𝗼𝘂𝘁 𝗼𝗳 𝘆𝗼𝘂𝗿 𝗖𝗹𝗮𝘂𝗱𝗲 𝗖𝗼𝗱𝗲 𝘄𝗶𝘁𝗵 𝗠𝗟𝗳𝗹𝗼𝘄. ➡️ mlflow.org/blog/mlflow-cl… #MLflow #ClaudeCode #LLMOps #Observability

551

MLflow@MLflow·10 Mar

Stop guessing why your agents are failing. 🛑 In this video, MLflow Ambassador @iPandeyRahul builds a complete multi-agent school system from scratch using LangGraph + MLflow, adding one feature at a time. 🎥 Dive in: youtube.com/watch?v=mVbQXx… #opensource #mlflow #langgraph #llmops

Rahul Pandey@iPandeyRahul

Is your AI Agent a Black Box? Here is the fix @MLflow youtube.com/watch?v=mVbQXx…

663

MLflow@MLflow·10 Mar

🚀 New LLM-as-judges Integration: TruLens Trace Evaluation in MLflow We are excited to announce the TruLens integration for MLflow! This expands our third-party scorer framework, which already supports DeepEval, RAGAS, and Phoenix—an ecosystem with 32M+ monthly PyPI downloads. Developed by the TruLens team at @Snowflake, this integration adds 10 scorers to MLflow that analyze the full span tree: 🔹 𝗚𝗼𝗮𝗹-𝗣𝗹𝗮𝗻 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁: Evaluates strategy and tool selection. 🔹 𝗣𝗹𝗮𝗻-𝗔𝗰𝘁𝗶𝗼𝗻 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁: Checks for plan adherence and valid tool calling. 🔹 𝗛𝗼𝗹𝗶𝘀𝘁𝗶𝗰 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁: Grades logical consistency and execution efficiency. 🔗 Full technical details: mlflow.org/blog/mlflow-tr… #MLflow #TruLens #MLOps #LLM #GenAI #LLMOps #AIObserveability

1.2K

MLflow@MLflow·10 Mar

MLflow Skills transforms your coding agent into a high-velocity #LLMOps engine. It doesn't just write syntax—it traces, scores, and verifies your agent in a tight loop to catch regressions automatically. Stop shipping just code. Start shipping a robust library of traces and automated safeguards that make every future iteration faster. ⚡️ #MLflow #GenAI #AgenticAI #OpenSource

The Linux Foundation@linuxfoundation

What if your coding agent could fix itself? A new package teaches agents like #ClaudeCode, #Codex, and #GeminiCLI to trace, analyze, score, and improve LLM outputs using @MLflow evaluation infrastructure — with no manual evaluation code required. mlflow.org/blog/self-impr…

939

MLflow@MLflow·7 Mar

Ready to move from basic prompting to agent engineering? 🚀 In this video, Jules Damji (@databricks) breaks down the architectural pillars of the MLflow GenAI platform. What you’ll learn: 🔹 𝗧𝗵𝗲 𝟰 𝗣𝗶𝗹𝗹𝗮𝗿𝘀: Deep dive into Tracing, Evaluation, Prompt Registry, and AI Gateway. 🔹 𝗖𝗹𝗲𝗮𝗻 𝗦𝗲𝘁𝘂𝗽: Environment isolation using uv and secure credential management. 🔹 𝗧𝗵𝗲 𝗙𝗶𝗿𝘀𝘁 𝗥𝘂𝗻: Initializing a local tracking server and logging your first experiment. Stop guessing and start measuring. Set your foundation for observability and evaluation from day one. 🎥 Watch the full tutorial: youtu.be/IzUDKJlDo7Q 🔗 Tutorial 1.1: github.com/dmatrix/mlflow… #MLflow #AIAgents #LLMOps #GenAI #Python #AIObservability

3.4K

MLflow@MLflow·6 Mar

Most RAG systems fail because teams rely on "vibe checks" instead of metrics. If your retriever pulls the wrong context, your agent will confidently hallucinate. 📉 Whether you use @pinecone, @databricks Vector Search, or pgvector, tuning knobs like chunk size and rerankers are too critical to leave to "vibe checks." Our latest guide breaks down a systematic 𝗥𝗔𝗚 𝗯𝗲𝗻𝗰𝗵𝗺𝗮𝗿𝗸𝗶𝗻𝗴 𝘄𝗼𝗿𝗸𝗳𝗹𝗼𝘄 using MLflow: 🔹 𝗜𝘀𝗼𝗹𝗮𝘁𝗲 𝗩𝗮𝗿𝗶𝗮𝗯𝗹𝗲𝘀: Tune one knob at a time (Hybrid search, embeddings, etc.). 🔹 𝗦𝘁𝗮𝗻𝗱𝗮𝗿𝗱𝗶𝘇𝗲 𝗠𝗲𝘁𝗿𝗶𝗰𝘀: Move beyond manual queries to precision@k, recall@k, and nDCG@k. 🔹 𝗖𝗲𝗻𝘁𝗿𝗮𝗹𝗶𝘇𝗲 𝗥𝗲𝘀𝘂𝗹𝘁𝘀: Use the MLflow Experiment UI to compare configurations side-by-side. Stop flying blind. Let the metrics decide your architecture. 🚀 Benchmark your way to better RAG ➡️ mlflow.org/blog/tune-and-… #LLMOps #AIEngineering #MLflow #AIObservability #RAG

494
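
The three retrieval metrics named in the post above can be computed in a few lines. The sketch below uses their standard binary-relevance definitions in plain Python; it is not taken from the linked guide, the document IDs are made up, and it assumes the retrieved list contains no duplicates.

```python
import math


def precision_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / k


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)


def ndcg_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance nDCG: rewards placing relevant documents near the top."""
    dcg = sum(1 / math.log2(i + 2)
              for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    # Ideal ranking puts every relevant document first, up to k positions.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0


# Made-up example: the retriever returned four documents, three are truly relevant.
retrieved = ["d1", "d4", "d2", "d9"]
relevant = {"d1", "d2", "d3"}

print("precision@3:", round(precision_at_k(retrieved, relevant, 3), 3))
print("recall@3:   ", round(recall_at_k(retrieved, relevant, 3), 3))
print("nDCG@3:     ", round(ndcg_at_k(retrieved, relevant, 3), 3))
```

Precision and recall ignore ordering within the top k, while nDCG penalizes relevant documents that are ranked lower, so the three together give a fuller picture when comparing chunk sizes or rerankers.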

MLflow@MLflow·5 Mar

LY Corporation successfully integrated MLflow as a core pillar of their internal AI platform using @kubernetesio. 🙌 Their blueprint for high-traffic, high-security environments: 🔐 𝗡𝗼𝗻-𝗶𝗻𝘃𝗮𝘀𝗶𝘃𝗲 𝗮𝘂𝘁𝗵: Sidecar proxies keep the OSS core clean for easy upgrades 📡 𝗠𝗮𝗰𝗵𝗶𝗻𝗲-𝘁𝗼-𝗺𝗮𝗰𝗵𝗶𝗻𝗲 𝗮𝘂𝘁𝗵𝗼𝗿𝗶𝘇𝗮𝘁𝗶𝗼𝗻 & 𝗮𝘂𝘁𝗵𝗲𝗻𝘁𝗶𝗰𝗮𝘁𝗶𝗼𝗻: mTLS (SPIFFE) gives training pods transparent identity ⚖️ 𝗦𝗰𝗮𝗹𝗮𝗯𝗹𝗲 𝗶𝘀𝗼𝗹𝗮𝘁𝗶𝗼𝗻: Dedicated instances per service ensure RBAC without "shadow IT" As ML moves from "toy projects" to business-critical infra, LY Corp proves a "Golden Path" makes high-security MLOps sustainable with MLflow. 🚀 🔗 𝗥𝗲𝗮𝗱 𝘁𝗵𝗲 𝗳𝘂𝗹𝗹 𝘁𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗱𝗲𝗲𝗽 𝗱𝗶𝘃𝗲: mlflow.org/blog/ly-corpor… #MLflow #GenAI #LLMOps #AIAgents #Kubernetes

590

MLflow@MLflow·4 Mar

⭐ Star us on GitHub: github.com/mlflow/mlflow

184

MLflow@MLflow·4 Mar

Stop paying the "Integration Tax" on your GenAI stack. 🏗️ 𝗠𝗟𝗳𝗹𝗼𝘄 𝗔𝗜 𝗚𝗮𝘁𝗲𝘄𝗮𝘆 is now built directly into the Tracking Server. No more "glue code" between your gateway, tracing, and eval tools. One unified platform for: 🔹 𝗦𝗶𝗻𝗴𝗹𝗲 𝗢𝗽𝗲𝗻𝗔𝗜-𝗰𝗼𝗺𝗽𝗮𝘁𝗶𝗯𝗹𝗲 𝗲𝗻𝗱𝗽𝗼𝗶𝗻𝘁 for every provider (OpenAI, Anthropic, Gemini, Bedrock, Azure, Cohere, and more) 🔹 Every request 𝗮𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗰𝗮𝗹𝗹𝘆 𝗯𝗲𝗰𝗼𝗺𝗲𝘀 𝗮𝗻 𝗠𝗟𝗳𝗹𝗼𝘄 𝘁𝗿𝗮𝗰𝗲 (no extra SDK needed) 🔹 𝗧𝗿𝗮𝗳𝗳𝗶𝗰 𝘀𝗽𝗹𝗶𝘁𝘁𝗶𝗻𝗴 for A/B testing and fallback chains for reliability 🔹 𝗨𝘀𝗮𝗴𝗲 𝗱𝗮𝘀𝗵𝗯𝗼𝗮𝗿𝗱 with request volume, latency percentiles, token consumption, and cost breakdown 🔹 𝗖𝗿𝗲𝗱𝗲𝗻𝘁𝗶𝗮𝗹𝘀 𝘀𝘁𝗼𝗿𝗲𝗱 𝗲𝗻𝗰𝗿𝘆𝗽𝘁𝗲𝗱 on the server, never exposed to clients ⚡ Get started: pip install 'mlflow[genai]', then run mlflow server 🔗 Full breakdown: mlflow.org/blog/mlflow-ai… #MLflow #LLMOps #GenAI #OpenSource

804
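
Traffic splitting and fallback chains, two of the gateway features listed above, are general routing patterns. The sketch below shows the idea in plain Python. It is not the MLflow AI Gateway's implementation or configuration format: `pick_route`, `call_with_fallback`, and the fake providers are hypothetical names used only for illustration.

```python
import random
from typing import Callable


def pick_route(weights: dict[str, float], rng: random.Random) -> str:
    """Weighted random choice over route names, as used for A/B traffic splitting."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def call_with_fallback(providers: list[tuple[str, Callable[[str], str]]],
                       prompt: str) -> tuple[str, str]:
    """Try each provider in order and return (name, response) from the first success."""
    errors = []
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except Exception as exc:  # a real gateway would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Fake providers standing in for real model endpoints.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def backup_provider(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    rng = random.Random(0)
    split = {"model_a": 0.9, "model_b": 0.1}  # send roughly 10% of traffic to model_b
    counts = {"model_a": 0, "model_b": 0}
    for _ in range(1000):
        counts[pick_route(split, rng)] += 1
    print("traffic split:", counts)
    print(call_with_fallback([("primary", flaky_provider),
                              ("backup", backup_provider)], "hi"))
```

Doing this routing in one place is what keeps client code unchanged when a provider is swapped or an experiment's split is adjusted.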

MLflow@MLflow·4 Mar

Back in January, the MLflow team sat down with @mlopscommunity to discuss why MLflow is being rebuilt for the "AI Engineer" era. As more teams move toward autonomous agents, this conversation is more relevant than ever. The highlights: 🔹 𝗧𝗵𝗲 𝗚𝗲𝗻𝗔𝗜 𝗣𝗶𝘃𝗼𝘁: Why MLflow is being rebuilt for agents and real production systems. 🔹 𝗧𝗵𝗲 𝗠𝗲𝘀𝘀𝘆 𝗥𝗲𝗮𝗹𝗶𝘁𝘆: Tackling evals, risky memory management, and governance that actually works. 🔹 𝗧𝗵𝗲 𝗙𝘂𝘁𝘂𝗿𝗲: Why MLflow remains the leading open-source standard for the next generation of AI. Don't build the next generation of AI on a legacy stack. 📺 Watch: home.mlops.community/public/videos/… 🎧 Listen: open.spotify.com/episode/1UOyLj… #MLflow #GenAI #LLMOps #AgenticAI

360

MLflow@MLflow·3 Mar

If coding agents can double your productivity for writing software, why aren't they doing the same for 𝗔𝗴𝗲𝗻𝘁 𝗗𝗲𝘃𝗲𝗹𝗼𝗽𝗺𝗲𝗻𝘁? 🤖 The bottleneck is context. Until now, coding assistants didn't natively know how to evaluate your RAG pipeline, follow a complex multi-step agentic workflow, or debug a tool-call failure. MLflow Skills + Coding Agents = The ultimate development loop: 📡 𝗧𝗿𝗮𝗰𝗲: Auto-log every call without manual instrumentation. 🔍 𝗔𝗻𝗮𝗹𝘆𝘇𝗲: Let the agent find the root cause of hallucinations, bottlenecks, and other issues. ⚖️ 𝗦𝗰𝗼𝗿𝗲: Automatically generate LLM-as-a-Judge evaluators. 🛠️ 𝗩𝗲𝗿𝗶𝗳𝘆: Fix code and prove it works with real data. 🔗 Ship higher quality agents, faster: mlflow.org/blog/self-impr… #MLflow #Agents #CodingAgents #LLM #AgentLoop

539

MLflow@MLflow·3 Mar

@ravidsinghbiz @2twitme Four more videos will be posted soon. They will be released weekly. 😃

Ravi D. Singh@ravidsinghbiz·3 Mar

@MLflow @2twitme Good to know. However, before watching the “Mastering GenAI Development with MLflow” series, I want to complete the other YouTube series, “Getting Started with MLflow”. It currently has 3 videos. Can you tell me how many videos are left in that series?

MLflow@MLflow·26 Feb

🚀 New Series: Mastering the AI Agent Lifecycle with MLflow Building a prototype is easy. Moving a tool-calling AI agent into production is a different story. In this new series, Jules Damji walks through the end-to-end lifecycle of AI agents using MLflow. Whether you're starting from scratch or optimizing an existing pipeline, this roadmap takes you from initial environment setup to tracking and tracing, observability and evaluation, and a final RAG project. Move beyond the prompt. Master the lifecycle. 🎥 Start the series here: youtu.be/2XAa6zuyU6w?si… #MLflow #AIAgents #LLMOps #GenAI #AIObserveability

664

Discover

@2twitme @iPandeyRahul @Snowflake @databricks @pinecone @kubernetesio @mlopscommunity @ravidsinghbiz