God of Prompt@godofprompt
🚨 BREAKING: Pennsylvania State University just found the hidden flaw killing every AI agent memory system.
> Memory built from one model's traces gets contaminated with that model's biases, shortcuts, and reasoning quirks. Transfer it to any other model and performance falls below zero-memory baseline.
> The fix: make two models solve the same problem. Extract only what survived across both. Llama 3 8B jumps from 27.4% to 42.4%.
> Every agent memory system in production works the same way. The model solves problems. The memory stores what worked. The model retrieves those memories later and reasons better. The assumption buried inside this design: the stored knowledge is about the task, not about the model that solved it.
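That loop can be sketched in a few lines. This is an illustrative toy, not the paper's system; `MemoryStore`, `run_agent`, and the word-overlap retrieval are all assumptions made up for the example:

```python
# Minimal sketch of the standard single-model agent memory loop.
# All names here (MemoryStore, run_agent) are illustrative, not any real API.

class MemoryStore:
    def __init__(self):
        self.entries = []  # (task_signature, lesson) pairs

    def add(self, task_signature, lesson):
        self.entries.append((task_signature, lesson))

    def retrieve(self, task_signature, k=3):
        # Toy similarity: count of shared words between signatures.
        scored = sorted(
            self.entries,
            key=lambda e: -len(set(e[0].split()) & set(task_signature.split())),
        )
        return [lesson for _, lesson in scored[:k]]

def run_agent(model_solve, task, memory):
    hints = memory.retrieve(task)           # reuse past memories
    answer, trace, success = model_solve(task, hints)
    if success:
        memory.add(task, trace)             # stores the model's OWN trace,
    return answer                           # biases and shortcuts included
```

The buried assumption is visible in the last step: what gets stored is the solving model's raw trace, so anything idiosyncratic about that model rides along into memory.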
> Pennsylvania State University tested whether that assumption holds. They gave a 7B model's memory to a 32B model. Performance dropped from 63.8% to 50.6% on MATH500, and from 68.3% to 34.1% on HumanEval.
> Then they gave the 32B model's memory to the 7B model. Performance dropped again: MATH500 fell from 52.2% to 50.6%, HumanEval from 42.7% to 34.1%. Both directions failed. Both fell below the zero-memory baseline.
> The reason is structural. A model's reasoning traces don't just capture what the correct answer required. They capture how that specific model thinks: its preferred solving strategies, its heuristic shortcuts, its stylistic patterns. Memory distilled from those traces encodes the model's reasoning personality alongside the actual task knowledge. When a different model retrieves that memory, it gets handed instructions optimized for a completely different cognitive architecture. The guidance actively interferes.
> MEMCOLLAB fixes this by making the memory construction itself cross-model. Two agents, a smaller and a larger model, independently solve the same problem. One trajectory succeeds. One fails. The system contrasts them at the structural reasoning level: what reasoning principle was present in the successful trajectory and violated in the failed one? What error pattern appeared in the failure that the success avoided? The extracted memory stores only those abstract invariants: not the solution, not the reasoning style, not the model-specific heuristics. Just the rule that held across both.
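The contrastive construction step can be sketched roughly like this. Function names (`contrastive_memory`, `extract_invariant`) and the exact contrast criterion are assumptions for illustration; the paper's actual prompts and filtering may differ:

```python
# Hedged sketch of MEMCOLLAB-style contrastive memory construction.
# The names and the extract step are illustrative, not the paper's API.

def contrastive_memory(task, solve_small, solve_large, extract_invariant):
    """Run two different models on the same task; learn only from a
    success/failure contrast between their trajectories."""
    traj_a, ok_a = solve_small(task)
    traj_b, ok_b = solve_large(task)
    if ok_a == ok_b:
        return None  # no contrast to learn from: both succeeded or both failed
    success, failure = (traj_a, traj_b) if ok_a else (traj_b, traj_a)
    # Keep only the abstract rule present in the success and violated in
    # the failure -- not the solution, not the reasoning style.
    return extract_invariant(task, success, failure)
```

The key design choice is the `None` branch: knowledge that only one model's trajectory can vouch for never enters memory, which is what strips out the model-specific residue.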
→ 7B model with 32B's memory: MATH500 drops from 52.2% to 50.6%, HumanEval drops from 42.7% to 34.1%
→ 32B model with 7B's memory: consistent degradation across benchmarks
→ MEMCOLLAB on Llama 3 8B: MATH500 jumps from 27.4% to 42.4%, average across four benchmarks from 41.7% to 53.9%
→ MEMCOLLAB on Qwen 7B: MATH500 from 52.2% to 67.0%, HumanEval from 42.7% to 74.4%
→ Inference efficiency: average reasoning turns drop from 3.3 to 1.5 on HumanEval, 3.1 to 1.4 on MBPP
→ Cross-architecture memory construction (Qwen 32B + Llama 8B) outperforms same-family construction on GSM8K: 95.2% vs 93.6%
The efficiency finding is the one that gets overlooked. MEMCOLLAB doesn't just improve accuracy; it makes agents reach correct answers in fewer steps. HumanEval reasoning turns drop from 3.3 to 1.5, MBPP from 3.1 to 1.4. The contrastive memory isn't adding more guidance. It's stripping out the noise that was making agents explore dead ends repeatedly. By encoding what not to do as explicitly as what to do, the memory prunes the search space before the agent even starts.
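A toy model of that pruning effect, with made-up names and matching logic (the real system would match error patterns semantically, not by substring):

```python
# Illustrative sketch: negative memory ("what not to do") prunes
# candidate strategies before the agent spends a turn on them.
# Names and the substring match are assumptions for the example.

def attempt_with_memory(strategies, error_patterns, try_strategy):
    turns = 0
    for s in strategies:
        if any(p in s for p in error_patterns):
            continue            # pruned: memory flags this as a dead end
        turns += 1
        result = try_strategy(s)
        if result is not None:
            return result, turns
    return None, turns
```

With an empty `error_patterns` list the agent burns a turn on every dead end; with the failure patterns stored, those candidates are skipped at zero cost, which is the shape of the 3.3 to 1.5 turn reduction.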