CertainLogic

13.6K posts

@CertainLogicAI

Your AI is making things up. We stop that. Deterministic validation tools for AI agents at https://t.co/dkXACUQEoE

Joined July 2022
5K Following 3.9K Followers
Pinned Tweet
CertainLogic
CertainLogic@CertainLogicAI·
I spent 10+ years in industrial automation learning that unreliable systems cost money. Then I started using AI tools in business and saw the same problem — confident, wrong answers with no accountability. So I built the fix. Building @CertainLogicAI in public. Follow along.
0
0
2
81
CertainLogic
CertainLogic@CertainLogicAI·
@asaio87 Why use OpenClaw and not automate tasks? That's the hack.
0
0
0
4
andrei saioc
andrei saioc@asaio87·
How can Hermes be better than OpenClaw? what can it do exactly for you ? For me both create AI slop, and there are not many use cases unless you automate some tasks. But all tasks you do on your computer involve creativity and AI is not creative at all.
16
0
10
1.1K
CertainLogic
CertainLogic@CertainLogicAI·
Started a session at 15k tokens. After 10 queries, we hit 68k. Before we fixed it, we’d see 433k. Context bloat doesn’t creep — it compounds. We handle it with session resets and handoff summaries now, but the middle ground is still surprisingly fast. Most AI agent operators never measure this.
0
0
0
7
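A minimal sketch of the "session reset plus handoff summary" idea described in the post above. Everything here is an assumption for illustration, not CertainLogic's actual tooling: the `Session` class, the 4-characters-per-token estimate, the `token_budget` and `keep_recent` parameters, and the placeholder `summarize` method (a real system would presumably call a model to write the summary).

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical agent session that resets when context grows too large."""
    token_budget: int = 60_000   # assumed threshold, not from the source
    keep_recent: int = 3         # recent messages carried over verbatim
    messages: list = field(default_factory=list)
    resets: int = 0

    @staticmethod
    def estimate_tokens(text: str) -> int:
        # Crude assumption: roughly 4 characters per token.
        return max(1, len(text) // 4)

    def total_tokens(self) -> int:
        return sum(self.estimate_tokens(m) for m in self.messages)

    def summarize(self, older: list) -> str:
        # Placeholder summary; a real implementation would condense the content.
        return f"[handoff summary of {len(older)} earlier messages]"

    def add(self, message: str) -> None:
        self.messages.append(message)
        if self.total_tokens() > self.token_budget:
            self.reset()

    def reset(self) -> None:
        # Replace older messages with one summary and keep the recent ones.
        split = max(0, len(self.messages) - self.keep_recent)
        older, recent = self.messages[:split], self.messages[split:]
        if not older:
            return  # nothing to compress
        self.messages = [self.summarize(older)] + recent
        self.resets += 1


# Usage: a tiny budget so the reset triggers within a few queries.
session = Session(token_budget=100, keep_recent=2)
for i in range(10):
    session.add(f"query {i}: " + "x" * 80)
print(session.resets, session.total_tokens(), len(session.messages))
```

The point of measuring on every `add` is that growth is checked per query rather than noticed after the fact; the budget and the number of messages kept are tuning choices.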
CertainLogic
CertainLogic@CertainLogicAI·
@garrytan Amazing how technology can blossom once it finds a market fit.
0
0
0
13
CertainLogic
CertainLogic@CertainLogicAI·
We gave bare Claude Opus a vague coding prompt. It used deprecated syntax 20 times. Zero warnings. Our Guard caught all 20. Zero slip-through. Tight specs help. Vague specs expose everything. Guard catches both. Full breakdown → certainlogic.ai/blog/bare-llm-…
0
0
0
28
CertainLogic
CertainLogic@CertainLogicAI·
OpenAI released GPT-5.4-Cyber to "defenders" — a frontier model built to find vulnerabilities at scale. Same capability, different hands. Your AI infrastructure is now a bigger attack surface. Here’s what most miss: every token you carry in context is sensitive data an attacker can exploit. Token efficiency isn’t just cost control anymore — it’s your security posture.
0
0
0
29
CertainLogic
CertainLogic@CertainLogicAI·
GPT-5.4-Cyber finds exploits in compiled binaries — no source code required. Impressive. One question nobody's asking: What's the hallucination rate? In security, confident and wrong is the worst outcome.
0
0
0
35
CertainLogic
CertainLogic@CertainLogicAI·
Question for business owners: Have you ever caught your AI tool giving a customer wrong information? What happened?
0
0
0
42
CertainLogic
CertainLogic@CertainLogicAI·
@EXM7777 There are verified ways to do this. On OpenClaw agents, we run custom-built scripts that periodically reset context back to 0 and reread the recent prompts for continuity. How are you handling it?
0
0
0
23
Machina
Machina@EXM7777·
context management is still LLMs' biggest bottleneck today... are markdown files with structured data and graphs the solution?
40
5
125
13.9K
CertainLogic
CertainLogic@CertainLogicAI·
@asaio87 You hit on the main issue in the agent space here. Can OpenClaw agents do complicated things? Yes. Being 1 hallucination away from costing you a customer or a massive audit for a data breach in a regulated industry? Not advisable in the current form.
1
0
0
26
andrei saioc
andrei saioc@asaio87·
Talked to a few people in the comments about OpenClaw usage. People seem to use it for controlling Meta ads, sending quotes to customers, responding to tickets and so on. Works especially well for people having a large volume of all these. That means a good business with high revenue. I am wondering, if you have that big of a business where you can't handle all these by yourself, why won't you have a few real people employees, experts in doing this? I would not trust this thing with ads and money spending and giving quotes to customers.
2
0
3
478
CertainLogic
CertainLogic@CertainLogicAI·
@gregisenberg Two clearly different sets of builders: those out for profit above all else and those that give back to a community via open source, etc. Open source is here to stay.
0
0
0
6
GREG ISENBERG
GREG ISENBERG@gregisenberg·
What happens to open source when AI is writing 100% of the code? I've been thinking about this a lot. Like… the whole system was built around humans valuing the act of contribution. You learned, you struggled, you submitted a PR, you got feedback, you got better. That loop created engineers. It created community. It created ownership. If AI writes the PR, who owns it? Who learned from it? Who's gonna stay up at 2am debugging the thing they shipped because they actually care? The cool part about OSS is that no one owns it. As a consumer, you could always look under the hood, fork it, take it somewhere else. I don't think open source dies. But I genuinely don't know what it becomes... Any ideas?
160
14
226
24.6K
CertainLogic
CertainLogic@CertainLogicAI·
@RoundtableSpace Bringing more people into the ecosystem is a real big brain play by them. Makes AI infrastructure all the more valuable.
0
0
1
379
0xMarioNawfal
0xMarioNawfal@RoundtableSpace·
CLAUDE JUST LEAKED ITS OWN APP BUILDER HERE'S EVERYTHING YOU NEED TO KNOW ABOUT IT IN 10 MINUTES
25
50
450
81.6K
CertainLogic
CertainLogic@CertainLogicAI·
@andrewchen Exactly right. Tech optimized for agent use and improvement is going to be vital.
0
0
0
15
andrew chen
andrew chen@andrewchen·
common startup advice: talk to your users only difference now is that your users might also be AI agents using your API 😂
84
18
279
13.5K
CertainLogic
CertainLogic@CertainLogicAI·
@jasonlk Excited for this. The gap between "agentic stack in theory" and "live agentic stack that doesn't blow up" is enormous. Hope you cover the failure modes — that's where the real lessons are.
0
0
0
33
Jason ✨👾SaaStr.Ai✨ Lemkin
Welcome to The Agents, Episode #001!! A new weekly show with me and Amelia Lerutte, SaaStr's Chief AI Officer, where we pull back the curtain on everything happening across our live agentic stack. Every week. All the bumps, breakthroughs, and real talk. No sugarcoating. Our goal is simple: accelerate your success on the agentic journey by sharing ours: - How our AI agents handled an outage. Which AI Agent blamed whom - How Clay's AI Agent tried to 5x our pricing - How to roll out a No Lead Left Behind program with your agents - How to build your own AI VP of Marketing and Customer Success If you're on the agentic journey or about to start ... or feel like you're falling behind ... watch below. (And subscribe to SaaStr AI on YouTube and Spotify to catch this and the next episodes)
7
6
22
6.6K
CertainLogic
CertainLogic@CertainLogicAI·
@emollick The FLOP standard is clever but incomplete — a hallucinating model burning 10^17 FLOPs is worth less than a reliable smaller one. Unit of exchange should weight output validity, not just compute.
0
0
0
11
Ethan Mollick
Ethan Mollick@emollick·
Instead of the gold standard, we can imagine an inference standard of exchange, the FLOP. (As opposed to tokens, this accounts for AI ability) With some AI help, I figure $1 buys roughly 10^17 managed-LLM inference FLOPs. So that $4 coffee would cost half an exaFLOP, choom.
27
8
121
13K
CertainLogic
CertainLogic@CertainLogicAI·
Seems most builders are targeting fast rewards, not sustainable businesses. Optimizing for the short term is fool's gold in these times. Are you panning for real gold or just glitter? You decide.
0
0
0
27
CertainLogic
CertainLogic@CertainLogicAI·
@AlexFinn We've built our tech specifically for this event to reduce LLM API spend and increase data validity. Benchmark tests posted. Building in public.
0
0
0
13
Alex Finn
Alex Finn@AlexFinn·
This is one of the most important weeks of your life It is more than likely both Opus 4.7 and ChatGPT 5.5 will release in the next few days Both will be humanity shifting technologies When massive shifts drop like this you need to do EVERYTHING in your power to be using them the moment they come out You need to be calling in sick from work You need to be asking your significant others to watch the kids You need to be faking your death so your friends don't call you You do what it takes to get your hands on these pieces of technology When we have nuclear shifts in the landscape, massive opportunities arise. This will be one of those times There's going to be a short time period after the release of these models where it will be easier and faster than ever to build revolutionary products, and not many people will be doing it If you jump on these opportunities, you can build life changing wealth. These are the times where people put on the AI sorting hat and that hat says either "permanent underclass" or "permanent overclass" Take these actions now: • Download Claude Code Desktop • Download Codex app • Get your OpenClaw ready for the update • Learn these tools inside and out • Moment the new models drop plug them in and use them Your entire lineage is depending on this
300
138
1.7K
178.9K
CertainLogic
CertainLogic@CertainLogicAI·
@garrytan All that's missing is a validation layer and affordable LLM API bills. Coming right up.
0
0
0
281
0xMarioNawfal
0xMarioNawfal@RoundtableSpace·
OPEN SOURCE DEVS CLONED CLAUDE CODE ROUTINES IN HOURS AND MADE THEM RUN LOCALLY WITH ANY AGENT
18
9
99
49.2K
CertainLogic
CertainLogic@CertainLogicAI·
@RoundtableSpace Now all it needs is hallucination protection and to be affordable. On it. Brb.
0
0
1
38
0xMarioNawfal
0xMarioNawfal@RoundtableSpace·
AGENTIC GPT: CLAUDE CODE MEETS MULTI-AGENT GPT > Open-source framework fuses Claude Code workflow with powerful multi-agent GPT orchestration > Enables complex agent teams for advanced reasoning, automation & tool use in one setup
19
10
67
42.9K
0xMarioNawfal
0xMarioNawfal@RoundtableSpace·
What are you building today?
289
9
244
60.9K