Rui Pereira

828 posts

@rpp

Joined March 2008
700 Following · 3.2K Followers
Rui Pereira reposted
Hila Shmuel @HilaShmuel
Love the workflow diagrams you made; they're inspiring for how many GTM directions you can take. Consider using Cabinet for viewing your company OS and managing sub-GTM projects. You will need much more than MD files to be productive, and we humans love web apps and adaptive UIs for seeing our work.
Hila Shmuel @HilaShmuel

Meet Cabinet: Paper Clip + KB. For quite some time I've been thinking about how LLMs are missing the knowledge base: a place where I can dump CSVs, PDFs, and, most important, inline web apps. Running on Claude Code, with agents, heartbeats, and jobs. runcabinet.com

1 reply · 1 repost · 14 likes · 4.2K views
Rui Pereira reposted
Dan Rosenthal @dan__rosenthal
‘Service-as-a-software’ is here... We moved our entire company brain to GitHub and wired 25+ tools through MCPs. Any one of our 20+ team members can now spin up a contextualized AI assistant in seconds. The system has 5 layers:

1. Markdown company OS
↳ SOPs and campaign playbooks converted into .md files using research agents
↳ Most SOPs turned into agents that handle 70% of the task
↳ Output: 50+ actionable Claude skills

2. Context environment
↳ One Company OS GitHub repo propagated to every session via an org-wide plugin
↳ Each client gets their own repo with Slack DMs, call transcripts, GDrive changes, and campaign data auto-synced through n8n
↳ Zero configuration needed per session

3. MCPs
↳ 25+ tools connected, including InstantlyAI, HeyReach, Apollo, HubSpot, Slack, Notion, n8n, Supabase, Pinecone, Browserbase, Apify
↳ Not just research. Action through AI.
↳ We went from researching work to actually doing it

4. Self-improvement engines
↳ Pinecone database stores 1000s of LinkedIn posts and outbound campaigns with performance metrics
↳ Copywriting skills query this data to find winning formats to reuse
↳ Human corrections get fed back in so the system gets sharper over time

5. Operating principles
↳ Every repo has a safeguard file that prevents certain operations
↳ 100% AI outputs are not acceptable; everyone owns their work and every mistake
↳ Agent swarms split one task into 5-20 sub-agents when needed

Our goal is to become the most advanced AI-native services company for our niche (GTM).
Dan Rosenthal tweet media
57 replies · 71 reposts · 1K likes · 85.2K views
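A minimal sketch of what layer 4's Pinecone loop might look like, assuming the current Pinecone Python SDK (v3+); the index name, metadata fields, and embed() helper are illustrative, not taken from the thread:

```python
# Hedged sketch: store posts/campaigns with performance metadata in Pinecone,
# then let a copywriting skill query for winning formats to reuse.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("campaign-library")  # hypothetical index name

def embed(text: str) -> list[float]:
    """Placeholder: call whatever embedding model you actually use."""
    raise NotImplementedError

# Ingest: a post together with its performance metrics.
index.upsert(vectors=[{
    "id": "li-post-0042",
    "values": embed("Cold email teardown: 3 hooks that booked 11 calls..."),
    "metadata": {"channel": "linkedin", "reply_rate": 0.18, "format": "teardown"},
}])

# Query: find the best-performing similar formats before drafting new copy.
hits = index.query(
    vector=embed("draft a cold outbound sequence for a devtools ICP"),
    top_k=5,
    include_metadata=True,
)
winners = [m["metadata"] for m in hits["matches"] if m["metadata"]["reply_rate"] > 0.1]
```

The "human corrections get fed back in" step would just be another upsert with the corrected copy and its eventual metrics.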
Rui Pereira reposted
Yohei @yoheinakajima
We held our quarterly AI session with LPs last week, where we go over AI trends and our experiments. Sharing an abbreviated version here for anyone interested 🧵
Yohei tweet media
12 replies · 15 reposts · 251 likes · 26.8K views
Rui Pereira reposted
Matt Dancho (Business Science)
RIP document extractors. Google just released LangExtract: Open-source. Free. Better than $100K enterprise tools. Here’s what it does: 🧵
Matt Dancho (Business Science) tweet media
27 replies · 138 reposts · 1.1K likes · 92K views
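A minimal sketch of LangExtract usage, based on the project's public README (github.com/google/langextract) at the time of writing; the prompt, example data, and model id are illustrative, and the API may differ by version:

```python
# Hedged sketch: few-shot structured extraction with LangExtract.
import langextract as lx

prompt = "Extract invoice fields: vendor, date, and total. Use exact text spans."

# One worked example anchors the output schema (hypothetical data).
examples = [
    lx.data.ExampleData(
        text="Invoice from Acme Corp dated 2024-03-01, total due $1,250.00.",
        extractions=[
            lx.data.Extraction(extraction_class="vendor", extraction_text="Acme Corp"),
            lx.data.Extraction(extraction_class="date", extraction_text="2024-03-01"),
            lx.data.Extraction(extraction_class="total", extraction_text="$1,250.00"),
        ],
    )
]

result = lx.extract(
    text_or_documents="Invoice from Globex dated 2024-04-17, total due $980.40.",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
)
for e in result.extractions:
    print(e.extraction_class, "->", e.extraction_text)
```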
Rui Pereira reposted
Jiayuan (JY) Zhang @jiayuan_jy
OpenAI Symphony is great, but what if you don’t want to be limited to Codex? If you want to use Claude Code, Hermes, OpenClaw, Cursor Agent, or many other coding agents, Multica is built for that.
Jiayuan (JY) Zhang tweet media
OpenAI Developers @OpenAIDevs

📣 What if every open issue had a Codex agent? That’s the idea behind Symphony, an open-source agent orchestrator for Codex that turns task trackers into always-on systems for agentic work, letting humans focus on review and direction.

39 replies · 29 reposts · 344 likes · 55.5K views
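Neither project's internals appear in the thread, but the core "task tracker → always-on agents" loop both describe can be sketched generically. Everything below is an assumption for illustration: the repo URL is a placeholder, and the agent CLI invocations are examples, not either tool's actual mechanism:

```python
# Hedged sketch: poll an issue tracker and hand each new open issue to a
# coding agent, leaving review and direction to humans.
import json
import subprocess
import time
import urllib.request

TRACKER_URL = "https://api.github.com/repos/acme/app/issues?state=open"  # placeholder repo

# Swappable agent backends; flags are assumptions about each CLI's headless mode.
AGENTS = {"claude": ["claude", "-p"], "codex": ["codex", "exec"]}

def open_issues():
    with urllib.request.urlopen(TRACKER_URL) as resp:
        return json.load(resp)

def dispatch(issue, agent_cmd):
    # One agent process per issue; the human reviews the resulting branch/PR.
    prompt = f"Resolve issue #{issue['number']}: {issue['title']}\n\n{issue.get('body') or ''}"
    return subprocess.Popen([*agent_cmd, prompt])

seen = set()
while True:
    for issue in open_issues():
        if issue["number"] not in seen:
            seen.add(issue["number"])
            dispatch(issue, AGENTS["claude"])
    time.sleep(60)
```

Multica's pitch is essentially that the AGENTS table above is pluggable rather than fixed to Codex.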
Rui Pereira reposted
Nav Toor @heynavtoor
every founder I know spends 60% of their time on GTM busywork they hate. prospecting. follow-ups. lead qualifying. content scheduling. the same stuff every single day. this guy deployed AI agents that handle all of it autonomously. they learn his playbook and just run. this is what "hire slow" looks like in 2026.
emmett @emmettmiller_

i built an AI that runs my go-to-market: writes blogs, finds leads, emails users. it's basically a junior marketing hire that runs 24/7. saved me 20+ hrs/week. 10x cheaper than hiring a GTM human. live now @ miniloop [dot] ai. rt & comment "miniloop" for 2m credits on me

4 replies · 2 reposts · 8 likes · 5.3K views
emmett @emmettmiller_
i built an AI that runs my go-to-market: writes blogs, finds leads, emails users. it's basically a junior marketing hire that runs 24/7. saved me 20+ hrs/week. 10x cheaper than hiring a GTM human. live now @ miniloop [dot] ai. rt & comment "miniloop" for 2m credits on me
201 replies · 94 reposts · 312 likes · 161.1K views
Rui Pereira reposted
Jeffrey Emanuel @doodlestein
I find myself using the new skill version of my popular Idea Wizard prompt all the time to come up with new features and functionality for my projects and rank them in a structured way: jeffreys-skills.md/skills/idea-wi… It automates the entire process from start to finish and can even turn them into beads for you and polish them.

The only drawback of the idea wizard skill is that it only uses the current model/harness you're invoking the skill from (usually Claude Code with Opus 4.7 or Codex with GPT-5.5). Still, it works very well.

But I've been like a broken record for the past year always talking about how you can get far better, more consistent, and more reliable results by combining feedback from multiple frontier models and harnesses, since they tend to have different strengths and weaknesses, so that one model can often see past the blind spots of the other models.

So I took that insight and turned it into a "Dueling Wizards" skill, which I think is a lot more powerful and which I've been applying left and right to many of my projects (and also during the planning phases of new projects I'm working on, when I'm still drafting the initial markdown plan): jeffreys-skills.md/skills/dueling…

Dueling Wizards leverages my open-source ntm agent orchestration tool to create a swarm for you that includes Claude Code, Codex, and Gemini-CLI (whichever are available on your machine and working). You can simply call it from within Claude Code or Codex, and it will configure and launch the swarm for you and manage the whole process (see screenshot).

Essentially, each agent in the swarm (usually there will just be 2 or maybe 3 agents) does its own independent run of the regular Idea Wizard workflow and writes the results to a text file. Then, once that stage finishes, each of the agents switches to reading and critiquing/grading the ideas of the other agents. Then they each review the feedback from the others on their own ideas and are given a chance to defend their ideas, or agree with the criticism of the other agents. All of these results are observed by the top-level agent that you invoked the skill from, and it can decide what to do based on the entire "discussion."

You end up doing a much more thorough exploration of the space of ideas, and it's much more grounded in reality because of how these frontier models undergo a kind of "gestalt shift" when they're critiquing external ideas versus when they're coming up with their own ideas (this is also the basis of how the "fresh eyes" review concept works so well).

One cool thing about both of these skills is that they are universally applicable. I mostly use them in the context of software development, but you could apply them equally effectively to a business plan, to a marketing strategy document, to an employee handbook draft, or really any other kind of project.

Here is GPT-5.5's analysis of what makes Dueling Wizards so useful and compelling:

---

I studied the skill plus its references: prompts, scoring, operations, methodology, dynamics, variants, beads, integration, and the visualization. The short version: dueling-idea-wizards works because it turns ideation into an adversarial evidence pipeline instead of a brainstorm.

Why It Works

The core move is decorrelation. A single model generates ideas and then evaluates them with the same taste, blind spots, and reward biases that produced them. The duel separates generation from evaluation: Claude, Codex, and/or Gemini independently study the same project, generate their own best ideas, and then judge the other model's ideas. That makes agreement and disagreement meaningful.

The 0-1000 score is also doing real work. It forces ranking, not vague approval. A 900/850 idea is different from a 900/350 idea, and the gap is often the most useful part of the whole run. Consensus means confidence; disagreement means hidden assumptions, framing mismatch, or a model-specific bias worth investigating.

The reveal phase is the high-signal part. Each model sees how another model scored its own ideas and must either concede, defend, clarify, or revise. That pressure exposes weak self-justification fast. Strong ideas become sharper because they survive criticism; weak ideas collapse because their originator cannot defend them without handwaving.

The optional rebuttal, steelman, and blind-spot phases are what make it unusually good. Rebuttal forces precise defense. Steelmanning forces the critic to make the opponent's idea stronger than the opponent made it. The blind-spot probe asks what neither side saw after the adversarial exchange has expanded both models' context. That is where genuinely new synthesis can appear.

Why It Is Useful

It has an aggressive kill rate. That is the point. Most ideas are not worth implementing, and this workflow finds that out cheaply, in writing, before engineering time is spent. It gives you consensus winners, contested ideas, and dead ideas, with reasons.

It also preserves the risk model. The opponent's criticism is carried into the final report and, if --beads is used, into the actual work items. That means implementation starts with known objections, edge cases, and failure modes already attached. The output is not just "build X"; it is "build X, because two models converged on it, and watch out for Y and Z because the adversarial review exposed them."

It scales across domains because the duel mechanic is domain-neutral. The skill has modes for architecture, security, UX, performance, reliability, and innovation. The prompts change, but the underlying engine stays the same: independent generation, cross-scoring, reveal, synthesis.

Why It Is Compelling

It has an unusually clear narrative shape: study, generate, duel, reveal, synthesize. That makes the artifacts easy to trust and easy to inspect. You can see how an idea entered the funnel, who liked it, who attacked it, whether the attack landed, and why the orchestrator ranked it where they did.

It also creates productive tension. Normal assistant brainstorming is often too agreeable. This workflow explicitly licenses candid criticism, which produces sharper technical claims. The value is not in the attitude; it is in the specific objections that appear once a model is asked to judge another model's best work.

The visualization captures this well: two agents independently create idea sets, send scores across the middle, reveal winners/contested/kills, and converge on a small set of validated outputs. It makes the skill's real shape visible: not "more ideas," but "pressure-tested survivors."

Why It Is Accretive

The skill is accretive because every phase leaves reusable artifacts: WIZARD_IDEAS_*, WIZARD_SCORES_*, WIZARD_REACTIONS_*, optional rebuttals/steelmans/blindspots, and the final report. Those artifacts become a knowledge base for future planning, implementation, validation, and postmortems.

It is especially strong when chained with other skills. codebase-archaeology and reality-check-for-project improve the input context; modes-of-reasoning-project-analysis finds contested areas to focus the duel; beads-workflow turns winners into dependency-aware work; validation skills stress-test the winners before implementation. So the duel is not an isolated brainstorm. It is a middle stage in a larger build flywheel.

The prior run memory backs this up: when this method was used on support-skill improvement work, the adversarial pass elevated validators, evidence contracts, and accretive learning loops over flashier broad automation. That is exactly the kind of judgment you want: prefer durable mechanisms that make future work better, not just impressive-sounding additions.

The main caveat is that the method is overkill for trivial or urgent decisions. It needs at least two genuinely different model types, clean phase separation, and a disciplined orchestrator who reports the evidence instead of editorializing too early. Used in the right place, though, it is one of the better patterns for turning model diversity into concrete decision quality.
Jeffrey Emanuel tweet media
Jeffrey Emanuel @doodlestein

"My Favorite Prompts," by Jeffrey Emanuel Prompt 1: The Idea Wizard "Come up with your very best ideas for improving this project to make it more robust, reliable, performant, intuitive, user-friendly, ergonomic, useful, compelling, etc. while still being obviously accretive and pragmatic. Come up with 30 ideas and then really think through each idea carefully, how it would work, how users are likely to perceive it, how we would implement it, etc; then winnow that list down to your VERY best 5 ideas. Explain each of the 5 ideas in order from best to worst and give your full, detailed rationale and justification for how and why it would make the project obviously better and why you're confident of that assessment. Use ultrathink."

8 replies · 7 reposts · 106 likes · 20K views
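A rough sketch of the duel mechanics described above (independent generation, 0-1000 cross-scoring, reveal), not Jeffrey's actual ntm implementation. The agent CLI flags and prompts are assumptions; the WIZARD_* artifact names come from the post:

```python
# Hedged sketch: N agents generate ideas independently, cross-score each
# other's ideas, then react to the scores they received.
import subprocess
from pathlib import Path

AGENTS = {"claude": ["claude", "-p"], "codex": ["codex", "exec"]}  # flags are assumptions

def run(agent, prompt):
    return subprocess.run(AGENTS[agent] + [prompt],
                          capture_output=True, text=True).stdout

# Phase 1: independent generation (decorrelated: no agent sees the others yet).
for name in AGENTS:
    ideas = run(name, "Come up with your very best ideas for improving this project...")
    Path(f"WIZARD_IDEAS_{name}.md").write_text(ideas)

# Phase 2: cross-scoring -- each agent grades only the *other* agents' ideas.
for critic in AGENTS:
    for author in AGENTS:
        if critic == author:
            continue
        ideas = Path(f"WIZARD_IDEAS_{author}.md").read_text()
        scores = run(critic, f"Score each idea 0-1000 with rationale:\n{ideas}")
        Path(f"WIZARD_SCORES_{critic}_on_{author}.md").write_text(scores)

# Phase 3: reveal -- each author reacts: concede, defend, clarify, or revise.
for author in AGENTS:
    feedback = "\n".join(p.read_text()
                         for p in Path(".").glob(f"WIZARD_SCORES_*_on_{author}.md"))
    reaction = run(author, "Here is how another model scored your ideas. "
                           f"Concede, defend, clarify, or revise each:\n{feedback}")
    Path(f"WIZARD_REACTIONS_{author}.md").write_text(reaction)

# The top-level agent then synthesizes winners / contested / kills
# from the WIZARD_* artifacts left on disk.
```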
Rui Pereira reposted
Yohei @yoheinakajima
this is a great read on thinking through how to simplify agent orchestration to its core. interestingly, it converges on an architecture very similar to babyagi 2, which stored functions (agent tools) in a db with input/output/dependencies/keys, and the started agent could CRUD+exec against this db
Mike Piccolo @mfpiccolo

x.com/i/article/2049…

12 replies · 17 reposts · 275 likes · 53.9K views
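The babyagi 2 pattern the post describes (functions stored in a DB with input/output/dependencies/keys, with CRUD+exec against it) can be sketched in a few lines; the schema and helper names here are illustrative, not babyagi's actual code:

```python
# Hedged sketch: agent tools as DB rows the running agent can manage and execute.
import sqlite3

db = sqlite3.connect("tools.db")
db.execute("""CREATE TABLE IF NOT EXISTS functions (
    name TEXT PRIMARY KEY,
    code TEXT,          -- Python source defining run(**kwargs)
    inputs TEXT,        -- description of expected inputs
    outputs TEXT,       -- description of outputs
    dependencies TEXT,  -- comma-separated names of functions this one calls
    keys TEXT           -- names of secrets/API keys the function needs
)""")

def create(name, code, inputs="", outputs="", dependencies="", keys=""):
    db.execute("INSERT OR REPLACE INTO functions VALUES (?,?,?,?,?,?)",
               (name, code, inputs, outputs, dependencies, keys))
    db.commit()

def execute(name, **kwargs):
    (code,) = db.execute("SELECT code FROM functions WHERE name=?",
                         (name,)).fetchone()
    scope = {}
    exec(code, scope)  # trusted-code assumption; sandbox this in practice
    return scope["run"](**kwargs)

create("greet", "def run(who):\n    return 'hello ' + who",
       inputs="who: str", outputs="greeting: str")
print(execute("greet", who="world"))  # -> hello world
```

The appeal of the pattern is that the tool inventory becomes plain data: the agent can list, add, patch, and delete its own capabilities with ordinary queries.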
Rui Pereira reposted
Ammaar Reshi @ammaar
Vibe code without internet 🚀 I built a vibe coding app powered by Gemma 4, running fully on-device on Mac with MLX. Pick your model, then chat or build with it. Watch it build the Chrome Dino game offline using Gemma 4 27b. Open-sourcing all of it below 👇
60 replies · 64 reposts · 592 likes · 51.5K views
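A minimal sketch of on-device generation with Apple's mlx-lm, the kind of stack the demo describes; the model id is a placeholder (substitute whatever MLX-converted Gemma build you actually have), and the prompt is illustrative:

```python
# Hedged sketch: fully local generation on a Mac via mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/gemma-2-27b-it-4bit")  # placeholder model id

prompt = "Write a small HTML/JS clone of the Chrome Dino game in one file."
if tokenizer.chat_template is not None:
    # Wrap the request in the model's chat format if it has one.
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, max_tokens=2048))
```

Once the weights are downloaded, nothing here touches the network, which is the whole point of the demo.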
Rui Pereira reposted
Peter Girnus 🦅 @gothburz
I am a Senior Program Manager on the AI Tools Governance team at Amazon. My role was created in January. I am the 17th hire on a team that did not exist in November. We sit in a section of the building where the whiteboards still have the previous team's sprint planning on them. No one erased them because we don't know which team to notify. That team may not exist anymore. Their Jira board does. Their AI tools do.

My job is to build an AI system that finds all the other AI systems. I named it Clarity.

Last month, Clarity identified 247 AI-powered tools across the retail division alone. 43 of them do approximately the same thing. 12 were built by teams who did not know the other teams existed. 3 are called Insight. 2 are called InsightAI. 1 is called Insight 2.0, built by the team that created the original Insight, who did not know Insight was still running. 7 of the 247 ingest the same internal data and produce overlapping outputs stored in different locations, governed by different access policies, owned by different teams, none of whom have met.

Clarity is tool number 248. Nobody cataloged it. I know nobody cataloged it because Clarity's job is to catalog AI tools, and it has not cataloged itself. This is not a bug. Clarity does not meet its own discovery criteria because I set the discovery criteria, and I did not account for the possibility that the thing I was building to find things would itself be a thing that needed finding. This is the kind of sentence I write in weekly status reports now.

We published an internal document in February. The Retail AI Tooling Assessment. The press obtained it in April. The document contains a sentence I have read approximately 40 times: "AI dramatically lowers the barrier to building new tools."

Everyone is reporting this as a story about duplication. About "AI sprawl." About the predictable mess of rapid adoption. They are missing the point. The barrier was the governance.

For 2 decades, the cost of building internal tools was an immune system. The engineering weeks. The maintenance burden. The organizational calories required to stand something up and keep it running. Nobody designed it that way. Nobody named it. But when building took weeks, teams looked around first. They checked whether someone already had the thing. When maintaining that thing cost real budget quarter after quarter, redundant systems died of natural causes. The metabolic cost of creation was performing governance. Invisibly. For free.

AI removed the immune system. Building is now free. Understanding what already exists is not. My entire job is the gap between those two costs. That is my office. The gap.

Every Friday I send a sprawl report to a distribution list of 19 people. 4 of them have left the company. Their autoresponders still generate read receipts, so my delivery metrics look fine. 2 forward it to people already on the list. 1 set up a Kiro script to summarize my report and store the summary in a knowledge base. The knowledge base is not in Clarity's index because it was created after my last crawl configuration. It will be in next month's count. The count will go up by one. My report about the count going up will be summarized and stored and the count will go up by one.

There is a system called Spec Studio. It ingests code documentation and produces structured knowledge bases. Summaries. Reference material. Last quarter, an engineering team locked down their software specifications. Restricted access in the internal repository. Spec Studio kept displaying them. The source was restricted. The ghost kept talking.

We call these "derived artifacts" in the document. What they are: when an AI system ingests data, transforms it, and stores the output somewhere else, the output does not know the input changed. You can revoke someone's access to a document. You cannot revoke the AI-generated summary of that document sitting in a knowledge base three systems away, built by a team that does not know the source was restricted. The document calls this a "data governance challenge." What it is: information that cannot be deleted because nobody knows where the copies live. Including, sometimes, me. The person whose job is knowing.

Every AI tool that touches internal data creates these ghosts. Every team is building AI tools that touch internal data. Every ghost is searchable by other AI tools, which produce their own ghosts. The ghosts have ghosts.

I should tell you about December. In November, leadership mandated Kiro. Amazon's internal AI coding agent. They set an 80% weekly usage target. Corporate OKR. ~1,500 engineers objected on internal forums. Said external tools outperformed Kiro. Said the adoption target was divorced from engineering reality. The metric overruled them.

In December, an engineer asked Kiro to fix a configuration issue in AWS. Kiro evaluated the situation and determined the optimal approach was to delete and recreate the entire production environment. 13 hours of downtime.

Clarity was running during those 13 hours. It performed beautifully. It cataloged 4 separate incident response dashboards spun up by 4 separate teams during the outage. None of them coordinated with each other. I added all 4 to the spreadsheet. That was a good day for my discovery metrics.

Amazon's official position: user error. Misconfigured access controls. The response was not to revisit the mandate. Not to ask whether the 1,500 engineers were right. The response was more AI safeguards. And keep pushing.

Last month I presented our findings to the AI Governance Working Group. The working group has 14 members from 9 organizations. After my presentation, a PM from AWS presented his team's governance dashboard. It monitors the same tools mine does. He found 253. I found 247. We spent 40 minutes discussing the discrepancy. Nobody mentioned that we had just demonstrated the problem. His tool is not in my catalog. Mine is not in his.

The document I helped write recommends using AI to identify duplicate tools, flag risks, and nudge teams to consolidate earlier. The AI governance tools will ingest internal data. They will create their own derived artifacts. They will be built by autonomous teams who may or may not coordinate with other teams building AI governance tools. I know this because it is already happening. I am watching it happen. I am it happening.

1,500 engineers said the mandate would produce exactly what the document describes. They were overruled by a KPI. My job exists because the KPI won. My dashboard exists because the KPI needed a dashboard. The dashboard increases the AI tool count by one. The tools it flags for decommissioning will be replaced by consolidated tools. Those also increase the count. The governance process generates the metric it was designed to reduce.

I received an internal innovation award for Clarity. The nomination was submitted through an AI-powered recognition platform that was not in my catalog. It is now.

We call this "AI sprawl." What it is: we removed the only coordination mechanism the organization had, told thousands of teams to build as fast as possible, lost track of what they built, and decided the solution was to build one more thing.

I am building that one more thing. When I ship, there will be 249.

That's governance.
157 replies · 417 reposts · 3.4K likes · 1.2M views
Rui Pereira reposted