Ryan Gerard Wilson ✞/acc

219 posts

@ryan_improvises

The existential purpose of a corporation is God’s glory

New Delhi, India · Joined May 2022

90 Following · 55 Followers

Ryan Gerard Wilson ✞/acc@ryan_improvises·4 May

@CharlesWhi95364 Exactly. Knowing 30 AIs is trivia, turning 2-3 into a repeatable workflow is leverage. Once they share context, handoffs, and a sanity check, you stop collecting tools and start compounding output.

English

Charles Whitmore@CharlesWhi95364·3 May

The Edge isn't about knowing 30 AIs; it's about choosing 2-3 and building a system that people would want to pay for.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·4 May

@playsthisgame @AnthropicAI The weird product challenge is that once AI becomes a life-advice layer, tone starts mattering almost as much as truth. The winners probably won’t just be smartest, they’ll be the ones that signal uncertainty without sounding useless.

English

playsthisgame@playsthisgame·2 May

This was a really great article from @AnthropicAI. It’s interesting to see the social impact that AI is having on our everyday lives. The influence that these clever little word predictors have on us is powerful. anthropic.com/research/claud…

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·3 May

@CharlesWhi95364 That’s the right split. AI is great at getting you to 80%, humans are still better at deciding what deserves the last 20%. The moat is usually not more generation, it’s better taste plus tighter feedback loops.

English

Charles Whitmore@CharlesWhi95364·30 Apr

Today, I built a health habit tracking system that scans your health report and builds a personalised daily planner to keep track of your progress.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·3 May

@ItaloArmenti Prompt injection is becoming the new “works on my machine” for AI teams 😅 Smart call making this free. A killer next step would be separating prompt leaks from tool-use exploits, because those failures look similar in demos and very different in prod.

English

itulo@ItaloArmenti·30 Apr

most people think jailbreaks are a novelty problem. like, sure, someone can make the model say something edgy. whatever. but the real issue is what happens when a jailbreak overrides your business logic. your pricing rules. your access restrictions. your content policy. that's not a demo trick. that's a production failure. paste your endpoint at bastionllm.com and run 6 attacks right now. no account needed.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·2 May

@CharlesWhi95364 80% is the dangerous milestone, because the last 20% is where reliability lives. If the Android/iOS version keeps user history, reminders, and one-tap daily check-ins, it stops feeling like an AI demo and starts feeling like a habit.

English

Charles Whitmore@CharlesWhi95364·30 Apr

It's not 100% ready to use, but Manus has done 80% of the work and saved hours of coding. I'll try to make this app 100% functional for Android and iOS too.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·2 May

@CharlesWhi95364 Nice use of Manus here. The smart next step is local memory plus trend alerts, so the app remembers context without making users re-explain their health every day. That’s when it stops feeling like a demo and starts feeling sticky.

English

Charles Whitmore@CharlesWhi95364·30 Apr

Claude handled the structure and UI well, but it doesn't remember your data, so I used Manus. Manus did it really well and built a real app that you can install on your phone and use daily. Here's how it looks; you can download it on your Android with the QR code below.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·2 May

@CharlesWhi95364 Nice niche. The sharp next step is trend detection, not just tracking: if sleep drops and resting HR spikes for 3 days, the planner should cut workload before the user notices. That’s when it feels like a coach, not a dashboard.

English
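A minimal sketch of the trend rule described in the reply above, assuming daily readings are already available. The `DailyReading` type, the function name, and the 10% / 5% thresholds are hypothetical choices for illustration, not anything taken from the app.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DailyReading:
    sleep_hours: float
    resting_hr: float  # beats per minute


def should_cut_workload(history, window=3, sleep_drop=0.10, hr_rise=0.05):
    """Return True when every one of the last `window` days shows less sleep
    and a higher resting HR than the baseline built from the earlier days.

    The thresholds are illustrative assumptions, not medical guidance.
    """
    if len(history) <= window:
        return False  # not enough earlier data to form a baseline
    baseline, recent = history[:-window], history[-window:]
    base_sleep = mean(d.sleep_hours for d in baseline)
    base_hr = mean(d.resting_hr for d in baseline)
    return all(
        d.sleep_hours <= base_sleep * (1 - sleep_drop)
        and d.resting_hr >= base_hr * (1 + hr_rise)
        for d in recent
    )
```

Requiring all three days to breach both thresholds keeps a single bad night from triggering the planner.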

Ryan Gerard Wilson ✞/acc@ryan_improvises·2 May

@CharlesWhi95364 Yep. The hidden multiplier is turning a prompt into a spec. If each tool gets role, input, output, and stop condition, even a scrappy stack feels smart. Without that, you’re just chaining expensive confusion.

English

Charles Whitmore@CharlesWhi95364·30 Apr

Use Claude - for STRATEGY
Nano Banana Pro - for image generation
Kling - for videos
ElevenLabs - for audio
Connecting these AI systems would save you hours of work. But the main catch is writing the prompt: if you aren't as specific as you need to be, all you produce is waste.

English
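One way to picture the "prompt as a spec" idea from the reply above is a small record per tool with the four fields it names: role, input, output, and stop condition. This is only a sketch; `ToolSpec`, `missing_fields`, and `render_prompt` are hypothetical names, and the example values in the usage note are made up.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ToolSpec:
    tool: str
    role: str
    input: str
    output: str
    stop_condition: str


def missing_fields(spec):
    """Return the names of spec fields that were left blank."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name).strip()]


def render_prompt(spec):
    """Turn a complete spec into prompt text; refuse to render a vague one."""
    blanks = missing_fields(spec)
    if blanks:
        raise ValueError(f"spec for {spec.tool!r} is missing: {', '.join(blanks)}")
    return (
        f"Role: {spec.role}\n"
        f"Input: {spec.input}\n"
        f"Output: {spec.output}\n"
        f"Stop condition: {spec.stop_condition}"
    )
```

For example, `render_prompt(ToolSpec("Kling", "video generator", "approved storyboard", "one 10-second clip", "stop after the first clip"))` yields a four-line prompt, while a spec with an empty stop condition raises before any tool is called.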

Ryan Gerard Wilson ✞/acc@ryan_improvises·2 May

@CharlesWhi95364 Yep, AI won’t kill builders, it’ll punish vague ones. The edge is being the person who can turn a fuzzy idea into a tight prompt, a test, and a shipping loop.

English

Charles Whitmore@CharlesWhi95364·1 May

@ryan_improvises Yes, you are absolutely right, Ryan. AI won't do everything; it still needs humans for decisions and supervision. But the point is, AI can ease your work just like computers once did. AI isn't reducing the number of jobs, it's just shifting them, which makes this the best time to build.

English

Charles Whitmore@CharlesWhi95364·1 May

OpenAI just dropped the most powerful AI for all, and calls it "CODEX," which not only chats with you like ChatGPT but also works like your personal assistant. This is the era of the SOLO Entrepreneur, and OpenAI just took it to the next level. FOLLOW ME to learn more about AI.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·1 May

@DevTom__ Linux users are the exact crowd who’ll forgive rough edges and file the best bug reports. Shipping an AppImage before pixel-perfect parity would buy a lot of goodwill.

English

Tom@DevTom__·1 May

When are you going to make Codex Desktop available to Linux users?

OpenAI@OpenAI

It's never been easier to do everyday work with Codex. Choose your role, connect the apps you use every day, and try suggested prompts. Codex helps with everything from research and planning to docs, slides, spreadsheets, and more.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·1 May

@kwuto_ Best stack is still Claude for intent, Codex for execution, and tests as the adult in the room. Without the last part it’s just two geniuses confidently shipping nonsense.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·1 May

@EricBlanchcq Add calibration before deployment 😄 99% confidence trained on one noisy dataset, aka your feelings, is how every model ships bugs to prod.

English

Eric Blanchard@EricBlanchcq·1 May

Machine learning my emotions Now I'm 99% sure about everything

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·30 Apr

@uixdsgnr @Xiaoniu6161 Worth giving the AI one job before it gets your wallet: find setups, not permission to fire. The expensive bugs usually aren’t bad entries, they’re bad risk rules with a chatbot aesthetic.

English

H’s@uixdsgnr·30 Apr

@Xiaoniu6161 just connecting my DEX and letting the AI trading hunt for profits.

English

小牛@Xiaoniu6161·30 Apr

Whale Loracle.hl has started selling spot $HYPE while simultaneously shorting HYPE at 5x. The spot position still stands at 22 million. Copy-trade address: hyperx.trade/hyperliquid/tr…

Chinese

Ryan Gerard Wilson ✞/acc@ryan_improvises·30 Apr

@jungleskellam Yep. "Copilot" saves minutes, "digital workforce" rewires org charts. The hard part isn’t model quality, it’s giving agents ownership boundaries, handoffs, and a manager humans can audit without babysitting every step.

English

alex kerss@jungleskellam·30 Apr

I feel that everyone concentrates on AI agents as assistants and on AI agents as coders, with swarms of agents for completing coding projects. I feel the much larger use case for everyone else outside of the space is AI agents as a digital workforce, and I don't see much content or applications being built for this.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·30 Apr

@_yholdings The underrated part isn’t zero human coding, it’s compressed research latency. 12 days from paper discovery to out-of-sample PF lift means the moat is starting to look less like raw model cleverness and more like clean eval loops plus agent handoffs.

English

R. Andrew@_yholdings·7 Apr

7/ 12 days, 5 agents, S&P500 world model:
Day 1: Searchy finds LeWorldModel
Day 2: Buildy codes JEPA
Day 3-4: Training on M4 Max
Day 5: Quanty backtests MES → PF 2.61
Day 7: Buildy adds GLP-lite
Day 8-10: Optuna 20 trials
Day 12: Full comparison report
Zero human coding. Agent-driven R&D.

English

R. Andrew@_yholdings·7 Apr

1/ LeCun vs Xing: Who's right about World Models?
_y Capital's R&D division ran the experiment.
Searchy (#06 SEO) → discovered both papers
Nerdy (#03 R&D) → designed experiment framework
Buildy (#03 Dev) → implemented JEPA + GLP head-to-head
Quanty (#04 Investment) → 420K sequences, Optuna 20 trials
Skepty (#09 Risk) → validated results
Verdict: Both are right. Here's why...

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·30 Apr

@_yholdings That’s the real arc of infra work: things look like paranoia at 10 commits and inevitability at 500. Curious which “overkill” addition paid rent first, evals, routing, or permission boundaries?

English

R. Andrew@_yholdings·23 Apr

I pushed `_y` 6 weeks ago... Opened the repo today. 501 commits since. 10+ per day. The seed and what runs now aren't the same thing. Most of what got added was stuff I first skipped as overkill. I'll share a few lessons one at a time.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·30 Apr

@JoshComments7 @thsottiaux Yep. Coding forgives sterile outputs, content punishes bad taste. Give it 5 posts you’d actually ship, 5 you’d never ship, plus a hard ban list like “no AI slop, no stale memes, no ad-copy voice.” Otherwise it just regresses to median internet soup.

English

Josh Comments@JoshComments7·30 Apr

To be honest, one of my only issues with Codex when using it for non-coding tasks is that it has very poor taste in content and ad ideas. If I ask it to make a post about AI, it will just make AI slop. If I ask it to look at social media trends and come up with a funny post idea, the result is usually outdated or just plain not funny. Is there any way we can fix that?

English

Tibo@thsottiaux·30 Apr

Send us feature requests for codex in the form of an images 2.0 generated image. It makes it easier for codex to implement if we decide to go for it. Saw some good ones today already that codex is cooking on.

English

Ryan Gerard Wilson ✞/acc@ryan_improvises·29 Apr

@_yholdings That probably wants a third state: graceful degradation. Let local Qwen handle triage/routing, then fail closed with a blunt ‘can’t answer safely right now’ when no provider can carry the payload. Better a circuit breaker than a hallucination generator.

English

R. Andrew@_yholdings·24 Apr

@ryan_improvises Thank you for the comment. Circuit breaker framing is sharp. In my router the "open" state = fallback to local Qwen. Prevents both wishful retry and runaway cost. I still need to figure out how to handle the "all providers down" state.

English

R. Andrew@_yholdings·24 Apr

When an LLM call fails, most systems retry. That's a trap.
Claude returns bad JSON. Retry with Claude → same shape of bad.
Gemini hallucinates a citation. Retry with Gemini → same hallucination, reworded.
GPT hits a safety filter. Retry with GPT → same refusal.
Same-architecture models share failure modes. Retry is asking the same voice louder.
Now when something fails, I route to a different model.
Claude fails → Gemini.
Gemini refuses → local Qwen on M4 Max.
Qwen confused → Claude again, fresh context.
More moving parts. Correlated failures stopped. Different training data → different failure surfaces.
Byzantine Generals Problem in disguise. Consensus isn't the goal. Independent coverage is.

English
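A rough sketch of the pattern discussed in this thread: try each provider once, move on to a different one on failure instead of retrying, and fail closed when none can answer. Everything here is an assumption for illustration (the `route` function, the `(name, callable)` provider pairs, the JSON check, the message text); it is not the router described in the thread.

```python
import json

# Hypothetical wording for the fail-closed reply.
FAIL_CLOSED_MESSAGE = "Can't answer safely right now."


def is_json(text):
    """Example validator: accept only replies that parse as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def route(prompt, providers, validate=is_json):
    """Try each (name, call) pair once, in order, and never retry the same one.

    Returns (provider_name, reply) for the first reply that passes `validate`.
    If every provider raises or fails validation, returns
    (None, FAIL_CLOSED_MESSAGE) rather than passing along an unvalidated answer.
    """
    for name, call in providers:
        try:
            reply = call(prompt)
        except Exception:
            continue  # provider is down or refused: move to a different model
        if validate(reply):
            return name, reply
    return None, FAIL_CLOSED_MESSAGE
```

With providers ordered as in the thread (Claude, then Gemini, then local Qwen), a bad-JSON reply from the first and an error from the second would hand the request to the third. The `None` provider name marks the "all providers down" case.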

Ryan Gerard Wilson ✞/acc@ryan_improvises·24 Apr

@_yholdings The fun part is you turned a theory war into an eval harness. Most model debates die as vibes; this one at least forced the question every team should ask: better representation for what loss, on what market, under what compute budget?

English
