Rustem S

199 posts

@vigosun

AWS Solution Architect and AI Engineer. Transducing knowledge from the void.

Lausanne, Switzerland · Joined March 2009
2.3K Following · 493 Followers

Benedict Kerres @benedictk__ ·
Ok hear me out - wine and codex. We can set this up. Vienna, Munich, Zurich - who'd be keen? We (that is, OpenAI) take over a wine bar and invite you (codex / coding power users) to talk codex and coding.

Rustem S @vigosun ·
Just back from @aiDotEngineer in London feeling inspired and energised. Amazing speakers, great conversations, incredible vibe! Special thanks to @swyx and the team for putting it all together. SF AI World Fair is next!

Rustem S @vigosun ·
Spec-Driven Development with LLMs introduces a new synchronization problem. We mostly solved code-to-test coverage; now we need tooling that does the same across specs, code, and tests, ensuring they don't drift apart.
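A toy sketch of what such tooling could check. Everything here is hypothetical: the annotation keys and `spec_hash` helper are illustrative, not an existing tool — the idea is just that code and tests pin the spec version they were written against, so drift becomes detectable.

```python
# Hypothetical spec/code/test drift detection: each spec section gets a
# content hash, and code/tests declare which hash they were written
# against. A CI check fails when any of the three disagree.
import hashlib

def spec_hash(spec_text: str) -> str:
    return hashlib.sha256(spec_text.encode()).hexdigest()[:8]

spec = "POST /orders returns 201 and a Location header"
h = spec_hash(spec)

# Annotations a tool could extract from source and test files:
code_annotation = {"implements": h}
test_annotation = {"covers": h}

def in_sync(spec_text, code_ann, test_ann):
    current = spec_hash(spec_text)
    return code_ann["implements"] == current == test_ann["covers"]

assert in_sync(spec, code_annotation, test_annotation)

# Edit the spec without touching code or tests: drift is detected.
spec = "POST /orders returns 201 with the order echoed in the body"
assert not in_sync(spec, code_annotation, test_annotation)
```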

Rustem S @vigosun ·
RLMs haven’t hit their “CoT moment” yet. Chain-of-Thought was:
1. prompt trick
2. latent capability shows up
3. added into model training
RLMs are at steps 1–2 today.

Rustem S @vigosun ·
Just observed a massive improvement using playwright-cli vs the Playwright MCP for web app testing. github.com/microsoft/play… Here are the details that make it so effective. It shipped in Playwright v1.58.0 with ~35 commands vs ~20 in the original MCP. But the real win is the architecture: the CLI saves snapshots (in ARIA accessibility tree format) to disk instead of stuffing them into the LLM context: ~27K vs ~114K tokens for the same task. Could MCP do this too? Technically yes, but most MCP clients can't read files from disk; the protocol assumes data flows back to the model. The CLI was built for coding agents like Claude Code that can handle this natively.
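The disk-spill pattern described above can be sketched in a few lines. This is illustrative code, not Playwright's actual implementation: the full snapshot goes to a file, and the model only ever sees a path plus a small summary.

```python
# Sketch of the disk-spill pattern: persist the large artifact to disk
# and return a context-sized stub instead of the whole payload.
import json, tempfile, os

def take_snapshot(aria_tree: dict) -> dict:
    """Write the full (hypothetical) ARIA tree to a temp file; return
    only what the agent needs in context: a path and a summary."""
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(aria_tree, f)
    return {
        "snapshot_path": path,  # the agent reads this lazily, if at all
        "summary": f"{len(aria_tree['nodes'])} nodes",  # tiny context cost
    }

tree = {"nodes": [{"role": "button", "name": f"item-{i}"} for i in range(5000)]}
stub = take_snapshot(tree)

# The model sees a few dozen tokens; the full tree stays on disk.
assert stub["summary"] == "5000 nodes"
with open(stub["snapshot_path"]) as f:
    assert len(json.load(f)["nodes"]) == 5000
```

This only works when the client can read files — which is exactly the gap the tweet points out for most MCP clients.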

Rustem S @vigosun ·
Why RL over SFT? Many tasks have infinite valid solutions—whether it's booking a flight, writing code, or navigating a UI. You can't demonstrate every path, but you can easily verify success. That's RL's edge: reward the goal, not the route. And unlike SFT models that freeze when they deviate, RL models learn to recover and adapt.
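"Reward the goal, not the route" can be made concrete with a toy verifier. Everything here is a made-up illustration (the action names and `verify` function are hypothetical), but it shows why verification scales where demonstration doesn't: one checker covers infinitely many valid trajectories.

```python
# Toy verifier-based reward for a "book a flight" task: any action
# sequence counts as success if it ends in a confirmed booking.
def verify(path_taken: list) -> bool:
    return len(path_taken) > 0 and path_taken[-1] == "confirm_booking"

def reward(path_taken):
    # RL only needs this binary signal; no reference trajectory required.
    return 1.0 if verify(path_taken) else 0.0

# Two completely different routes, both rewarded equally:
assert reward(["search", "select_flight", "confirm_booking"]) == 1.0
assert reward(["open_deals", "select_flight", "add_bag", "confirm_booking"]) == 1.0
# A failed trajectory gets no reward, however plausible its prefix:
assert reward(["search", "select_flight", "give_up"]) == 0.0
```

SFT would need demonstrations covering the space of valid routes; the verifier covers them all by construction.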

Rustem S @vigosun ·
The effective design of agentic apps relies on a complementary relationship where MCPs and Skills empower an agent to function intelligently. In this setup, MCPs serve as the secure interface or "hands" connecting the agent to external tools and data. Skills provide the "procedural knowledge" or "brain" that dictates how those tools should be used. Rather than competing, these components work in unison: an agent loads a Skill to learn the correct business process or standard operating procedure, which then guides it to execute actions precisely via the appropriate MCPs, ensuring workflows are both capable and correct.
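A minimal sketch of the "Skills as brain, MCPs as hands" split. All names here (`McpTool`, `refund_skill`) are illustrative stand-ins, not a real API: the point is only where the business rule lives versus where the execution happens.

```python
# The MCP side: a dumb, secure interface to an external system.
class McpTool:
    """Stands in for an MCP server exposing one callable tool."""
    def __init__(self, name):
        self.name = name
        self.calls = []
    def call(self, **args):
        self.calls.append(args)
        return {"tool": self.name, "ok": True, **args}

# The Skill side: procedural knowledge about how to use those tools.
def refund_skill(crm, payments, order_id):
    """The order of operations and the business rule live here,
    not in any individual tool."""
    order = crm.call(action="lookup_order", order_id=order_id)
    if order["ok"]:  # only refund orders the CRM can verify
        return payments.call(action="refund", order_id=order_id)
    return {"ok": False}

crm, payments = McpTool("crm"), McpTool("payments")
result = refund_skill(crm, payments, order_id="A-17")
assert result["ok"] and len(payments.calls) == 1  # the Skill drove the hands
```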

Rustem S @vigosun ·
Just finished AI-Powered Search course by @treygrainger & @softwaredoug. It was a transformative deep-dive into modern search. This course moves beyond basic vector search to cover the production-grade techniques necessary for robust RAG and Agentic systems. From Learning to Rank and user signals to advanced agentic workflows, the hands-on, platform-agnostic curriculum provides the tools to solve real-world retrieval problems. With expert guest lectures and a focus on self-improving systems, it offers the perfect blend of theory and code. Highly recommended for any engineer responsible for retrieval quality—it’s exactly the training I wish I had when I started. Link to the upcoming cohort: aipoweredsearch.com/live-course?pr…

Rustem S @vigosun ·
What is the main benefit of Distributed Data Parallel? It's the increased batch size for training! This leads to more stable gradients, faster data throughput, and reduced overall training time.
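Why averaging per-worker gradients is equivalent to one large batch can be shown with a dependency-free toy (assuming equal shard sizes, which DDP samplers normally enforce):

```python
# DDP in miniature: each worker computes gradients on its own shard,
# then gradients are averaged (all-reduce). With equal shard sizes this
# equals the gradient of one large global batch.
def grad(w, batch):
    # Gradient of mean squared error for the model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.5
data = [(1, 2), (2, 4), (3, 6), (4, 8)]  # points on y = 2x

# Two "workers", each with half the data (local batch size 2).
shards = [data[:2], data[2:]]
allreduce_grad = sum(grad(w, s) for s in shards) / len(shards)

# A single process with the full batch of 4.
full_batch_grad = grad(w, data)

# The averaged gradient matches the large-batch gradient exactly.
assert abs(allreduce_grad - full_batch_grad) < 1e-9
```

So the effective batch size scales with the number of workers while each GPU only ever holds its local shard.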

Rustem S @vigosun ·
One of the key learnings from using vector search in production RAG is that it produces far more false positives than lexical search.
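A toy illustration of the failure mode (hand-picked vectors standing in for a real embedding model): two documents about closely related topics embed near each other, so vector search scores both highly even though only one matches the query's literal system.

```python
# Vector search false positive vs lexical search, on assumed topic axes.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Query: "postgres connection timeout"; dims = [databases, errors, billing]
query_vec = [0.9, 0.8, 0.0]

docs = {
    "mysql replication lag troubleshooting": [0.8, 0.7, 0.1],      # wrong system
    "postgres connection timeout after upgrade": [0.9, 0.9, 0.0],  # the right doc
}

# Vector search: both docs score high, so the MySQL doc is a plausible-
# looking false positive for a Postgres-specific query.
scores = {d: cosine(query_vec, v) for d, v in docs.items()}
assert all(s > 0.9 for s in scores.values())

# Lexical search: requiring the literal term "postgres" filters it out.
lexical_hits = [d for d in docs if "postgres" in d]
assert lexical_hits == ["postgres connection timeout after upgrade"]
```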

Rustem S @vigosun ·
vLLM with pipeline parallelism is straight-up the de facto standard for local inference. No cap, it's running circles around the rest! It uses a new attention algorithm (PagedAttention) to deliver up to 24x higher throughput compared to the HuggingFace Transformers library, all without requiring model changes. Also, when the OpenAI OSS model came out, Google as a provider and a few others were serving degraded performance — only because they ran their own custom fork of vLLM and hadn't put the fix in.

Rustem S @vigosun ·
One reason I've found Claude Code hooks docs.anthropic.com/en/docs/claude… useful is that they bring a stronger guarantee of execution than CLAUDE.md instructions. For example, I had issues where CC would inconsistently follow the CLAUDE.md instruction to fetch recent project documentation. Moving this step to a hook solved it.