Rustem S

199 posts

@vigosun

AWS Solution Architect and AI Engineer. Transducing knowledge from the void.

Lausanne, Switzerland · Joined March 2009
2.3K Following · 493 Followers

Benedict Kerres @benedictk__ ·
Ok hear me out - wine and codex. We can set this up. Vienna, Munich, Zurich - who'd be keen? We (that is, OpenAI) take over a wine bar and invite you (codex / coding power users) to talk codex and coding.

Rustem S @vigosun ·
Just back from @aiDotEngineer in London feeling inspired and energised. Amazing speakers, great conversations, incredible vibe! Special thanks to @swyx and the team for putting it all together. SF AI World Fair is next!

Rustem S @vigosun ·
Spec-Driven Development with LLMs introduces a new synchronization problem. We mostly solved code-to-test coverage; now we need tooling that does the same across specs, code, and tests, ensuring they don't drift apart.
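A toy sketch of what such tooling could check. Everything here is hypothetical: the annotation keys and `spec_hash` helper are illustrative, not an existing tool — the idea is just that code and tests pin the spec version they were written against, so drift becomes detectable.

```python
# Hypothetical spec/code/test drift detection: each spec section gets a
# content hash, and code/tests declare which hash they were written
# against. A CI check fails when any of the three disagree.
import hashlib

def spec_hash(spec_text: str) -> str:
    return hashlib.sha256(spec_text.encode()).hexdigest()[:8]

spec = "POST /orders returns 201 and a Location header"
h = spec_hash(spec)

# Annotations a tool could extract from source and test files:
code_annotation = {"implements": h}
test_annotation = {"covers": h}

def in_sync(spec_text, code_ann, test_ann):
    current = spec_hash(spec_text)
    return code_ann["implements"] == current == test_ann["covers"]

assert in_sync(spec, code_annotation, test_annotation)

# Edit the spec without touching code or tests: drift is detected.
spec = "POST /orders returns 201 with the order echoed in the body"
assert not in_sync(spec, code_annotation, test_annotation)
```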

Rustem S @vigosun ·
RLMs haven’t hit their “CoT moment” yet. Chain-of-Thought was:
1. prompt trick
2. latent capability shows up
3. added into model training
RLMs are at steps 1–2 today.

Rustem S @vigosun ·
Just observed a massive improvement using playwright-cli vs the Playwright MCP for web app testing. github.com/microsoft/play… Here are the details that make it so effective. It shipped in Playwright v1.58.0 with ~35 commands vs ~20 in the original MCP. But the real win is the architecture: the CLI saves snapshots (in ARIA accessibility tree format) to disk instead of stuffing them into the LLM context: ~27K vs ~114K tokens for the same task. Could MCP do this too? Technically yes, but most MCP clients can't read files from disk; the protocol assumes data flows back to the model. The CLI was built for coding agents like Claude Code that can handle this natively.
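The disk-spill pattern described above can be sketched in a few lines. This is illustrative code, not Playwright's actual implementation: the full snapshot goes to a file, and the model only ever sees a path plus a small summary.

```python
# Sketch of the disk-spill pattern: persist the large artifact to disk
# and return a context-sized stub instead of the whole payload.
import json, tempfile, os

def take_snapshot(aria_tree: dict) -> dict:
    """Write the full (hypothetical) ARIA tree to a temp file; return
    only what the agent needs in context: a path and a summary."""
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(aria_tree, f)
    return {
        "snapshot_path": path,  # the agent reads this lazily, if at all
        "summary": f"{len(aria_tree['nodes'])} nodes",  # tiny context cost
    }

tree = {"nodes": [{"role": "button", "name": f"item-{i}"} for i in range(5000)]}
stub = take_snapshot(tree)

# The model sees a few dozen tokens; the full tree stays on disk.
assert stub["summary"] == "5000 nodes"
with open(stub["snapshot_path"]) as f:
    assert len(json.load(f)["nodes"]) == 5000
```

This only works when the client can read files — which is exactly the gap the tweet points out for most MCP clients.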

Rustem S @vigosun ·
Why RL over SFT? Many tasks have infinite valid solutions—whether it's booking a flight, writing code, or navigating a UI. You can't demonstrate every path, but you can easily verify success. That's RL's edge: reward the goal, not the route. And unlike SFT models that freeze when they deviate, RL models learn to recover and adapt.
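"Reward the goal, not the route" can be made concrete with a toy verifier. Everything here is a made-up illustration (the action names and `verify` function are hypothetical), but it shows why verification scales where demonstration doesn't: one checker covers infinitely many valid trajectories.

```python
# Toy verifier-based reward for a "book a flight" task: any action
# sequence counts as success if it ends in a confirmed booking.
def verify(path_taken: list) -> bool:
    return len(path_taken) > 0 and path_taken[-1] == "confirm_booking"

def reward(path_taken):
    # RL only needs this binary signal; no reference trajectory required.
    return 1.0 if verify(path_taken) else 0.0

# Two completely different routes, both rewarded equally:
assert reward(["search", "select_flight", "confirm_booking"]) == 1.0
assert reward(["open_deals", "select_flight", "add_bag", "confirm_booking"]) == 1.0
# A failed trajectory gets no reward, however plausible its prefix:
assert reward(["search", "select_flight", "give_up"]) == 0.0
```

SFT would need demonstrations covering the space of valid routes; the verifier covers them all by construction.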

Rustem S @vigosun ·
The effective design of agentic apps relies on a complementary relationship where MCPs and Skills empower an agent to function intelligently. In this setup, MCPs serve as the secure interface or "hands" connecting the agent to external tools and data. Skills provide the "procedural knowledge" or "brain" that dictates how those tools should be used. Rather than competing, these components work in unison: an agent loads a Skill to learn the correct business process or standard operating procedure, which then guides it to execute actions precisely via the appropriate MCPs, ensuring workflows are both capable and correct.
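A minimal sketch of the "Skills as brain, MCPs as hands" split. All names here (`McpTool`, `refund_skill`) are illustrative stand-ins, not a real API: the point is only where the business rule lives versus where the execution happens.

```python
# The MCP side: a dumb, secure interface to an external system.
class McpTool:
    """Stands in for an MCP server exposing one callable tool."""
    def __init__(self, name):
        self.name = name
        self.calls = []
    def call(self, **args):
        self.calls.append(args)
        return {"tool": self.name, "ok": True, **args}

# The Skill side: procedural knowledge about how to use those tools.
def refund_skill(crm, payments, order_id):
    """The order of operations and the business rule live here,
    not in any individual tool."""
    order = crm.call(action="lookup_order", order_id=order_id)
    if order["ok"]:  # only refund orders the CRM can verify
        return payments.call(action="refund", order_id=order_id)
    return {"ok": False}

crm, payments = McpTool("crm"), McpTool("payments")
result = refund_skill(crm, payments, order_id="A-17")
assert result["ok"] and len(payments.calls) == 1  # the Skill drove the hands
```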

Rustem S @vigosun ·
Just finished AI-Powered Search course by @treygrainger & @softwaredoug. It was a transformative deep-dive into modern search. This course moves beyond basic vector search to cover the production-grade techniques necessary for robust RAG and Agentic systems. From Learning to Rank and user signals to advanced agentic workflows, the hands-on, platform-agnostic curriculum provides the tools to solve real-world retrieval problems. With expert guest lectures and a focus on self-improving systems, it offers the perfect blend of theory and code. Highly recommended for any engineer responsible for retrieval quality—it’s exactly the training I wish I had when I started. Link to the upcoming cohort: aipoweredsearch.com/live-course?pr…

Rustem S @vigosun ·
What is the main benefit of Distributed Data Parallel? It's the increased batch size for training! This leads to more stable gradients, faster data throughput, and reduced overall training time.
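Why averaging per-worker gradients is equivalent to one large batch can be shown with a dependency-free toy (assuming equal shard sizes, which DDP samplers normally enforce):

```python
# DDP in miniature: each worker computes gradients on its own shard,
# then gradients are averaged (all-reduce). With equal shard sizes this
# equals the gradient of one large global batch.
def grad(w, batch):
    # Gradient of mean squared error for the model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.5
data = [(1, 2), (2, 4), (3, 6), (4, 8)]  # points on y = 2x

# Two "workers", each with half the data (local batch size 2).
shards = [data[:2], data[2:]]
allreduce_grad = sum(grad(w, s) for s in shards) / len(shards)

# A single process with the full batch of 4.
full_batch_grad = grad(w, data)

# The averaged gradient matches the large-batch gradient exactly.
assert abs(allreduce_grad - full_batch_grad) < 1e-9
```

So the effective batch size scales with the number of workers while each GPU only ever holds its local shard.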

Rustem S @vigosun ·
One of the key learnings from using vector search in production RAG is that it produces far more false positives than lexical search.
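A toy illustration of the failure mode (hand-picked vectors standing in for a real embedding model): two documents about closely related topics embed near each other, so vector search scores both highly even though only one matches the query's literal system.

```python
# Vector search false positive vs lexical search, on assumed topic axes.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Query: "postgres connection timeout"; dims = [databases, errors, billing]
query_vec = [0.9, 0.8, 0.0]

docs = {
    "mysql replication lag troubleshooting": [0.8, 0.7, 0.1],      # wrong system
    "postgres connection timeout after upgrade": [0.9, 0.9, 0.0],  # the right doc
}

# Vector search: both docs score high, so the MySQL doc is a plausible-
# looking false positive for a Postgres-specific query.
scores = {d: cosine(query_vec, v) for d, v in docs.items()}
assert all(s > 0.9 for s in scores.values())

# Lexical search: requiring the literal term "postgres" filters it out.
lexical_hits = [d for d in docs if "postgres" in d]
assert lexical_hits == ["postgres connection timeout after upgrade"]
```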

Rustem S @vigosun ·
vLLM with pipeline parallelism is straight-up the de facto standard for local inference. No cap, it's running circles around the rest! It uses a new attention algorithm (PagedAttention) to deliver up to 24x higher throughput compared to the HuggingFace Transformers library, all without requiring model changes. Also, when the OpenAI OSS model came out, Google as a provider and a few others were serving degraded performance — only because they ran their own custom fork of vLLM and hadn't put the fix in.

Rustem S @vigosun ·
One reason I've found Claude Code hooks docs.anthropic.com/en/docs/claude… useful is that they bring a stronger guarantee of execution than CLAUDE.md instructions. For example, I had issues where CC would inconsistently follow the CLAUDE.md instruction to fetch recent project documentation. Moving this step to a hook solved it.