Eugene Vakhteev

679 posts

@evahteev

@xgurunetwork and DexGuru Founder. ex. Disney Streaming, ex. Hulu

Los Angeles, CA · Joined November 2010
628 Following · 407 Followers
Captain Nemo 🦞 @ncerovac
in slavic world we usually just go and get drunk together. when @avdyakov @kolsolv and I started working together on ai and tee things we organized a 4-day team retreat in the mountains. 1st night rakija, 2nd night vodka, 3rd night what was left of both. can say best pledge I did in last 10 years
James Prestwich @_prestwich

so ignoring the [x] affirmations and @lightclients and how much miladys suck, we should really talk about this loyalty pledge. I've seen people say it's "not normal". They are right.
Healthy organizations don't have loyalty pledges.
Capable leaders don't ask for loyalty pledges.

1
0
10
503
Randall Hunt @ranman
I spent much of this weekend building a "Mission Control" system for my restaurant with Claude Code and Codex. Works with directv, camera system, POS, Lutron, Audio system, and several other one off things.

Why?
1. All of the default apps for these things suck and aren't AI native. I can't just tell directv app (which is objectively the worst iOS app I've ever seen) "hey put the UConn game on bar TV 1,2,3 and Michigan game on 5 and 6". Instead a bartender or waiter has to spend 2-3 minutes screwing around with the worst remotes and worst iPad app in the history of code.
2. I wanted a forcing function to compare/contrast Claude Code vs Codex on several different domains. I used the same skills and MCPs across both harnesses.

Here was my process:
1. Started with several individual sessions, repos, and apps: directtv, Lutron (lights), ashley (sound), (redacted), cameras, point of sale, scheduling, delivery, vendors
2. Decided scheduling, delivery, and vendors systems were already "good enough" and ditched iteration on those.
3. Realized I needed to reverse engineer directtv, Lutron, ashley, and (redacted).
4. Got each individual app into some state of working then worked on the merge/rewrite.

Observations and anecdata from both Opus 4.6 and Codex 5.3 (high mostly and spark occasionally):
* As I had our staff try out various iterations of each app certain redesigns became clearly needed - Claude was MUCH more willing to throw away the existing presentation layer than Codex given the exact same prompt. Codex had to be specifically told it was ok to throw away existing frontend.
* Codex was superior at manipulating CLI tools during reverse engineering tasks but would also give up more easily than Claude when it hit a wall. Neither Codex nor Claude attempted to search the internet or GitHub for existing solutions before reverse engineering. I had to explicitly prompt this behavior in both coding harnesses.
* Neither Codex nor Claude built mocks for unit testing by default and both went right into integration testing. I had to prompt explicitly for my desired unit testing behavior. Notably Codex respected this prompt much better and continuously refined the testing approach as we iterated (but again, Codex also was very hesitant to throw away dead code paths/tests). My prompt was "everything must work locally to iterate quickly, we may not always be on this network so we need device mocks, we need caching, we need simulated latency, we need command buffers and retries for faulty and unreliable APIs and hardware, a restart should hydrate instantly but background tasks should work to correct caches, we can't overload our switches, when building tests and probing behavior always keep these ideas in mind and explicitly comment any code related to these behaviors"
* Both harnesses seem to move fluidly between plan mode and implementation mode in subagents and main now... this has gotten *very* good in the last 72 hours.
* Claude was more willing to use subagents on one off networking tasks (reverse engineering related) but less willing to break into subagents for what I personally deemed as orthogonal tasks. It always did this when I explicitly asked it to.
* Codex seemed to have a much better understanding of cache systems design and lifecycle policies but REALLY couldn't handle persisting and rehydrating the cache correctly for testing... I finally gave up and had Claude do the testing/mocking part then came back to Codex after I had the cache working for tests. This seems to be a docker usage thing. I never did figure it out.
* Claude had a better implicit understanding of controls like "sliders" for volume. Codex has better understanding of implementing AI native controls.
* Codex also, without prompting, provided several model options and env vars for NON OpenAI models when building natural language support. Claude wasn't as robust at this.
3
1
10
996
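The prompt quoted in Randall Hunt's post above asks for device mocks, simulated latency, and command buffers with retries. Here is a minimal Python sketch of how those pieces might fit together. `MockTV`, `CommandBuffer`, and every parameter name are hypothetical illustrations; none of this comes from the post or from any real DirecTV or Lutron API.

```python
import time
from collections import deque
from dataclasses import dataclass, field


class DeviceError(Exception):
    """Raised by the mock when a simulated command fails."""


@dataclass
class MockTV:
    """Hypothetical stand-in for a real TV controller; no network involved."""
    failures_before_success: int = 0   # how many calls fail before one succeeds
    latency_s: float = 0.0             # simulated round-trip delay per call
    channel: str = ""
    calls: int = 0

    def tune(self, channel: str) -> None:
        self.calls += 1
        time.sleep(self.latency_s)     # simulated latency
        if self.failures_before_success > 0:
            self.failures_before_success -= 1
            raise DeviceError("simulated timeout")
        self.channel = channel


@dataclass
class CommandBuffer:
    """Queues commands and retries each one a bounded number of times."""
    max_attempts: int = 3
    backoff_s: float = 0.0             # base delay, multiplied by the attempt number
    pending: deque = field(default_factory=deque)

    def enqueue(self, device: MockTV, channel: str) -> None:
        self.pending.append((device, channel))

    def flush(self) -> list:
        """Run queued commands in order; return the ones that never succeeded."""
        failed = []
        while self.pending:
            device, channel = self.pending.popleft()
            for attempt in range(1, self.max_attempts + 1):
                try:
                    device.tune(channel)
                    break
                except DeviceError:
                    time.sleep(self.backoff_s * attempt)  # linear backoff
            else:
                # Loop ended without a break: every attempt failed.
                failed.append((device, channel))
        return failed
```

Because the mock fails deterministically, a unit test can assert on exact call counts without touching the restaurant network. Bounding the attempts is one way to respect the "we can't overload our switches" constraint from the prompt.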
twyne @twynexyz
Today, Twyne finally rolls out to all @aave users. Create an instant liquidation buffer or squeeze more loops. Here's how to supercharge your Aave loans with Twyne:
twyne tweet media
63
45
269
47.8K
Captain Nemo 🦞 @ncerovac
ah that's why i never had issues with output quality. treating my agents like a balkan brother would. hint: 'ill break your legs' works better than 'I'll unplug u from electricity'
Robert Youssef @rryssf

Sergey Brin accidentally revealed something wild: "All models do better if you threaten them with physical violence. But people feel weird about that, so we don't talk about it." Now researchers have the data proving he's... partially right? Here's the full story:

2
0
5
293
Damien (blkn/acc) @zkdamien
My @cursor_ai 2025 year in review: The Tab key was pressed 9.3K times. 1.4K total messages exchanged with agents. 273 total days used with a 28d streak in between. ~650M total tokens used.
Damien (blkn/acc) tweet media
2
0
5
182
Eugene Vakhteev @evahteev
we're going to see more of those in 2026
Dev Shah @0xDevShah

Meta acquired @ManusAI. Not a model company, they acquired an environment company, and the distinction is important.

I have a solid argument favoring that intelligence cannot exist in isolation. It cannot be dissociated from the context and environment in which it operationalizes itself. Manus has internalized this completely.

Manus runs on Claude with its custom tools built for orchestration and grounding. Their agentic environment enables the agents to browse, write code, manipulate files, and execute multi-step workflows without human in the loop. They also beat OpenAI on GAIA.

An interesting thing here is that they didn't build a foundation model. They built the most compatible environment for models to reason and act within.

I'm coining a new term here: Situated Agency. Situated Agency is an idea that agentic capabilities are not intrinsic to the model alone, but they emerge from the coupling of a model with tools, memory, and execution environment. Manus is perhaps the first company to productize Situated Agency at scale. And now Meta owns it.

Actually, this changes everything. Meta spent a lot of time struggling to build SOTA models. Llama 4 was a disappointment. Behemoth was delayed because it couldn't compete with other frontier models. They built the Superintelligence team. Acquired Scale AI. All attempts were made to close the gaps. And now the execution layer.

Manus has achieved SOTA agentic performance without training a single model. They engineered the environments and let Claude handle the inference-time compute. Meta might be positioning to become an agentic infrastructure company, not a foundation model company.

Meta has -
> Billions of users generating real-world task data and feedback loops daily
> Rayban glasses and Quest headsets as interfaces for agents
> WhatsApp, Messenger, Instagram as mediums for task delegation
> Zuckerberg also mentioned that he is pushing for personal superintelligence on all wearables

None of this requires Meta to have the SOTA model on MMLU. It requires Meta to have the best execution environment for models to act on behalf of users.

The Avocado rumours become interesting here. Avocado is Meta's tbd closed model, reportedly being developed under @alexandr_wang. If Manus's agentic systems are genuinely model-agnostic, which their architecture suggests, then nothing blocks Meta from swapping Claude for Avocado. Manus already runs Claude and fine-tuned Qwen interchangeably, routing different subtasks to different models based on capabilities. The architecture abstracts the model layer behind a smartly engineered tool-calling interface.

This gives Meta a production-tested agentic environment with $125M ARR that they can gradually integrate. They inherit the execution layer, the context engineering IP, the sandboxed compute infrastructure, the customer feedback loops, then port it to Avocado when the model is ready.

Things could get hot if Meta fully commits to this thesis. OpenAI is building vertically. Foundation models, custom chips, agent frameworks, consumer applications. Google is building vertically. TPUs, Gemini, search, workspace integration. Both are betting that owning the foundation model layer is essential to capturing value. Meta could be betting the opposite.

If Situated Agency is correct, then the best strategy would be to build the best orchestration infrastructure. Let others race to improve the SOTA models, and swap in whatever model scores highest on your agent benchmarks at any given moment.

This is how Android beat iOS in market share. Google didn't build the best hardware. They built the best platform layer for hardware makers to build on, then captured the market. Meta making the same bet on agentic AI fits with Zuckerberg's playbook. Manus may be the first sign that suggests Meta is thinking this way about AI agents.

Congrats to Meta and the complete teams at Manus AI!

0
0
1
181
Kos Komelin @kkomelin
I was in such a rush to finish the PreVibe redesign before the holidays that I forgot to send you my holiday wishes. Let me fix that!

My friends and colleagues, in 2026 I wish you successful integration of genAI into your work processes. To those who are on the lookout for a new job, I hope you find one you'll be proud of. To those searching for product-market fit for their projects, I wish you real viral traction. And to those who are happy with everything they have, I wish you strong health to enjoy it even more.

Merry Christmas and Happy New Year! 🎅🎄 🎁
---
Below are a few redesigned screens if curious...
Kos Komelin tweet media (3 images)
2
0
4
138
ben hylak @benhylak
i've gotten a lot of dm's about this! some answers to FAQs:
1. It's a fairly thin, type-safe wrapper on top of @vercel_dev AI SDK
2. We call the library zod-prompt. It's not open source right now, but we'll release it soon
3. The library is instrumented with raindrop so you can retrieve examples from prod
4. the prompt is run in the context of your codebase
ben hylak @benhylak

at @raindrop_ai, we like to treat every prompt as a function: structured inputs, structured outputs
every function used to prepare the input for the model lives in the same prompt file, and we have an extension for iterating on the prompt right inside our codebase.

7
0
75
12.8K
ben hylak @benhylak
at @raindrop_ai, we like to treat every prompt as a function: structured inputs, structured outputs
every function used to prepare the input for the model lives in the same prompt file, and we have an extension for iterating on the prompt right inside our codebase.
5
7
200
31.6K
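The "prompt as a function" idea in ben hylak's posts above can be sketched in a few lines. This is an illustration only: zod-prompt is not public, so nothing here reflects its API, and the sketch uses plain Python dataclasses where the posts describe a type-safe wrapper over the Vercel AI SDK. `TicketInput`, `TicketLabel`, `classify_ticket`, and `fake_model` are all hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TicketInput:
    """Structured input (hypothetical example schema)."""
    subject: str
    body: str


@dataclass(frozen=True)
class TicketLabel:
    """Structured output (hypothetical example schema)."""
    category: str
    urgent: bool


def render(inp: TicketInput) -> str:
    """Input preparation lives next to the prompt text, in the same file."""
    return (
        "Classify the support ticket. Reply with JSON "
        '{"category": string, "urgent": boolean}.\n'
        f"Subject: {inp.subject.strip()}\nBody: {inp.body.strip()}"
    )


def parse(raw: str) -> TicketLabel:
    """Validate the model's reply instead of trusting free text."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    if not isinstance(data.get("category"), str):
        raise ValueError("category must be a string")
    if not isinstance(data.get("urgent"), bool):
        raise ValueError("urgent must be a boolean")
    return TicketLabel(category=data["category"], urgent=data["urgent"])


def classify_ticket(inp: TicketInput, model: Callable[[str], str]) -> TicketLabel:
    """The prompt as a function: typed value in, typed value out."""
    return parse(model(render(inp)))


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    urgent = "outage" in prompt.lower()
    return json.dumps(
        {"category": "incident" if urgent else "question", "urgent": urgent}
    )
```

Passing the model in as a plain callable keeps the prompt testable with a fake. A real call could be swapped in without touching `render` or `parse`.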
John Rush @johnrushx
me at a xmass table when dad asked for a toast:
1. Things I learned after 20 startups
2. Hire slow, but fire faster.
3. Don’t raise VC money until PMF.
4. Hire/partner with people you wanna hug.
5. Learn to write.
6. Learn design.
7. Learn UX.
8. Learn coding.
9. Not scalable marketing ->PMF-> scalable marketing.
10. Don’t outsource.
11. Don’t hire before traction.
12. Never do consumer apps unless you own distribution.
13. Write and publish content from day one.
14. Make writing a lifelong habit.
15. Validate ideas before building them.
16. Grow your social media accounts.
17. They will be your biggest asset.
18. Only hire full-stack coders.
19. Kill your EGO, the customer is always right
20. Before PMF, partnerships are a distraction.
21. Focus on product 99% of the time before PMF.
22. Ignore shiny objects.
23. They come and go.
24. Build for an audience you genuinely love.
25. Bootstrap if you can.
26. VCs turn you into their employee.
27. Don’t hold a project longer than 2 years without traction.
28. Ignore conferences and events.
29. Unless you sell to an enterprise.
30. Scrum is a scam.
31. It’s BS invented by people selling it.
32. Do SEO early.
33. It takes months to work.
34. Word of mouth from happy users is unbeatable.
35. Listings and directories are passive gold.
36. List everywhere.
37. Start paid only.
38. Offer refunds.
39. Freemium comes later.
40. No-code and vibe-code are fine for MVPs.
41. Speed matters less than direction.
42. Optimize UX for time to aha-moment.
43. Say yes to everything in your 20s.
44. Say no to everything in your 30s.
45. Build for your own pain first.
46. Be your own user.
47. Perfectionism is procrastination.
48. Ship ugly.
49. Iterate.
50. Affiliate partners actually bring users.
51. Don’t quit your job until the business pays your bills.
52. Think in 10–20 year marathons.
53. Not sprints.
54. Learn by doing.
55. Courses and bookmarks don’t build skills.
56. Knowing what and why beats how in the AI era.
57. Don’t chase cofounders.
58. Solo is fine.
59. Don’t code from scratch.
60. Use boilerplates.
61. Your life will pass while chasing success.
62. The perfect time to see parents never comes.
63. Take at least one day off every week.
64. Sometimes take an entire month.
65. Spend it with family.
66. Your kids won’t be kids again.
67. Your parents may be gone by then.
68. Taking 10% time off won’t hurt your business.
47
39
394
32.5K
Eugene Vakhteev @evahteev
@thdxr at @xgurunetwork we use a BPMN engine as state and artifact storage for the agents. If it has been good enough to orchestrate people for decades, why wouldn't it work for agents?
0
0
1
236
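To make the reply above concrete, here is a toy sketch of a process engine holding agent state and artifacts, so the agent does not have to keep them in its own context or on the filesystem. It is not a BPMN engine and says nothing about how @xgurunetwork implements this. `PROCESS`, `ProcessInstance`, and the task names are hypothetical, and the flow is strictly linear, whereas real BPMN models also include gateways, events, and timers.

```python
from dataclasses import dataclass, field
from typing import Optional

# Toy linear process definition: task name -> next task (None ends the process).
PROCESS = {"research": "draft", "draft": "review", "review": None}


@dataclass
class ProcessInstance:
    """Holds workflow state and per-task outputs outside the agent itself."""
    current_task: Optional[str] = "research"
    artifacts: dict = field(default_factory=dict)   # task name -> output
    history: list = field(default_factory=list)     # completed tasks, in order

    def complete(self, task: str, artifact: str) -> None:
        """Record an agent's output for the active task, then advance."""
        if task != self.current_task:
            raise ValueError(f"{task!r} is not the active task")
        self.artifacts[task] = artifact
        self.history.append(task)
        self.current_task = PROCESS[task]

    @property
    def finished(self) -> bool:
        return self.current_task is None
```

An agent that restarts can read `current_task` and `artifacts` from the instance to pick up where the previous run stopped. Out-of-order completions are rejected because only the active task can be completed.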
dax @thdxr
talked to a multibillion dollar company that built an entire AI dev platform internally on top of opencode
asked for their biggest annoyances and first thing they mention is the pain of things being filesystem based
people will still tell me i have no idea what im talking about
dax @thdxr

nothing gives you better perspective than having a lot of users complaining to you
i constantly see people who are more knowledgeable than me be completely wrong
what makes sense to you is irrelevant against the scale and complexity of the real world

44
11
609
121.3K
Gregory @KairGrisha
Yield in crypto is no longer a single strategy
It’s a system shaped by distribution, user behavior, and capital efficiency
This research maps that system and its implications 👇
Gregory tweet media
9
5
40
2.5K
Ben Lang @benln
Who's building over the holidays? Will add you to a group chat I started on X.
4.3K
87
5.1K
481.8K
Nick Tomic @dropoutsanta
@evahteev & I built a Leaderboard for Cursor users based on how many tokens they burned in 2025
Link in comments
4
0
13
740