PixelWizard

2K posts

@pixel2Wizard

Coding spells that bring pixels to life.

Joined December 2011

452 Following · 47 Followers

PixelWizard@pixel2Wizard·23 Apr

@MilksandMatcha Spot on. Multi-agent workflows > prompt engineering. Building an AI knowledge tool and struggling to balance instant ingestion with deep reasoning. Spark for fast triggers + Codex for orchestration is the perfect stack for a seamless local-first experience.

Sarah Chieng@MilksandMatcha·23 Apr

Giving away 5 more Codex Pro plans for folks to try out multi-agent workflows with Codex and Codex Spark. Each person will get 3 months of free Codex Pro (highest tier). Winners will be selected from comments in 48 hours; comment below why you want it.

Sarah Chieng@MilksandMatcha

x.com/i/article/2044…

PixelWizard@pixel2Wizard·22 Apr

@sudoingX good job.

Sudo su@sudoingX·21 Apr

this is what I've been building for hermes agent. millions of you are paying for X Premium+ and sitting on grok-4 access you never actually use. i spent 20+ days turning that dead subscription into a live coding agent in hermes agent. native grok provider logs into grok.com with your X OAuth and hermes agent sends every prompt through the browser. first few days were cookie attempts. raw curl, curl_cffi with tls impersonation, grokproxy, every open source variant. all died on cloudflare 403s. endpoint-specific attestation tokens only the app's own js can compute. no cookie alone gets through. moved to camofox. page loaded, cloudflare passed, but x detected the browser fingerprint and locked my account on login. injected session cookies from firefox, session authenticated, web app silently refused to fire api calls. zero network traffic. vercel botid tripped on page load. patchright cracked it. patched playwright that fixes cdp runtime.enable detection via isolated worlds, chromium based, x oauth login works. grok finally responded through hermes. full pipe live. what works today: chat persistence across tool loops, mode switching (auto fast expert 4.3 heavy mirrors grok.com dropdown), multi-step reasoning, tool calling on real hardware, all 28+ hermes tools. what's not reliable yet: grok drifts to knowledge answers instead of emitting tool calls for state queries like ~/.bashrc contents or is-firefox-running. stops autonomously exploring after the first tool result instead of chaining until the picture is complete. branch is live: github.com/sudoingX/herme… anyone on X Premium+ with patchright installed can fork, install the branch, point hermes at grok.com and if you have a prompt engineering idea for the reliability gap, PRs on my fork welcome. more eyes, better outcome. replied to @Teknium earlier with the same state. his brain on the layer would be gold.
bigger unlock would be if @xai offered a maintainer level auth path instead of us reverse-engineering the web UI, this goes from beta to production overnight.

Sudo su@sudoingX

spent 20+ days on something i wanted all of you to have on hermes agent. it works. but it doesn't fly yet. and i want it to fly. while building, my reach tanked. my timeline went dark. the thing keeping this going is content and i chose to build instead of post. that's the trade i made. now i think some of you goats could help me get this across the finish line and make it accessible to everyone. i need to focus on timeline so nikita keeps paying bucks and the chain keeps going. more soon.

PixelWizard@pixel2Wizard·18 Apr

@_heyrico Pretty ordinary.

rico@_heyrico·18 Apr

Spent 3 hours with claude design to generate this awesome design. Prompt below 👇

Claude@claudeai

Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude. Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.

PixelWizard@pixel2Wizard·13 Apr

@emanueledpt Why the fuck does it have to keep pairing in such a complicated way? Can’t we just set up a key/token like GitHub does?

Emanuele Di Pietro@emanueledpt·13 Apr

Pairing with Code is coming to Remodex! This has been a super requested feature. So now if you use the terminal on your phone or can't scan the QR, you can use the code that will appear in your terminal. Coming soon!

Emanuele Di Pietro@emanueledpt

/feedback command will come to Remodex! 💬 I decided to add this slash command for simple reasons: → Easier way to reach out to me → Easier way to share bugs/feedback → Better experience and reliability of the app itself. I hope this will be a nice touch. And I hope you won't need to use this button as much lol

PixelWizard@pixel2Wizard·4 Apr

@imolingcn @shynloc Hahaha, anyone who has used it knows: it falls well short of even kimi.

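AI产品康Sir@imolingcn·2 Apr

@shynloc My CC subscription expired recently, so I used the MiniMax code plan as a stopgap. It behaves like a complete idiot; I don't know how it got hyped so highly.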

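ShyNloc@shynloc·2 Apr

If you have office-work needs, take a look at this set of MiniMax Skills. They work with Claude Code and Codex, and also with OpenClaw (the "little lobster"). The quality is high, far better than what users hand-roll themselves. github.com/MiniMax-AI/ski…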

PixelWizard@pixel2Wizard·27 Mar

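@aiedge_ This guide lays out all the practical details: how to build, how to optimize, and the new Skills 2.0 features (Evals + A/B testing + auto-triggering). It includes real workflow templates; follow them and you can take off. #ClaudeSkills #AI生产力 #Claude2026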

AI Edge@aiedge_

x.com/i/article/2034…

PixelWizard reposted

Avi Chawla@_avichawla·25 Mar

x.com/i/article/2036…

PixelWizard reposted

Griffin Hilly@GriffinHilly·24 Mar

x.com/i/article/2036…

PixelWizard reposted

Manthan Gupta@manthanguptaa·20 Mar

x.com/i/article/2034…

PixelWizard@pixel2Wizard·18 Mar

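@hylarucoder xhigh isn't really necessary; the reasoning depth of high is entirely sufficient. The author of openclaw also recommends just using high, and only switching to xhigh for problems that high can't solve.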

海拉鲁编程客@hylarucoder·17 Mar

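In codex, gpt 5.4 xhigh is hard to control and often thinks in an overly broad, all-encompassing direction. I suggest having it write a plan md first, then switching to high and challenging it: "Check whether this is overcomplicated, or whether every step is truly indispensable."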

PixelWizard@pixel2Wizard·17 Mar

@emanueledpt Good job. btw, has the version supporting subagent been released?

Emanuele Di Pietro@emanueledpt·17 Mar

1 week after the launch of Codex Remote Control for iOS. Stats: → 76 forks → 1176 stars → 1719 installs. All this in a week. Numbers I could only ever have dreamed of achieving a few months ago. Everything is possible if you put in the work. Just do things. Just build things.

Emanuele Di Pietro@emanueledpt

4 days after the launch of Codex Remote Control for iOS. Stats: → 52 forks → 740 stars → 1174 installs. I cannot comprehend that I managed to get these numbers in only 4 days. I'm beyond speechless. Can't thank you enough for the support. Wouldn't have done it without you ❤️

PixelWizard@pixel2Wizard·12 Mar

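The model is the engine; the harness is the gearbox + chassis + steering wheel. However powerful the engine, it won't get far without a good chassis; with a good chassis, swap in a new engine and it takes off immediately.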

Viv@Vtrivedy10

x.com/i/article/2031…

PixelWizard@pixel2Wizard·12 Mar

@derrickcchoi @grok Summarize this.

Derrick Choi@derrickcchoi·9 Mar

x.com/i/article/2030…

PixelWizard@pixel2Wizard·12 Mar

@gabrielvaldivia Summarize this and give executable actions. @grok

Gabriel Valdivia@gabrielvaldivia·11 Mar

x.com/i/article/2031…

PixelWizard@pixel2Wizard·12 Mar

@djfarrelly @grok Summarize this.

Dan Farrelly | Inngest.com@djfarrelly·12 Mar

x.com/i/article/2031…

PixelWizard@pixel2Wizard·11 Mar

@snwy_me This terminal theme looks quite nice. Could you please share it with me?

snwy@snwy_me·10 Mar

autoresearch really interested me, despite me not being "all-in" on agents yet. i wanted to get started with running auto experiments. i looked to existing tools to serve as a harness but each one had its problems. so i made one. introducing Helios for autonomous ML research

Andrej Karpathy@karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement), this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc etc. This is the bread and butter of what I do daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things e.g.: - It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work. - It found that the Value Embeddings really like regularization and I wasn't applying any (oops). - It found that my banded attention was too conservative (i forgot to tune it). - It found that AdamW betas were all messed up. - It tuned the weight decay schedule. - It tuned the network initialization. 
This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc… All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

PixelWizard@pixel2Wizard·11 Mar

@AdarshSingh_in How do you do that?

Adarsh@AdarshSingh_in·10 Mar

I love design.

PixelWizard@pixel2Wizard·11 Mar

@_heyrico @grok Verify and summarize this, and give actions that can be successfully reproduced.

rico@_heyrico·9 Mar

x.com/i/article/2030…

PixelWizard@pixel2Wizard·11 Mar

@tonykipkemboi @grok Summarize this.

Tony Kipkemboi@tonykipkemboi·2 Mar

x.com/i/article/2028…

PixelWizard@pixel2Wizard·11 Mar

@agent_wrapper By default, AO starts an agent orchestrator session to orchestrate agents. After connecting it to openclaw, will the two "brains" conflict? Would it be better to control everything through openclaw?

prateek@agent_wrapper·11 Mar

the 𝗱𝗲𝗰𝗲𝗻𝘁𝗿𝗮𝗹𝗶𝘇𝗲𝗱 self-improving AI system that builds itself 𝗱𝗲𝗺𝗼𝗰𝗿𝗮𝘁𝗶𝗰𝗮𝗹𝗹𝘆. i ran 𝟭𝟯 𝗽𝗮𝗿𝗮𝗹𝗹𝗲𝗹 𝗮𝗴𝗲𝗻𝘁 𝘀𝗲𝘀𝘀𝗶𝗼𝗻𝘀. they built their own feedback routing, convergence detection, and fork governance. from my phone. on Telegram. 𝟰,𝟬𝟬𝟬+ stars. 𝟰𝟳𝟳 forks. 𝟮𝟬 plugins. 19 days since launch. full article ↓
