Takayuki Fukuda
32.1K posts

@hedachi
Game Developer / Software Engineer / Representative Director (CEO) of 株式会社ミリオンダウト / Ex-Electronic Arts / Building all sorts of things with generative AI. Please start with the Highlights tab.


This works really well btw: at the end of your query, ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as slideshows, etc.

More generally, imo audio is the human-preferred input to AIs but vision (images/animations/video) is the preferred output from them. About a third of our brains is a massively parallel processor dedicated to vision; it is the 10-lane superhighway of information into the brain. As AI improves, I think we'll see a progression that takes advantage:

1) raw text (hard/effortful to read)
2) markdown (bold, italic, headings, tables, a bit easier on the eyes) <-- current default
3) HTML (still procedural with underlying code, but a lot more flexibility on the graphics, layout, even interactivity) <-- early but forming new good default
...4,5,6,...
n) interactive neural videos/simulations

Imo the extrapolation (though the technology doesn't exist just yet) ends in some kind of interactive videos generated directly by a diffusion neural net. There are many open questions as to how exact/procedural "Software 1.0" artifacts (e.g. interactive simulations) may be woven together with neural artifacts (diffusion grids), but generally something in the direction of the recently viral x.com/zan2434/status…

There are also improvements necessary and pending at the input. Neither audio nor text nor video alone is enough, e.g. I feel a need to point/gesture to things on the screen, similar to all the things you would do with a person physically next to you and your computer screen.

TLDR: The input/output mind meld between humans and AIs is ongoing and there is a lot of work to do and significant progress to be made, way before jumping all the way into neuralink-esque BCIs and all that. For what's worth exploring at the current stage, hot tip: try asking for HTML.
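A minimal sketch of the "ask for HTML, then open it in your browser" tip from the post above, in Python. The post names no particular model or API, so the LLM call is left as a caller-supplied function (`ask_llm` is a hypothetical placeholder), and the helper names and the `llm_response.html` filename are illustrative assumptions as well.

```python
import re
import tempfile
import webbrowser
from pathlib import Path
from typing import Callable, Optional

# Instruction appended to every query, quoted from the post.
HTML_SUFFIX = "Structure your response as HTML."

# Built programmatically so this listing contains no literal code fence.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"(?:html)?\s*\n(.*?)" + FENCE, re.DOTALL)


def build_prompt(query: str) -> str:
    """Append the HTML request to the end of the user's query."""
    return f"{query.rstrip()}\n\n{HTML_SUFFIX}"


def extract_html(reply: str) -> str:
    """Return the fenced HTML if the model wrapped it, else the whole reply."""
    match = FENCED_BLOCK.search(reply)
    return (match.group(1) if match else reply).strip()


def save_html(html: str, directory: Optional[str] = None) -> Path:
    """Write the HTML to llm_response.html and return the file path."""
    folder = Path(directory) if directory else Path(tempfile.mkdtemp())
    path = folder / "llm_response.html"
    path.write_text(html, encoding="utf-8")
    return path


def ask_as_html(
    query: str,
    ask_llm: Callable[[str], str],  # hypothetical: takes a prompt, returns text
    open_browser: bool = True,
) -> Path:
    """Send the query with the HTML suffix, save the reply, optionally view it."""
    reply = ask_llm(build_prompt(query))
    path = save_html(extract_html(reply))
    if open_browser:
        webbrowser.open(path.resolve().as_uri())
    return path
```

To try it, pass any function that sends a prompt to your model of choice and returns its text reply, e.g. `ask_as_html("Compare TCP and UDP", my_llm_call)`. Models sometimes wrap the page in a code fence, which is why the sketch strips one if present.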

I made an app that simplifies the hard parts of a text when you click on them.

The American mind cannot comprehend this 12-piece sashimi and 10-piece sushi in Japan costing $3 and $2 respectively. And it genuinely tastes better than most restaurants back home.


This is almost as true in the US.


Unitree has just launched UNISTORE, the humanoid-robot version of the App Store.

The platform lets users create, develop, publish, purchase, and deploy robot actions, skills, and apps with one click, supporting all Unitree models.

It’s still early stage: only the motion library is live (basic moves, dances, martial arts, etc.), all officially published and available as limited-time free downloads. The dataset module (which many are eager to see) is currently restricted to approved developers. It’s unclear yet whether datasets will be tradable in the future. (What do you think?)

Unitree is clearly building a full ecosystem platform. Currently available only in China, with the international version expected soon.

unistore.unitree.com



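I had set up separate Slack channels for talking to Claude and to Codex, but putting them in the same channel turned out nicely. Basically each one investigates and answers on its own, but each also sees the other's conversation, and if I tell the two of them to talk it over, they do.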




The katakana word 「コミュニケーション」 ("communication") is used so often that I figured it would be about 2 tokens, but it turns out to eat up 5.
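One plausible factor behind the count above, sketched in Python. The post doesn't say which tokenizer was used, so this assumes a byte-level BPE tokenizer (an assumption), which starts from UTF-8 bytes. The sketch only counts characters and bytes; it does not compute token counts.

```python
def utf8_profile(word: str) -> tuple[int, int]:
    """Return (character count, UTF-8 byte count) for a word."""
    return len(word), len(word.encode("utf-8"))


katakana = "コミュニケーション"
english = "communication"

# Each katakana character takes 3 bytes in UTF-8; ASCII letters take 1.
print(utf8_profile(katakana))  # (9, 27)
print(utf8_profile(english))   # (13, 13)
```

The katakana spelling has fewer characters but roughly twice the bytes, so a tokenizer working from bytes has more raw material to merge before the word collapses into a few tokens.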







