Progen

22.1K posts

@imcharleslo

A Pissed Right Off Genetically Engineered Nerd

Singapore · Joined April 2011

6.2K Following · 925 Followers

Pinned Tweet

Progen@imcharleslo·Oct 13

A Proof-of-Concept of an AI Assistant Designer using #UnrealEngine's #Metahuman, #stablediffusion, #OpenAI's Whisper and #GPT3.

345

1.5K

Progen retweeted

RepoGems@RepoGems·Apr 4

Open Source Voice Agent Platform github.com/dograh-hq/dogr…

Progen retweeted

Ethan Buntario@productbun·2d

I got tired of this:
• Drag file into browser tab
• Upload to God-knows-what cloud
• Wait for it to process
• Share a link that breaks when you edit
So I built a CLI that turns local files into live URLs in 280ms.

Progen retweeted

Wietse Buwalda@WietseBuwalda·Nov 8

Phenomenal situational awareness picture. There is a significant gap in the market for a system that combines the likes of MarineTraffic and FlightRadar24 into one picture. Coupling it with ENCs in S57 or newer S100 charts and Aero charts would be the icing on the cake.

Chris Dalke@chris_dalke

I didn't build this user interface to watch planes fly over me on ADSB but it sure makes good test data

3.9K

Progen retweeted

Latte@0xbisc·2d

Grace Dive made with GPT Image 2 + Seedance 2 #DreaminaCPP @dreamina_ai Prompt below ↓

214

2.3K

172.9K

Progen retweeted

Vincent Logic | 信号＞噪音@VincentLogic·1d

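Found a stunning AI tool for motion capture! MoCap Anything V2 extracts motion data directly from ordinary video, and it handles humans, animals, even odd creatures. The demos are genuinely impressive: ✅ Eagles spreading their wings in flight, cheetahs running, snakes slithering ✅ Hamsters dancing, ostriches walking, crocodiles crawling ✅ Game characters, cartoon figures, all kinds of 3D models. Core breakthrough: traditional mocap works in two steps: first detect joint positions → then use inverse kinematics to convert them into bone rotations. The problem lies in the second step: the traditional approach is not learnable, so joints often twist, bones twitch, and motion jitters. V2 uses an end-to-end AI model instead: ✅ Understands motion directly from video ✅ Outputs the final bone rotations in one step ✅ No longer splits the work into multiple independent steps ✅ The AI corrects its own errors, optimizing for the final animation result. V1 vs V2: ✅ V1 often had jittery bones and erratically rotating joints ✅ V2 is far more stable, with natural, accurate motion. Use cases: animation, game development, film VFX, robot training... What used to require professional equipment, a mocap studio, and a complex cleanup pipeline can now be done with a single reference video driving any 3D character. If you work in 3D animation or game development, you have to try this! The project link is in the comments 👇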

121

7.8K

Progen retweeted

Kevin Lin@KevinQHLin·1d

🌟Introducing🎻Violin — an Open-source Video Translation Skill. 📹Video is the dominant medium on the internet, yet most high-quality content (lecture, talk, podcast) is locked behind a single language, leaving global audiences behind. So we built Violin: a video skill that combines speech recognition, LLM translation, and speech synthesis into one seamless pipeline. 🌐 Demo: violin-ai.com 📝 Blog: together.ai/blog/violin-op… 🔗 GitHub: github.com/shang-zhu/viol… ✨Key Features: 🎙️High-quality multilingual ASR & Translation & TTS. 🗣️Personalize translation & voice (turn an academic talk into something children can follow). 💬Chat with the video — ask any questions grounded in the video. 🧩Support Web app, CLI, and Agent skill 🍃Fully open-source under MIT. ❤️Built with the wonderful @ShangZhu18 and advised by @james_y_zou ! All features powered by @togethercompute . Try it and let us know what you think! 🎻

130

610

113.1K

Progen retweeted

阿蔺A-Lin@alin_zone·2d

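Wow, this is the most hand-holding soft-router tutorial I have ever seen: lots of step-by-step screenshots, and the padding around every screenshot is even identical, which is friendly to perfectionists. ⚠️ Here is the key point: once the soft router is set up, all the network traffic at home goes through it, and with things like a residential IP configured, I no longer have to worry about my Claude Code account getting banned over network issues.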

半格 / HalfBit@justhalfbit

x.com/i/article/2054…

133

516

100K

Progen retweeted

Tom Huang@tuturetom·2d

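Officially open-sourcing html-anything 🚀 Experience, 1:1, the HTML results proposed by the author of Claude Code that went viral across the internet! Your agent can now turn any data into HTML with world-class design 🔥 It took 3 days and 15,000 lines of code! It supports 75 Skills and 9 export formats, and works with all code agents, including claude code, codex, openclaw, hermes, and others 💥 Link in the comments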

112

491

2.8K

463.2K

Progen retweeted

CopyRebeldia@CopyRebeldia·1d

Today an entire industry stopped making sense. Some guy published a repo on GitHub that turns any photo into an explorable 3D world: meshes with physics, a splat for the background, ambient audio. Everything. One image goes in. One world comes out. Five minutes. People who spent ten years learning Blender have been staring at this in silence all day. It's called image-blaster.

221

2.2K

14.2K

937.5K

Progen@imcharleslo·1d

@jaezun_ Is that Claude's AgentSDK?

Progen retweeted

jayson@jaezun_·1d

Day 5: introducing agent birthing. Find me a better agent creation UX and I will buy you coffee.

1.2K

123.9K

Progen retweeted

Bilawal Sidhu@bilawalsidhu·1d

Semantically annotating 3D gaussian splats on the fly using gemini 3.1 + sparkjs
1. Load any 3D scene and hit scan
2. Get 2D detections from VLM
3. Cluster outputs & project into 3D world space
4. Save as a persistent 3D semantic layer
Inspired by @alexanderchen's experiments with gemini visual intelligence. Just had to try to lift it from 2D to 3D!

107

891

48.4K

Progen retweeted

Haoyi Zhu@HaoyiZhu·1d

🤩Excited to share SANA-WM: a 2.6B open-source world model for minute-scale 720p video generation. Given one image + text + a 6-DoF camera trajectory, it synthesizes action-controllable 60s worlds on a single GPU. Project: nvlabs.github.io/Sana/WM/ Paper: huggingface.co/papers/2605.15…

120

923

102.5K

Progen retweeted

阿西_出海@axichuhai·2d

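Found another crazy open-source project, called agency-agents. This guy has turned almost every job in the world into an AI employee, including front-end developer, UI designer, social media operator, sales, market analyst, data engineer, legal counsel... There are already 144 AI employees, and more keep being added. It has shot past 60K GitHub stars and is completely free and open source. In "Little Lobster" (小龙虾) or Claude Code, you can have a whole virtual team running within minutes. Whatever role you need, just call up that AI employee and get to work right away. It suits cases where you need a specialized skill but it isn't worth hiring someone for it, such as indie developers, solo founders, and small teams.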

阿西_出海@axichuhai

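Damn, last week I missed an order worth 10,000+ because I didn't check my messages in time 😂 The message sat in the chat for two days, and by the time I saw it, the other party had already gone with someone else. I've realized that managing chat history matters a lot, for companies and individuals alike. Yet customer communication records, decisions from team discussions, and the details of project progress are all scattered across various chat windows. Most of the time this information is either forgotten or takes time to organize by hand. In the past couple of days I found a platform called @TankaChat, an AI-native team collaboration tool that continuously distills your conversations, files, and decisions into AI memory and lets an Agent manage my chat history. After using it, I found that AI really can start doing work for me. For example, Tanka can automatically compile meeting minutes; normally I have to write up the summary myself after a discussion, and now Tanka does it automatically. It can also organize my to-do items; sometimes I forget the todos from conversations with others, and it helps me avoid dropping them. I also found that it can connect to Telegram and WhatsApp, which surprised me. It can read the messages on those platforms, help me manage important contacts, and remind me to reply to them first. If I'd had this tool last month, I wouldn't have lost that order. By the way, you can currently claim 1 month of the Plus Plan for free; if you're interested, give it a try: t.tanka.ai/campaign/59122 Official site: tanka.ai/slack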

262

1.1K

262.2K

Progen retweeted

Rohan@lets_dig_deeper·2d

an audio model that can switch languages, accents, tone, gender, emotions in a single instance. this is silk mulberry 1.5, our most cost efficient model! by @rumik_ai research lab

626

52.5K

Progen retweeted

Dev Shah@0xDevShah·1d

We're releasing a whole new category of voice models. Introducing DramaBox — our state-of-the-art, open source voice model built for cinematic use cases. Traditional TTS gives you a voice. DramaBox by @resembleai gives you a performance. For too long, Voice AI has been stuck in "robotic assistant" mode. If you wanted dramatic emotion, sighs, or a voice cracking with grief, you had to hire an actor or spend hours editing. We fixed that.

411

32.5K

Progen retweeted

Berryxia.AI@berryxia·1d

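I came across this in the small hours and a chill ran down my spine; I had goosebumps all over. @zcbenz, MLX maintainer and creator of Electron.js, broke this news himself at Apple: MLX's CUDA backend now passes all tests! MLX, once dismissed as an "Apple-silicon-only toy", has now charged straight onto NVIDIA's home turf. The same code. Silky smooth on the Mac, and running at full speed on NVIDIA GPUs too. While everyone was still struggling in PyTorch's compatibility hell, Apple quietly played a trump card with MLX. The cross-platform era of local AI really is coming. And it is arriving harder and faster than anyone imagined. All I feel right now is my blood pumping. The CUDA era of MLX has officially begun. Can you believe it!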

Cheng@zcbenz

We have achieved a milestone in MLX that all tests are passing in CUDA backend now.

525

164.3K

Progen retweeted

Jason Ginsberg@JasonBud·1d

Grok Build is a fully interactive CLI, which means you can actually use your mouse to click. No flickers. Especially useful as I find myself running 5+ agents at a time and jumping between plans.

xAI@xai

An early beta of Grok Build, an agentic CLI for coding, building apps, and automating workflows is now available for SuperGrok Heavy subscribers. Through this early beta, we will improve the model and product based on your feedback. Try it at x.ai/cli

209

173

1.8K

11M

Progen retweeted

Sac@Saccc_c·2d

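Inspired by the 3D biological structure visuals that drew nearly ten million views across the internet, I made a 3D exhibition of Sanxingdui artifacts. I think visualizing historical artifacts has huge commercial value, because the 3D tours that museums in China currently produce still leave a lot to be desired. The method is also very simple: 1. Take screenshots of artifacts from the Sanxingdui Museum's official website, have Image 2.0 generate a clear front view, then generate the 3D image directly in Tripo. (I bought my Tripo membership on Xianyu.) 2. Have Claude Code replicate the repository open-sourced by Huang (@servasyy_ai), except that what I need is a museum 3D browsing view, with the content and style design both based on the Sanxingdui Museum's official website and the 3D images I downloaded. Huang's repository: github.com/huangserva/3DC… Now please enjoy an immersive tour: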

188

1.1K

81.8K

Progen retweeted

Creative.Edge CL+@commonstyle·2d

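Test: checking OpenAI Codex's physical-reasoning performance (because it is almost impossible for a human to write or edit the prompt). The results of Codex's trial and error were turned into video with Seedance 2.0. Note: this is an example of spending compute on plausible-looking (that is, skillfully deceptive) creativity rather than on accuracy. OpenAI Codex, Seedance 2.0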

281

26.9K
