Robert Tercek

5.4K posts

Robert Tercek

@Superplex

I'm the author of Vaporized. Spent 20 years inventing the future of games, TV, education, mobile. Now I help companies make the transition to digital domain.

Los Angeles and the world Katılım Mart 2009

686 Takip Edilen3.4K Takipçiler

Robert Tercek retweetledi

Bilawal Sidhu@bilawalsidhu·18 Şub

I’ve used the Apple Vision Pro for 2 weeks now and here are my unfiltered thoughts — you might even call it a hot take 🌶️ 😅 Overall: I'm blown away, absolutely hyped... but also? Frustrated. Why is Apple making it SO HARD to tap into the existing VR media scene? There is a plethora of VR 180 and 360 content out there. And they've got this sweet immersive video player tucked away in Apple TV, with that signature Apple polish. Great I can watch an Alicia Keys video. YouTubeVR has millions of more videos I can’t currently watch. So I try to drop manually converted spatial videos directly into the Mac Photos app – nope! Gotta jump through iCloud hoops to retain the metadata? It's ridiculous. Almost feels like Apple wants to gate keep ALL immersive content on the headset. Ok fine but you can shoot stuff on iPhone spatial videos or on the headset itself. Awesome, but to edit it you need to use a 3rd party converter to turn it back into well adopted VR media formats to edit in Adobe or Resolve lol. There are killer ARKit apps, it might’ve made sense to get some of those ported over! Missed layup. And don’t even get me started on the broken webXR support. 3D websites like Luma AI and Polycam should’ve been immersive 3d on day 1. It feels like they just shipped an unfinished product. Which might be the case, considering how much of that mind-blowing WWDC 2023 stuff is STILL missing. Remember those awesome shared SharePlay experiences they demoed? Poof, gone. Even simple stuff, like shared spatial anchors that ARKit already supports for multiplayer AR, is nowhere to be found. I mean they can pass through reality with imperceptible latency, but I can’t have shared experiences with users who also have Apple Vision Pro in the room, and I am relegated to iPads on screens? Not much better than zoom and a massive underutilization of what this hardware is capable of in terms of realistic co-presence and remote collaboration. Look, maybe this is all growing pains. Maybe they shipped with 20% of the roadmap ready, and we'll just have to wait. But right now, the Apple Vision Pro feels like this super shiny walled garden in the middle of a sprawling VR playground. Here's hoping they open the gates soon... 🔐 #AppleVisionPro #MetaQuest3 #frustratedfanboy

English

370

90.2K

Robert Tercek retweetledi

Jim Fan@DrJimFan·16 Şub

Apparently some folks don't get "data-driven physics engine", so let me clarify. Sora is an end-to-end, diffusion transformer model. It inputs text/image and outputs video pixels directly. Sora learns a physics engine implicitly in the neural parameters by gradient descent through massive amounts of videos. Sora is a learnable simulator, or "world model". Of course it does not call UE5 explicitly in the loop, but it's possible that UE5-generated (text, video) pairs are added as synthetic data to the training set.

English

755

191.6K

Robert Tercek retweetledi

Jim Fan@DrJimFan·16 Şub

I see some vocal objections: "Sora is not learning physics, it's just manipulating pixels in 2D". I respectfully disagree with this reductionist view. It's similar to saying "GPT-4 doesn't learn coding, it's just sampling strings". Well, what transformers do is just manipulating a sequence of integers (token IDs). What neural networks do is just manipulating floating numbers. That's not the right argument. Sora's soft physics simulation is an *emergent property* as you scale up text2video training massively. - GPT-4 must learn some form of syntax, semantics, and data structures internally in order to generate executable Python code. GPT-4 does not store Python syntax trees explicitly. - Very similarly, Sora must learn some *implicit* forms of text-to-3D, 3D transformations, ray-traced rendering, and physical rules in order to model the video pixels as accurately as possible. It has to learn concepts of a game engine to satisfy the objective. - If we don't consider interactions, UE5 is a (very sophisticated) process that generates video pixels. Sora is also a process that generates video pixels, but based on end-to-end transformers. They are on the same level of abstraction. - The difference is that UE5 is hand-crafted and precise, but Sora is purely learned through data and "intuitive". Will Sora replace game engine devs? Absolutely not. Its emergent physics understanding is fragile and far from perfect. It still heavily hallucinates things that are incompatible with our physical common sense. It does not yet have a good grasp of object interactions - see the uncanny mistake in the video below. Sora is the GPT-3 moment. Back in 2020, GPT-3 was a pretty bad model that required heavy prompt engineering and babysitting. But it was the first compelling demonstration of in-context learning as an emergent property. Don't fixate on the imperfections of GPT-3. Think about extrapolations to GPT-4 in the near future.

English

231

433

2.7K

990.7K

Robert Tercek retweetledi

Jim Fan@DrJimFan·15 Şub

If you think OpenAI Sora is a creative toy like DALLE, ... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths. I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be! Let's breakdown the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee." - The simulator instantiates two exquisite 3D assets: pirate ships with different decorations. Sora has to solve text-to-3D implicitly in its latent space. - The 3D objects are consistently animated as they sail and avoid each other's paths. - Fluid dynamics of the coffee, even the foams that form around the ships. Fluid simulation is an entire sub-field of computer graphics, which traditionally requires very complex algorithms and equations. - Photorealism, almost like rendering with raytracing. - The simulator takes into account the small size of the cup compared to oceans, and applies tilt-shift photography to give a "minuscule" vibe. - The semantics of the scene does not exist in the real world, but the engine still implements the correct physical rules that we expect. Next up: add more modalities and conditioning, then we have a full data-driven UE that will replace all the hand-engineered graphics pipelines. openai.com/sora

English

540

2.6K

12.9K

6.2M

Robert Tercek@Superplex·18 Eki

This is well worthwhile reading

Nathan Benaich@nathanbenaich

🪩The @stateofai 2023 is now here. Our 6th installment is one of the most exciting years I can remember. The #stateofai report covers everything you *need* to know, covering research, industry, safety and politics. There’s lots in there, so here’s my director’s cut 🧵

English

353

Robert Tercek retweetledi

Nathan Benaich@nathanbenaich·12 Eki

English

489

1.6K

971.1K

Robert Tercek retweetledi

TomLikesRobots🤖@TomLikesRobots·21 Tem

I'm absolutely blown away by @runwayml's #Gen2 using image input. The movement is so natural. Using it with @midjourney is a winning combination. If you want your video to stay true to your image, don't use a text prompt. (Thanks to @Uncanny_Harry and @Merzmensch for the tip!). This shows huge potential for creating #aicinema

Edinburgh, Scotland 🇬🇧 English

324

106.5K

Robert Tercek retweetledi

Voidz@voidzto·16 Oca

I think my reality is broken… #mixedrealityart #NFTCommunity #Web3 #nftarti̇st #NFTs #digitalartists

Toronto, Ontario 🇨🇦 English

172

21.7K

Robert Tercek retweetledi

Jim Fan@DrJimFan·30 Haz

Google is hosting the first "Machine Unlearning" challenge. Yes you heard it right - it's the art of forgetting, an emergent research field. GPT-4 lobotomy is a type of machine unlearning. OpenAI tried for months to remove abilities it deems unethical or harmful, sometimes going a bit too far. Unlike deleting data from disk, deleting knowledge from AI models (without crippling other abilities) is much harder than adding. But it is useful and sometimes necessary: ▸ Reduce toxic/biased/NSFW contents ▸ Comply with privacy, copyright, and regulatory laws ▸ Hand control back to content creators - people can request to remove their contribution to the dataset after a model is trained ▸ Update stale knowledge as new scientific discoveries arrive Check out the machine unlearning challenge: ai.googleblog.com/2023/06/announ…

English

514

2.3K

588.5K

Robert Tercek retweetledi

Andrej Karpathy@karpathy·30 Haz

I think this is mostly right. - LLMs created a whole new layer of abstraction and profession. - I've so far called this role "Prompt Engineer" but agree it is misleading. It's not just prompting alone, there's a lot of glue code/infra around it. Maybe "AI Engineer" is ~usable, though it takes something a bit too specific and makes it a bit too broad. - ML people train algorithms/networks, usually from scratch, usually at lower capability. - LLM training is becoming sufficently different from ML because of its systems-heavy workloads, and is also splitting off into a new kind of role, focused on very large scale training of transformers on supercomputers. - In numbers, there's probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers. - One can be quite successful in this role without ever training anything. - I don't fully follow the Software 1.0/2.0 framing. Software 3.0 (imo ~prompting LLMs) is amusing because prompts are human-designed "code", but in English, and interpreted by an LLM (itself now a Software 2.0 artifact). AI Engineers simultaneously program in all 3 paradigms. It's a bit 😵‍💫

swyx 🇸🇬 AIE Singapore!@swyx

🆕 Essay: The Rise of the AI Engineer latent.space/p/ai-engineer Keeping up on AI is becoming a full time job. Let's get together and define it.

English

140

707

4.1K

Robert Tercek@Superplex·14 Haz

I had a lively discussion with Jim Rutt about the WGA and copyright in the age of AI. Check it out!

Jim Rutt@jim_rutt

🎙️ w/ @Superplex on the writers' strike and IP in the era of generative AI. The history of Hollywood union negotiations, likely impacts on writers, the threat to influencers, why ChatGPT empowers writers in the near term, AI for education, & much more. jimruttshow.com/robert-tercek-…

English

814

Robert Tercek retweetledi

Drake Facts@NewsIn6ix·8 Haz

AI generated QR Code Art will be the next big thing. Heres whats possible. 1. Snowy Village

English

158

511

5.8K

1.4M

Robert Tercek retweetledi

Matt Wolfe@mreflow·7 Haz

We've seen text-to-image, text-to-3d object, and even text-to-video... Now check out text-to-3d character from @daz3d. Use natural language to create any character you can imagine in near-AAA game quality and then export that character directly into Blender, Unreal or Unity!

English

266

1.3K

287.1K

Robert Tercek retweetledi

Moritz Kremb@moritzkremb·7 Haz

You can easily create your own animated avatar. It takes less than 10 minutes. I'll show you how in 3 simple steps:

English

144

994

4.7K

1.5M

Robert Tercek retweetledi

fofr@fofrAI·5 May

🧵 A big #Midjourney thread on how to write prompts to get good cinematic images. In this thread I’ll build up a single prompt with cinematic elements, and show their effects. Each prompt will use a 16:9 aspect ratio, and to minimise variation I've locked in a seed.

English

414

2.6K

795.6K

Robert Tercek@Superplex·20 May

These examples don’t reveal anything that could plausibly “disrupt Hollywood” any time soon. But the progress is impressive and the trajectory is clear.

Nathan Lands@NathanLands

AI video has started to produce mindblowing results and could eventually disrupt Hollywood. (PT9) Here are the best AI videos I've found:

English

201

Robert Tercek@Superplex·8 May

Prompt tips for MJ 5.1. Enjoy

fofr@fofrAI

English

331

Robert Tercek@Superplex·8 May

Good info and charts here

Misha@mishadavinci

By 2025, ChatGPT will destabilize hundreds of millions of white-collar jobs. It could mean mass job loss for the highly educated. Here's what you need to know:

English

213

Robert Tercek retweetledi

Rowan Cheung@rowancheung·21 Nis

Another huge day in the world of AI with announcements from: Snapchat 'My AI' Synthesis AI Google Brain and Deepmind Martin Shkreli Here's a rundown on everything you need to know:

English

167

1.2K

904.3K

Robert Tercek retweetledi

Sully@SullyOmarr·20 Nis

Stability just released their new LLM. It's open-source, has 7bparameters, and its entirely free to use commercially. And its a MASSIVE deal, that has the potential to change up everything in AI Here's why:

English

288

1.9K

741.6K

Keşfet

@stateofai @runwayml @midjourney @Uncanny_Harry @Merzmensch @daz3d @elonmusk @BarackObama