Robert Tercek

5.4K posts

@Superplex

I'm the author of Vaporized. Spent 20 years inventing the future of games, TV, education, mobile. Now I help companies make the transition to the digital domain.

Los Angeles and the world · Joined March 2009
686 Following · 3.4K Followers
Robert Tercek retweeted
Bilawal Sidhu @bilawalsidhu
I've used the Apple Vision Pro for 2 weeks now, and here are my unfiltered thoughts. You might even call it a hot take 🌶️ 😅

Overall: I'm blown away, absolutely hyped... but also? Frustrated. Why is Apple making it SO HARD to tap into the existing VR media scene? There is a plethora of VR180 and 360 content out there. And they've got this sweet immersive video player tucked away in Apple TV, with that signature Apple polish. Great, I can watch an Alicia Keys video. YouTube VR has millions more videos I can't currently watch.

So I try to drop manually converted spatial videos directly into the Mac Photos app. Nope! Gotta jump through iCloud hoops to retain the metadata? It's ridiculous. It almost feels like Apple wants to gatekeep ALL immersive content on the headset.

OK, fine, you can shoot spatial videos on iPhone or on the headset itself. Awesome, but to edit them you need a third-party converter to turn them back into well-adopted VR media formats for Adobe or Resolve, lol.

There are killer ARKit apps; it might've made sense to get some of those ported over. Missed layup. And don't even get me started on the broken WebXR support. 3D websites like Luma AI and Polycam should've been immersive 3D on day one.

It feels like they just shipped an unfinished product. Which might be the case, considering how much of that mind-blowing WWDC 2023 stuff is STILL missing. Remember those awesome SharePlay experiences they demoed? Poof, gone. Even simple stuff, like the shared spatial anchors that ARKit already supports for multiplayer AR, is nowhere to be found.

I mean, they can pass through reality with imperceptible latency, but I can't have shared experiences with other Apple Vision Pro users in the room, and I'm relegated to iPads on screens? Not much better than Zoom, and a massive underutilization of what this hardware is capable of in terms of realistic co-presence and remote collaboration.

Look, maybe this is all growing pains. Maybe they shipped with 20% of the roadmap ready, and we'll just have to wait. But right now, the Apple Vision Pro feels like a super shiny walled garden in the middle of a sprawling VR playground. Here's hoping they open the gates soon... 🔐 #AppleVisionPro #MetaQuest3 #frustratedfanboy
82 replies · 30 reposts · 370 likes · 90.2K views
Robert Tercek retweeted
Jim Fan @DrJimFan
Apparently some folks don't get "data-driven physics engine", so let me clarify. Sora is an end-to-end, diffusion transformer model. It inputs text/image and outputs video pixels directly. Sora learns a physics engine implicitly in the neural parameters by gradient descent through massive amounts of videos. Sora is a learnable simulator, or "world model". Of course it does not call UE5 explicitly in the loop, but it's possible that UE5-generated (text, video) pairs are added as synthetic data to the training set.
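The training setup Jim describes (a model that absorbs structure from data purely by denoising and gradient descent) can be sketched in a few lines. This is a toy illustration, not Sora's actual architecture: a single linear map stands in for the diffusion transformer, and a flat 64-dim vector stands in for a video.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions for illustration): a flat vector plays the
# role of a video, a linear map plays the role of the "neural parameters".
D = 64
W = rng.normal(scale=0.1, size=(D, D))

def diffusion_step(W, x0, lr=1e-2):
    """One denoising training step: corrupt the clean sample with
    Gaussian noise, predict that noise, and descend the MSE gradient."""
    eps = rng.normal(size=D)          # noise added by the forward process
    x_noisy = x0 + eps
    eps_hat = W @ x_noisy             # the model's noise prediction
    err = eps_hat - eps
    loss = float(err @ err) / D       # mean squared error
    grad = np.outer(err, x_noisy) * (2 / D)
    return W - lr * grad, loss

x0 = rng.normal(size=D)               # one "clean video" sample
losses = []
for _ in range(300):
    W, loss = diffusion_step(W, x0)
    losses.append(loss)
```

Whatever regularities the training data has end up baked into `W` through this objective alone; that is the sense in which a "physics engine" can live implicitly in the parameters of a model trained on video.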
61 replies · 73 reposts · 754 likes · 191.6K views
Robert Tercek retweeted
Jim Fan @DrJimFan
I see some vocal objections: "Sora is not learning physics, it's just manipulating pixels in 2D". I respectfully disagree with this reductionist view. It's similar to saying "GPT-4 doesn't learn coding, it's just sampling strings". Well, what transformers do is just manipulate a sequence of integers (token IDs). What neural networks do is just manipulate floating-point numbers. That's not the right argument. Sora's soft physics simulation is an *emergent property* as you scale up text2video training massively.

- GPT-4 must learn some form of syntax, semantics, and data structures internally in order to generate executable Python code. GPT-4 does not store Python syntax trees explicitly.
- Very similarly, Sora must learn some *implicit* forms of text-to-3D, 3D transformations, ray-traced rendering, and physical rules in order to model the video pixels as accurately as possible. It has to learn concepts of a game engine to satisfy the objective.
- If we don't consider interactions, UE5 is a (very sophisticated) process that generates video pixels. Sora is also a process that generates video pixels, but based on end-to-end transformers. They are on the same level of abstraction.
- The difference is that UE5 is hand-crafted and precise, while Sora is purely learned from data and "intuitive".

Will Sora replace game engine devs? Absolutely not. Its emergent physics understanding is fragile and far from perfect. It still heavily hallucinates things that are incompatible with our physical common sense. It does not yet have a good grasp of object interactions - see the uncanny mistake in the video below.

Sora is the GPT-3 moment. Back in 2020, GPT-3 was a pretty bad model that required heavy prompt engineering and babysitting. But it was the first compelling demonstration of in-context learning as an emergent property. Don't fixate on the imperfections of GPT-3. Think about extrapolations to GPT-4 in the near future.
235 replies · 436 reposts · 2.7K likes · 990.5K views
Robert Tercek retweeted
Jim Fan @DrJimFan
If you think OpenAI Sora is a creative toy like DALL·E... think again. Sora is a data-driven physics engine. It is a simulation of many worlds, real or fantastical. The simulator learns intricate rendering, "intuitive" physics, long-horizon reasoning, and semantic grounding, all by some denoising and gradient maths. I won't be surprised if Sora is trained on lots of synthetic data using Unreal Engine 5. It has to be!

Let's break down the following video. Prompt: "Photorealistic closeup video of two pirate ships battling each other as they sail inside a cup of coffee."

- The simulator instantiates two exquisite 3D assets: pirate ships with different decorations. Sora has to solve text-to-3D implicitly in its latent space.
- The 3D objects are consistently animated as they sail and avoid each other's paths.
- Fluid dynamics of the coffee, even the foams that form around the ships. Fluid simulation is an entire sub-field of computer graphics, which traditionally requires very complex algorithms and equations.
- Photorealism, almost like rendering with ray tracing.
- The simulator takes into account the small size of the cup compared to oceans, and applies tilt-shift photography to give a "minuscule" vibe.
- The semantics of the scene do not exist in the real world, but the engine still implements the correct physical rules that we expect.

Next up: add more modalities and conditioning, then we have a full data-driven UE that will replace all the hand-engineered graphics pipelines. openai.com/sora
551 replies · 2.6K reposts · 13K likes · 6.2M views
Robert Tercek @Superplex
This is well worth reading
Nathan Benaich @nathanbenaich

🪩The @stateofai 2023 is now here. Our 6th installment covers one of the most exciting years I can remember. The #stateofai report covers everything you *need* to know across research, industry, safety and politics. There's lots in there, so here's my director's cut 🧵

0 replies · 0 reposts · 3 likes · 345 views
Robert Tercek retweeted
Nathan Benaich @nathanbenaich
🪩The @stateofai 2023 is now here. Our 6th installment covers one of the most exciting years I can remember. The #stateofai report covers everything you *need* to know across research, industry, safety and politics. There's lots in there, so here's my director's cut 🧵
57 replies · 491 reposts · 1.6K likes · 970.7K views
Robert Tercek retweeted
TomLikesRobots🤖 @TomLikesRobots
I'm absolutely blown away by @runwayml's #Gen2 using image input. The movement is so natural. Using it with @midjourney is a winning combination. If you want your video to stay true to your image, don't use a text prompt. (Thanks to @Uncanny_Harry and @Merzmensch for the tip!). This shows huge potential for creating #aicinema
Edinburgh, Scotland 🇬🇧
31 replies · 43 reposts · 324 likes · 106.5K views
Robert Tercek retweeted
Jim Fan @DrJimFan
Google is hosting the first "Machine Unlearning" challenge. Yes, you heard it right: it's the art of forgetting, an emergent research field. GPT-4 lobotomy is a type of machine unlearning. OpenAI tried for months to remove abilities it deems unethical or harmful, sometimes going a bit too far. Unlike deleting data from disk, deleting knowledge from AI models (without crippling other abilities) is much harder than adding it. But it is useful and sometimes necessary:

▸ Reduce toxic/biased/NSFW content
▸ Comply with privacy, copyright, and regulatory laws
▸ Hand control back to content creators - people can request to remove their contribution to the dataset after a model is trained
▸ Update stale knowledge as new scientific discoveries arrive

Check out the machine unlearning challenge: ai.googleblog.com/2023/06/announ…
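One common baseline for the kind of unlearning described here is gradient *ascent* on the data to be forgotten, interleaved with ordinary descent on the data to be retained. A toy sketch with a hand-rolled logistic regression (my own illustration, not the challenge's method):

```python
import numpy as np

rng = np.random.default_rng(1)

def blob(center, label, n=50):
    """A small Gaussian cluster of labelled 2D points."""
    return rng.normal(size=(n, 2)) * 0.3 + center, np.full(n, label)

# Retain data: separable along the first axis alone.
Xa, ya = blob((-2.0, 0.0), 0)
Xb, yb = blob((+2.0, 0.0), 1)
Xr, yr = np.vstack([Xa, Xb]), np.concatenate([ya, yb])
# Forget data: class-1 points recognisable only via the second axis.
Xf, yf = blob((0.0, +3.0), 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y):
    """Gradient of the mean logistic loss; w = (w1, w2, bias)."""
    g = sigmoid(X @ w[:2] + w[2]) - y
    return np.array([g @ X[:, 0], g @ X[:, 1], g.sum()]) / len(y)

def acc(w, X, y):
    return float(((sigmoid(X @ w[:2] + w[2]) > 0.5) == y).mean())

# 1) Ordinary training on everything: both regions get learned.
w = np.zeros(3)
X_all, y_all = np.vstack([Xr, Xf]), np.concatenate([yr, yf])
for _ in range(500):
    w -= 0.5 * grad(w, X_all, y_all)
acc_forget_before = acc(w, Xf, yf)

# 2) Unlearning: ascend the loss on the forget set, while descending on
#    the retain set so other abilities are not crippled. Stop once the
#    forget set is effectively forgotten.
for _ in range(2000):
    if acc(w, Xf, yf) < 0.05:
        break
    w += 0.5 * grad(w, Xf, yf)   # push the forgotten knowledge out
    w -= 0.5 * grad(w, Xr, yr)   # keep the retained knowledge intact

acc_forget_after = acc(w, Xf, yf)
acc_retain_after = acc(w, Xr, yr)
```

The forget cluster is only recognisable through the second feature, so the ascent step can erase it while the retain-set descent keeps the rest of the decision boundary in place, illustrating why targeted forgetting is harder than simply deleting rows of data.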
93 replies · 517 reposts · 2.3K likes · 588.5K views
Robert Tercek retweeted
Andrej Karpathy @karpathy
I think this is mostly right.

- LLMs created a whole new layer of abstraction and profession.
- I've so far called this role "Prompt Engineer" but agree it is misleading. It's not just prompting alone; there's a lot of glue code/infra around it. Maybe "AI Engineer" is ~usable, though it takes something a bit too specific and makes it a bit too broad.
- ML people train algorithms/networks, usually from scratch, usually at lower capability.
- LLM training is becoming sufficiently different from ML because of its systems-heavy workloads, and is also splitting off into a new kind of role, focused on very large-scale training of transformers on supercomputers.
- In numbers, there are probably going to be significantly more AI Engineers than there are ML engineers / LLM engineers.
- One can be quite successful in this role without ever training anything.
- I don't fully follow the Software 1.0/2.0 framing. Software 3.0 (imo ~prompting LLMs) is amusing because prompts are human-designed "code", but in English, and interpreted by an LLM (itself now a Software 2.0 artifact). AI Engineers simultaneously program in all 3 paradigms. It's a bit 😵‍💫
swyx @swyx

🆕 Essay: The Rise of the AI Engineer latent.space/p/ai-engineer Keeping up on AI is becoming a full time job. Let's get together and define it.

142 replies · 710 reposts · 4.1K likes · 2M views
Robert Tercek @Superplex
I had a lively discussion with Jim Rutt about the WGA and copyright in the age of AI. Check it out!
Jim Rutt @jim_rutt

🎙️ w/ @Superplex on the writers' strike and IP in the era of generative AI. The history of Hollywood union negotiations, likely impacts on writers, the threat to influencers, why ChatGPT empowers writers in the near term, AI for education, & much more. jimruttshow.com/robert-tercek-…

0 replies · 1 repost · 1 like · 813 views
Robert Tercek retweeted
Drake Facts @NewsIn6ix
AI-generated QR code art will be the next big thing. Here's what's possible. 1. Snowy Village
159 replies · 512 reposts · 5.8K likes · 1.4M views
Robert Tercek retweeted
Matt Wolfe @mreflow
We've seen text-to-image, text-to-3D object, and even text-to-video... Now check out text-to-3D character from @daz3d. Use natural language to create any character you can imagine in near-AAA game quality, and then export that character directly into Blender, Unreal or Unity!
74 replies · 269 reposts · 1.3K likes · 287K views
Robert Tercek retweeted
Moritz Kremb @moritzkremb
You can easily create your own animated avatar. It takes less than 10 minutes. I'll show you how in 3 simple steps:
144 replies · 1K reposts · 4.7K likes · 1.5M views
Robert Tercek retweeted
fofr @fofrAI
🧵 A big #Midjourney thread on how to write prompts to get good cinematic images. In this thread I’ll build up a single prompt with cinematic elements, and show their effects. Each prompt will use a 16:9 aspect ratio, and to minimise variation I've locked in a seed.
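The mechanics fofr describes (fixing a 16:9 aspect ratio and a seed, then layering cinematic elements onto one base prompt) can be mimicked with a few lines of string-building. `--ar` and `--seed` are real Midjourney parameters; the scene and the element list below are my own invented placeholders, not the ones from the thread:

```python
# Build up one base prompt element by element, keeping the aspect ratio
# and seed fixed so only the newly added element changes between images.
base = "a lone astronaut walking through a desert"
elements = ["cinematic lighting", "anamorphic lens flare", "35mm film grain"]

prompts = []
accumulated = [base]
for element in elements:
    accumulated.append(element)
    prompts.append(", ".join(accumulated) + " --ar 16:9 --seed 1234")

for p in prompts:
    print(p)
```

Pinning the seed is what makes the comparison meaningful: with composition held roughly constant, each successive image isolates the effect of the element just added.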
93 replies · 417 reposts · 2.6K likes · 795.4K views
Robert Tercek @Superplex
Prompt tips for MJ 5.1. Enjoy
fofr @fofrAI

🧵 A big #Midjourney thread on how to write prompts to get good cinematic images. In this thread I’ll build up a single prompt with cinematic elements, and show their effects. Each prompt will use a 16:9 aspect ratio, and to minimise variation I've locked in a seed.

1 reply · 0 reposts · 2 likes · 331 views
Robert Tercek retweeted
Rowan Cheung @rowancheung
Another huge day in the world of AI, with announcements from:

- Snapchat 'My AI'
- Synthesis AI
- Google Brain and DeepMind
- Martin Shkreli

Here's a rundown on everything you need to know:
34 replies · 171 reposts · 1.2K likes · 904.3K views
Robert Tercek retweeted
Sully @SullyOmarr
Stability just released their new LLM. It's open-source, has 7B parameters, and it's entirely free to use commercially. And it's a MASSIVE deal that has the potential to change up everything in AI. Here's why:
51 replies · 288 reposts · 1.9K likes · 741.6K views