Jae Lee
@_jae_lee

45 posts

Founder & CEO @twelve_labs. @UCBerkeley, @techstars alum. I build large video-understanding neural nets with the smartest folks :)

San Francisco, CA · Joined December 2020
235 Following · 260 Followers

Pinned Tweet
Jae Lee@_jae_lee·
Congratulations @twelve_labs, @seo_minjoon and @ai_den_lee on the achievement! Folks, check out the tech report on Pegasus-1 17B. We ask the model some really interesting questions, and the results are not too shabby 🙏❤️ More to come!
TwelveLabs (twelvelabs.io)@twelve_labs

🚀 We're excited to share the technical report of Pegasus-1, our 17B-parameter VLM, setting new benchmarks in video understanding. It surpasses larger models like Gemini Pro and Ultra in video conversation, QA, summarization, and temporal understanding. bit.ly/pegasus-1-tech…

Jae Lee@_jae_lee·
Thanks for having me @lucktm and @NEA. Had so much fun jamming! 😄
NEA@NEA

🗣️ “Whenever I think about meeting founders, one of my questions I ask is, is this a problem they're obsessed with?” says NEA Partner @lucktm. “And when I met Twelve Labs, 100%, that came through.” @twelve_labs co-founder @_jae_lee joins Tiffany as part of our Founder Forward interview series, where NEA speaks with the leaders of the startups we’ve partnered with about the technology and market trends driving their businesses. Check out the full conversation below. 👇🏽 nea.com/blog/twelve-la… #ai #llm #data

Jae Lee@_jae_lee·
@aidangomez Next time, let's drink makgeolli together~ 🍻
Aidan Gomez@aidangomez·
I had such a great time in Seoul last week that I'm excited at the thought of visiting again. I wrapped up the trip drinking plenty of makgeolli and somaek with friends who had moved from Toronto to Seoul.
Jae Lee@_jae_lee·
@Ji_Ha_Kim All good! The more models, the better ❤️
Jae Lee@_jae_lee·
100%. A model's generative capabilities do not necessarily equate to strong perceptual reasoning and understanding.
Yann LeCun@ylecun

Let me clear up a *huge* misunderstanding here. The generation of mostly realistic-looking videos from prompts *does not* indicate that a system understands the physical world. Generation is very different from causal prediction from a world model.

The space of plausible videos is very large, and a video generation system merely needs to produce *one* sample to succeed. The space of plausible continuations of a real video is *much* smaller, and generating a representative chunk of those is a much harder task, particularly when conditioned on an action. Furthermore, generating those continuations would be not only expensive but totally pointless. It's much more desirable to generate *abstract representations* of those continuations that eliminate details in the scene that are irrelevant to any action we might want to take.

That is the whole point behind JEPA (Joint Embedding Predictive Architecture), which is *not generative* and makes predictions in representation space. Our work on VICReg, I-JEPA, and V-JEPA, and the work of others, shows that joint embedding architectures produce much better representations of visual inputs than generative architectures that reconstruct pixels (such as variational AEs, masked AEs, denoising AEs, etc.). When the learned representations are used as inputs to a supervised head trained on downstream tasks (without fine-tuning the backbone), joint embedding beats generative. See the results table from the V-JEPA blog post or paper: ai.meta.com/blog/v-jepa-ya…

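The distinction LeCun draws can be made concrete with a toy numerical sketch (my own illustration, not Meta's V-JEPA code): the "next frame" is an object shifted down one row plus fresh, unpredictable background noise. A pixel-space generator is penalized for failing to reproduce the noise; a predictor in representation space only has to model the object's motion.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(frame):
    # Abstract representation: the row index where the bright object sits.
    return int(np.argmax(frame.sum(axis=1)))

def make_pair():
    # Frame t has the object at row `pos`; frame t+1 at row `pos + 1`.
    # Background noise is redrawn each frame, so it is unpredictable.
    pos = int(rng.integers(0, 7))
    f0 = rng.normal(0.0, 0.2, (8, 8)); f0[pos] += 3.0      # frame t
    f1 = rng.normal(0.0, 0.2, (8, 8)); f1[pos + 1] += 3.0  # frame t+1
    return f0, f1

pixel_err, repr_err = [], []
for _ in range(200):
    f0, f1 = make_pair()
    # Pixel-space "generation": the best guess is shifting frame t down a row,
    # but the fresh noise can never be predicted, so the error never reaches 0.
    pixel_err.append(float(np.mean((np.roll(f0, 1, axis=0) - f1) ** 2)))
    # Representation-space prediction: position -> position + 1.
    repr_err.append((encode(f0) + 1 - encode(f1)) ** 2)

print(f"pixel-space MSE: {np.mean(pixel_err):.3f}")  # stuck at the noise floor
print(f"repr-space  MSE: {np.mean(repr_err):.3f}")   # 0: irrelevant detail abstracted away
```

The representation-space predictor is exact because the encoding discards the noise; the pixel-space predictor pays for every irrelevant pixel it cannot know.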
TwelveLabs (twelvelabs.io)@twelve_labs·
Next Wednesday, our Chief Scientist @seo_minjoon will join a fireside chat organized by @CohereForAI. Register below and tune in for his perspectives on the evolving LLM landscape and the rise of Video-Language modeling! tinyurl.com/C4AIMinjoonSeo
Cohere Labs@Cohere_Labs

Don't forget to join us next week on Weds., 11/8 as we kick off the Cohere For AI Fireside Chat series. Our speaker, @seo_minjoon - Assistant Professor at KAIST & Chief Scientist of Twelve Labs - will sit down with @beyzaermis for a discussion on "Ever-evolving Language Models."

shai 🌻@shaiunterslak·
We just unf*cked the internet. Upload any video file and instantly remove all inappropriate audio content. This is NOT a find-and-replace for swear words; it uses a video understanding model to detect inappropriate content (using @twelve_labs, ofc). Example in the thread:
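The pipeline described has two parts: a video-understanding model that returns flagged time ranges (the Twelve Labs part, not shown here), and an audio step that silences those ranges. A minimal sketch of the muting step, assuming raw-sample audio and a hypothetical `mute_segments` helper:

```python
import numpy as np

def mute_segments(audio, sample_rate, flagged):
    """Zero out samples inside each flagged (start_s, end_s) range."""
    cleaned = audio.copy()
    for start_s, end_s in flagged:
        lo = int(start_s * sample_rate)
        hi = int(end_s * sample_rate)
        cleaned[lo:hi] = 0.0
    return cleaned

sr = 16_000
audio = np.ones(sr * 10, dtype=np.float32)  # 10 s of placeholder audio
flagged = [(2.0, 3.5), (7.25, 8.0)]         # ranges a moderation model might flag
cleaned = mute_segments(audio, sr, flagged)
```

In practice the flagged ranges would come from the model's classification output, and a crossfade at the segment edges would avoid audible clicks.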
Jae Lee retweeted
TwelveLabs (twelvelabs.io)@twelve_labs·
Our co-founder @soyoungacorn gave a talk at the #SFAIMeetup event we sponsored earlier this month. Give it a watch if you are curious about our journey from incubation at the 🇰🇷 cyber command!
Roger@OkGoDoIt

Thanks again to @twelve_labs for sponsoring my #SFAIMeetup on May 4th. Co-founder Soyoung Lee's talk "Making video as easy as text through multimodal foundation models" was an awesome journey from idea to successful startup. youtu.be/GtLBHtTlOEw @soyoungacorn #ai #ml

Jae Lee retweeted
TwelveLabs (twelvelabs.io)@twelve_labs·
Our latest article from @le_james94 reviews how far video understanding research has come, what potential remains untapped, and where it is headed in the future.