Bai Li

152 posts

@libai_94

ML Engineer & PhD in NLP

Vancouver, Canada · Joined August 2009

184 Following · 157 Followers

Bai Li @libai_94 · Aug 15

@basvanopheusden Thanks for sharing!

basvanopheusden @basvanopheusden · Aug 15

My friend @libai_94 runs a channel, Efficient NLP, with some cool tutorials! Highly recommend checking it out ;) youtube.com/@EfficientNLP

Bai Li @libai_94 · Sep 16

Voice Writer is live on Product Hunt 👍🔼 Check it out and show your support by giving an upvote if you find it useful! producthunt.com/posts/voice-wr…

Bai Li @libai_94 · Jun 25

@joshalbrecht Great work team and kudos to the very fast progress! 👏

Josh Albrecht @joshalbrecht · Jun 25

Today we're releasing:
- Cleaned up (and extended) versions of 11 public NLP benchmarks
- An open source method for automatically discovering scaling laws
- A guide to bringing up a 4000 GPU cluster from bare metal
- ...and more, see below!

Imbue @imbue_ai

Early this year, we trained a 70B model optimized for reasoning and coding. This model roughly matches LLAMA 3 70B despite being trained on 7x less data. Today, we’re releasing a toolkit to help others do the same, including:
• 11 sanitized and extended NLP reasoning benchmarks including ARC, GSM8K, HellaSwag, and Social IQa
• An original code-focused reasoning benchmark
• A new dataset of 450,000 human judgments about ambiguity in NLP questions
• A hyperparameter optimizer for scaling small experiments to a 70B run
• Infrastructure scripts for bringing a cluster from bare metal to robust high-utilization training
…and more! Read more and access the toolkit here: imbue.com/research/70b-i…

Bai Li @libai_94 · May 9

@isabelpapad @UBCLinguistics @KempnerInst Welcome to Vancouver! I’d love to catch up sometime once you’re here!

Isabel Papadimitriou @isabelpapad · May 9

Really really excited to be joining @UBCLinguistics! I'm so happy to get to work with the lovely people in the department. I'll be going to @KempnerInst in the interim, again lucky to work with lovely, interdisciplinary people. I'd love to hang if you're in Boston or Vancouver!

UBC Linguistics @UBCLinguistics

We are thrilled that Isabel Papadimitriou (@isabelpapad) will be joining @UBCLinguistics as an Assistant Professor as of Sept 2025!

Bai Li @libai_94 · Apr 8

Try out Voice Writer today: efficientnlp.com/voice-writer

Bai Li @libai_94 · Apr 8

Now, with voice technology and AI for grammar correction, I find I can produce 3-4x more content, between 2,000 and 3,000 words, without worrying about minor language issues! 🚀 This has allowed me to express my ideas more fully and creatively.

Bai Li @libai_94 · Apr 8

I've been using Voice Writer for my blog posts, and I am much more productive. 🌟 Before, my posts averaged around 750 words.

Bai Li @libai_94 · Mar 26

I built a voice writer tool to help you write things quickly. ⚡️ It uses AI for speech recognition and grammar correction. I have been using it for my book reviews, emails, Slack messages, and more. Here is a demo video. 😊 Try it out here: efficientnlp.com/voice-writer

Bai Li @libai_94 · Mar 2

In this video, I cover the top 10 most cited papers in the history of natural language processing, ranked by number of Google Scholar citations. 📚 We cover milestones like the Transformer model, RNN, word vectors, and even go back to the roots with WordNet!

Bai Li @libai_94 · Mar 2

New NLP video 🎥 youtube.com/watch?v=qQ9dF4…

Bai Li @libai_94 · Jan 21

Challenges we faced:
- Teochew is related to Mandarin, a high-resource language, but how do we apply transfer learning?
- With zero resources for training, we had to build our dataset from scratch. 🛠️
- Teochew doesn't even have a writing system! How do we model that? 🤔

Bai Li @libai_94 · Jan 21

🎥 New Video! In this video, we train a speech recognition model (using OpenAI's Whisper) to recognize our family's Chinese dialect, Teochew, or Chaozhou dialect (潮州话). It has about 10 million speakers and is a part of the Min Nan language family. youtube.com/watch?v=JH_78K…

Bai Li @libai_94 · Nov 6

More seriously - we'll use the RAG pattern, indexing HuggingFace metadata, integrating OpenAI embeddings with pgvector and chat models. I'll also explain some tips on how to rerank the chatbot's suggestions, deploy the project efficiently, and more.

Bai Li @libai_94 · Nov 6

📹 New Video! Ever had trouble deciding which AI to use for your projects? Let's solve that with AI. In this video, I will build an AI to find the best AI for you 🤯 youtu.be/2r-SqtxhgmY

Bai Li @libai_94 · Oct 22

@osanseviero I've made a video about the KV cache, and I've also got videos on other LLM topics like RoPE embeddings, speculative sampling, quantization, etc. If you like learning through colorful animated videos, check them out! youtube.com/watch?v=80bIUg…

Omar Sanseviero @osanseviero · Oct 22

Which are the best resources to learn about LLM topics such as KV cache for attention, parallelism techniques, MQA and GQA, GPTQ, AWQ, RoPE, and so on?

Bai Li @libai_94 · Oct 13

It's a new technique called speculative sampling. A smaller LLM generates the easier tokens and a larger LLM checks them. And using a rejection sampling trick, there is no difference in accuracy! Check out my video on how this works ➡️youtube.com/watch?v=S-8yr_…
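
For readers who want the rejection sampling trick in concrete form, here is a minimal single-token sketch in Python. It is a toy illustration under stated assumptions, not code from the video or the paper: `p` stands for the large model's next-token probabilities, `q` for the small model's, and the function names `residual` and `speculative_step` are hypothetical. The small model proposes a token, the large model accepts it with probability min(1, p/q), and a rejected proposal is replaced by a sample from the normalized positive part of p - q. Under that rule the output follows p exactly, which is why accuracy is unchanged.

```python
import random


def residual(p, q):
    """Normalized max(0, p - q): the distribution used after a rejection."""
    diff = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    total = sum(diff)
    # When p == q nothing is ever rejected, so this fallback is only a safeguard.
    return [d / total for d in diff] if total > 0 else list(p)


def speculative_step(p, q, rng):
    """Return (token, accepted) where token is distributed according to p.

    p: next-token probabilities from the large (target) model (assumed input)
    q: next-token probabilities from the small (draft) model (assumed input)
    """
    tokens = range(len(q))
    draft = rng.choices(tokens, weights=q)[0]  # the small model proposes a token
    if rng.random() < min(1.0, p[draft] / q[draft]):  # the large model checks it
        return draft, True
    # Rejected: resample from the part of p that q under-covers.
    return rng.choices(tokens, weights=residual(p, q))[0], False


if __name__ == "__main__":
    rng = random.Random(0)
    p = [0.5, 0.3, 0.2]  # made-up target distribution
    q = [0.8, 0.1, 0.1]  # made-up draft distribution
    counts = [0, 0, 0]
    for _ in range(20000):
        token, _accepted = speculative_step(p, q, rng)
        counts[token] += 1
    print([c / 20000 for c in counts])  # empirical frequencies, close to p
```

A real system drafts several tokens at a time and scores them with the large model in one forward pass; this sketch only shows the accept/reject rule for one token.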

Bai Li @libai_94 · Oct 13

📹 New Video! #LLMs can be slow, so this @GoogleDeepMind paper proposed to speed them up by running two LLMs at the same time. 😕 Wait what?

Bai Li @libai_94 · Aug 30

Just published a comprehensive video highlighting EVERY area of Natural Language Processing research, in 24 categories. From Phonology to Translation to Summarization to LLMs, explore all of NLP in 30 minutes!

Bai Li @libai_94 · Aug 30

📺 New video: Map of Natural Language Processing youtube.com/watch?v=zF69Eq…

Bai Li @libai_94 · Aug 11

@sherwinwu ChatGPT clearly states that sensitive information should not be fed to it. My question to you is: if OpenAI does not train on user data, then why is this warning in place?

Sherwin Wu @sherwinwu · Aug 11

I chatted with some customers of our API today and was surprised that they didn’t know the answer to this. So here’s an experiment. Does OpenAI train on the data that you send in through the API?
