Bohan Hou (@bohanhou1998) - Twitter Profile

Bohan Hou retweeted

Tianqi Chen@tqchenml·21 Oct

📢 Excited to introduce Apache TVM FFI, an open ABI and FFI for ML systems, enabling compilers, libraries, DSLs, and frameworks to naturally interop with each other. Ship one library across PyTorch, JAX, CuPy, etc., runnable from Python, C++, and Rust. tvm.apache.org/2025/10/21/tvm…

3

41

165

38.4K

Bohan Hou retweeted

PyTorch@PyTorch·21 Oct

Live from the AI Infra Summit, co-located with #PyTorchCon — Tianqi Chen (@nvidia) explores how shared ML foundations can advance interoperability across compilers, libraries, DSLs, and frameworks, while unifying workloads across edge and cloud. 🔗 hubs.la/Q03PBnK00 #AIInfraSummit #OpenSourceAI #AIInfrastructure

2

13

46

8.3K

Bohan Hou retweeted

Tim Dettmers@Tim_Dettmers·17 Apr

Happy to announce that I joined the CMU Catalyst with three of my incoming students. Our research will bring the best models to consumer GPUs with a focus on agent systems and MoEs. It is amazing to see so many talented people at Catalyst -- a very exciting ecosystem!

CMU School of Computer Science@SCSatCMU

Huge thank you to @NVIDIADC for gifting a brand new #NVIDIADGX B200 to CMU’s Catalyst Research Group! This AI supercomputing system will afford Catalyst the ability to run and test their work on a world-class unified AI platform.

13

48

340

24.3K

Bohan Hou retweeted

Tianqi Chen@tqchenml·17 Apr

Really thrilled to receive #NVIDIADGX B200 from @nvidia. Looking forward to cooking with the beast. Together with an amazing team at the CMU Catalyst group, @BeidiChen @Tim_Dettmers @JiaZhihao @zicokolter, we are looking to innovate across the entire stack, from model to instructions

CMU School of Computer Science@SCSatCMU

Huge thank you to @NVIDIADC for gifting a brand new #NVIDIADGX B200 to CMU’s Catalyst Research Group! This AI supercomputing system will afford Catalyst the ability to run and test their work on a world-class unified AI platform.

0

17

84

11.2K

Bohan Hou retweeted

Zhihao Jia@JiaZhihao·17 Apr

Thank you to @NVIDIA for gifting our Catalyst Research Group the latest NVIDIA DGX B200! The B200 platform will greatly accelerate our research in building next-generation ML systems.🚀 #NVIDIADGX #DGXB200 @NVIDIADC

CMU School of Computer Science@SCSatCMU

Huge thank you to @NVIDIADC for gifting a brand new #NVIDIADGX B200 to CMU’s Catalyst Research Group! This AI supercomputing system will afford Catalyst the ability to run and test their work on a world-class unified AI platform.

0

10

51

8.1K

Bohan Hou@bohanhou1998·17 Apr

before use/in use/after use

CMU School of Computer Science@SCSatCMU

Huge thank you to @NVIDIADC for gifting a brand new #NVIDIADGX B200 to CMU’s Catalyst Research Group! This AI supercomputing system will afford Catalyst the ability to run and test their work on a world-class unified AI platform.

1

3

12

1.7K

Bohan Hou retweeted

Hongyi Jin@HongyiJin258·7 Jan

🚀 Making cross-engine LLM serving programmable. Introducing LLM Microserving: a new RISC-style approach to designing LLM serving APIs at the sub-request level. Scale LLM serving with programmable cross-engine serving patterns, all in a few lines of Python. blog.mlc.ai/2025/01/07/mic…

0

31

64

18.5K

Bohan Hou retweeted

Ruihang Lai@ruihanglai·7 Jun

Announcing MLCEngine, a universal LLM deployment engine with ML Compilation. We rebuilt the engine with state-of-the-art serving optimizations and maximum local env portability. Fully OpenAI compatible for both cloud and local use cases. Check out the blog blog.mlc.ai/2024/06/07/uni…

3

15

44

13.5K

Bohan Hou retweeted

Charlie Ruan@charlie_ruan·19 Apr

Llama 3 from @AIatMeta is now up on WebLLM! Try it on webllm.mlc.ai with local inference accelerated by @WebGPU. Or start building your local agent with the web-llm package -- everything in-browser!

2

12

77

23.5K

Bohan Hou retweeted

Tianqi Chen@tqchenml·19 Apr

#Llama3 🦙🦙 running fully locally on iPad without an internet connection. Credits to @ruihanglai and the team

0

15

73

7.8K

Bohan Hou retweeted

Ruihang Lai@ruihanglai·19 Apr

Deploy #Llama3 locally with native GPU acceleration on CUDA/ROCm/Vulkan/Metal with MLC LLM. Check out llm.mlc.ai/docs/ for quick start instructions.

1

6

11

1.8K

Bohan Hou retweeted

Mengshiun@mengshyu·19 Apr

Deploy #Llama3 on $100 Orange Pi with GPU acceleration through MLC LLM. Try it out on your Orange Pi 👉 blog.mlc.ai/2023/08/09/GPU…

1

12

54

18.9K

Bohan Hou retweeted

Tianqi Chen@tqchenml·14 Mar

Please spread the word: #MLSys2024 will feature a full-day, single-track Young Professional Symposium with invited talks, panels, round tables, and poster sessions. Submit your 1-page abstract by April 1st & present your work at our poster session. sites.google.com/view/mlsys24yps

2

19

69

23K

Bohan Hou retweeted

Mishaal Rahman@MishaalRahman·24 Feb

I asked @Google's Gemma 2B LLM to write me a poem. This is being run using the MLCChat app for Android on my Samsung Galaxy S24 Ultra.

5

16

228

18.6K

Bohan Hou retweeted

Junru Shao@junrushao·19 Oct

(1/3) 🦙🌟 Looking to run Llama2-70B? With two NV/AMD GPUs or more? 💥🔥 Machine learning compilation (MLC) now supports multi-GPU. ⚡️💻 We achieve 34 tok/sec on 2 x RTX 4090, the fastest solution at $3.2k. 🌐💡 Two AMD 7900 XTX deliver 30 tok/sec at $2k. blog.mlc.ai/2023/10/19/Sca…

8

37

166

41.4K

Bohan Hou retweeted

Junru Shao@junrushao·14 Aug

While LLMs are resource-hungry and challenging to run at satisfactory speed on small devices, we show that ML compilation (MLC) techniques make it possible to actually generate tokens at 5 tok/sec on a $100 Orange Pi with a Mali GPU. blog.mlc.ai/2023/08/09/GPU…

11

49

229

75.8K

Bohan Hou@bohanhou1998·9 Aug

Making @AMD @amdradeon GPUs competitive for LLM inference! 130 toks/s for Llama 2 7B and 75 toks/s for 13B with ROCm 5.6 + 7900 XTX + 4-bit quantization, 80% of the performance of the Nvidia RTX 4090. See how we do this in detail and try out our Python packages here: blog.mlc.ai/2023/08/09/Mak…

9

39

184

77.2K

Bohan Hou@bohanhou1998·20 Jul

Now available in AppStore! apps.apple.com/us/app/mlc-cha…

Bohan Hou@bohanhou1998

#Llama2 is running on iPhone and iPad 📱 natively with GPU acceleration. No internet connection is required. See the iOS instructions to get the TestFlight app now: mlc.ai/mlc-llm/docs/g…

0

5

6

3.1K

Bohan Hou retweeted

Ruihang Lai@ruihanglai·20 Jul

Running Llama 2 directly in the web browser with @WebGPU acceleration. Try it out at webllm.mlc.ai. Build your own web app with Web LLM in 35 lines of code 👇, with the npm package at npmjs.com/package/@mlc-a…

0

15

69

25.8K

Bohan Hou retweeted

Zihao Ye@ye_combinator·20 Jul

MLC-LLM now supports deploying Llama-2-70B-chat locally (needs an Apple Silicon Mac w/ 50GB VRAM to run).🦙💬🔥 The decoding speed can achieve ~10.0 tokens/s on an M2 Ultra! Try it out at: mlc.ai/mlc-llm/docs/g… and join our discord server: discord.gg/9Xpy2HGBuD

Junru Shao@junrushao

(1/2) 🦙 Buckle up and get ready for a wild llama ride with 70B Llama-2 on a single MacBook 💻 🤯 Now 70B Llama-2 can be run smoothly on a 64G M2 Max with 4-bit quantization. 👉 Here is a step-by-step guide: mlc.ai/mlc-llm/docs/g… 🚀 How about the performance? It's

0

10

32

6.5K
