Alfonso² Peterssen

119 posts

Alfonso² Peterssen
@TheMukel

"JVM within a JVM by day, LLMs on the JVM by night." @qxoticai

Zurich, Switzerland · Joined August 2015
572 Following · 443 Followers
Alfonso² Peterssen @TheMukel ·
@__tinygrad__ The operators serve hyper-specialized implementations of each model. How good is tinygrad at fusing high-level ops? Even with some advanced compiler magic, the hand-tuned kernels with nit-picked fusions are hard to beat. It's a pristine model blueprint vs. a tuned Franken-model.
the tiny corp @__tinygrad__ ·
Branching off of this is all the fun stuff: megakernels, a fast on-the-fly GGUF unpacker, a function decorator for Python speed, KV cache swap to disk. This should stay ~500 lines, but outperform all the BS=1 LLM runners through the power of tinygrad. Kimi at 500 tok/s on MI350X?
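For context on the "GGUF on-the-fly unpacker" idea above: GGUF model files open with a small fixed header (magic bytes 'GGUF', a version, a tensor count, and a metadata key-value count, all little-endian). A minimal Java sketch of parsing just that header — class and method names are made up for illustration; only the byte layout follows the GGUF format:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: read the fixed-size GGUF header from a buffer.
// Field order per the GGUF format: uint32 magic, uint32 version,
// uint64 tensor_count, uint64 metadata_kv_count, all little-endian.
public class GgufHeader {
    // Bytes 'G','G','U','F' read as a little-endian uint32.
    static final int MAGIC = 0x46554747;

    static String describe(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != MAGIC) throw new IllegalStateException("not a GGUF file");
        return "version=" + buf.getInt()
             + " tensors=" + buf.getLong()
             + " kv=" + buf.getLong();
    }

    public static void main(String[] args) {
        // Build a fake 24-byte header in memory instead of reading a real file.
        ByteBuffer buf = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(MAGIC).putInt(3).putLong(291L).putLong(19L);
        buf.flip();
        System.out.println(describe(buf)); // version=3 tensors=291 kv=19
    }
}
```

A real unpacker would go on to read the metadata key-value pairs and tensor descriptors that follow the header; this only shows the entry point.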
the tiny corp @__tinygrad__ ·
We are looking to hire someone to improve our LLM runner; with USB GPU support and high BS=1 tok/s, it should see a lot of use soon. The TODO list is in the Discord, but no bounties, since those yield AI slop. The bottleneck today isn't writing code, it's filtering it. Show me you can do that.
Alfonso² Peterssen retweeted
Michalis Papadimitriou @mikepapadim ·
GPULlama3.java is out! Great effort by the @tornadovm team to bring GPU-enabled inference to the JVM
Mary Xekalaki @MXekalaki

We are proud to release the first fully JITed, open-source GPU-accelerated Llama3 inference in pure Java powered by #TornadoVM  🚀 🎯 NVIDIA GPUs using PTX and OpenCL backend 👉 github.com/beehive-lab/GP… We are looking forward to your feedback!  #opensource #Java #AI #LLM #GPUs

Alfonso² Peterssen retweeted
Fabio Niephaus @fniephaus ·
We just merged the current status of the upcoming JDWP support for @GraalVM Native Image! 🥳 This will soon provide developers with the same debugging experience they are used to in Java, but for native images! Stay tuned for more details. github.com/oracle/graal/p…
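JDWP, the debug wire protocol the tweet above is bringing to Native Image, starts with a simple transport handshake: the debugger sends the 14 ASCII bytes "JDWP-Handshake" and the target VM echoes them back before any command packets flow. A minimal sketch of that exchange — the in-process echo server stands in for the VM, and all class and method names here are illustrative:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical sketch of the JDWP transport handshake: the debugger
// writes "JDWP-Handshake" and expects the VM to echo the same bytes.
public class JdwpHandshake {
    static final byte[] HANDSHAKE = "JDWP-Handshake".getBytes();

    public static boolean handshake(InputStream in, OutputStream out) throws IOException {
        out.write(HANDSHAKE);   // debugger side sends the 14 ASCII bytes first
        out.flush();
        byte[] reply = in.readNBytes(HANDSHAKE.length);
        // The VM must echo the exact same bytes back to accept the connection.
        return java.util.Arrays.equals(reply, HANDSHAKE);
    }
}
```

Only after this exchange succeeds do the two sides start trading length-prefixed JDWP command and reply packets.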
Alfonso² Peterssen @TheMukel ·
buff.ly/40KmT0t Graal compiler: +10% faster inference with the latest early access build. New features: batched prompt processing & AVX512 support.
Alfonso² Peterssen retweeted
Johan Hutting @JohanHutting ·
Earlier today I was asked whether Java AI integration had improved yet, or whether we still need to rely on Python or C bindings. I was happy to share github.com/mukel/llama3.j… by @TheMukel from the GraalVM team, running natively in Java without any dependencies and with superior performance!
Alfonso² Peterssen @TheMukel ·
@christzolov @vitalethomas @alina_yurenko I have a working prototype with function calling via LangChain4j. Vision is just a matter of implementing an additional component, the rest of the inference remains the same. I'll do my best to implement the missing encoder for vision soon-ish, starting with Llama, then Qwen.
Diego @diegoasua ·
@bate5a55 @julien_c you don't run a 1B+ model on CPU. Good luck with that; that's like hitching a freight train to a donkey. Will it move? Maybe. But also, don't do that