Zippy Retweeted
Zippy
101 posts

Zippy
@AlexanderRedde3
Software engineer working on ML image generation techniques & inference. Obsessed with programming and life.
Here and there · Joined March 2022
120 Following · 566 Followers

@hisham_artz You can use the code in the official trt repo to build a trt engine, and run it from there. Mine is different: I took the Engine class from the repo & modified it to behave like a diffusers unet so I can run it in a diffusers pipeline. It's under /demo/Diffusion/ github.com/NVIDIA/TensorRT
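The "behave like a diffusers unet" trick above is just duck typing: a pipeline only needs a callable with the UNet's signature that returns an object carrying a `.sample` attribute. A minimal sketch of that pattern, using a hypothetical `StubTRTEngine` stand-in (the real Engine class in the NVIDIA/TensorRT demo runs an actual compiled engine; here the stub just echoes the latents so the shape of the wrapper is clear):

```python
from types import SimpleNamespace

class StubTRTEngine:
    """Hypothetical stand-in for a TensorRT engine object.
    A real engine would run inference on the GPU; this one
    just passes the latents straight through."""
    def infer(self, feed_dict):
        return {"latent": feed_dict["sample"]}

class TRTUNetWrapper:
    """Duck-types the parts of a diffusers UNet a pipeline touches:
    a call that takes (sample, timestep, encoder_hidden_states)
    and returns an object with a .sample attribute."""
    def __init__(self, engine):
        self.engine = engine
        # pipelines often read config fields like in_channels; stub them
        self.config = SimpleNamespace(in_channels=4)

    def __call__(self, sample, timestep, encoder_hidden_states, **kwargs):
        out = self.engine.infer({
            "sample": sample,
            "timestep": timestep,
            "encoder_hidden_states": encoder_hidden_states,
        })
        return SimpleNamespace(sample=out["latent"])

unet = TRTUNetWrapper(StubTRTEngine())
latents = [[0.1, 0.2], [0.3, 0.4]]
result = unet(latents, 999, encoder_hidden_states=None)
print(result.sample is latents)  # stub echoes latents back unchanged
```

With a real engine, an instance like this can be assigned over `pipe.unet` so the rest of the diffusers pipeline runs unmodified; the exact input/output tensor names depend on how the engine was exported.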

@AlexanderRedde3 I'm trying to combine SDXL with TensorRT, do you have a github i can follow? I'd like to recreate this

Made an fp8 implementation of Flux which gets ~3.5 it/s 1024x1024 on 4090, >2x faster than other methods. github.com/aredden/flux-f… #flux1
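The core idea behind an fp8 inference path like the one above is scaled casting: map each tensor's absolute maximum onto the fp8 dynamic range, run the matmuls in 8-bit, and carry the scale along for dequantization. A toy per-tensor sketch (real fp8 kernels also round to actual e4m3 bits and fuse the scales into the GEMM; this only models the scaling step, and the function names are illustrative, not from the linked repo):

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3

def quantize_per_tensor(values):
    """Map a tensor's absmax onto the fp8 range and remember the
    scale needed to recover the original magnitudes."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / FP8_E4M3_MAX
    quantized = [v / scale for v in values]  # now within [-448, 448]
    return quantized, scale

def dequantize(quantized, scale):
    """Undo the scaling (lossless here; real fp8 adds rounding error)."""
    return [q * scale for q in quantized]

weights = [0.5, -2.0, 3.5, -0.25]
q, s = quantize_per_tensor(weights)
restored = dequantize(q, s)
```

The speedup on a 4090 comes from Ada's hardware fp8 tensor cores plus halved memory traffic, not from the scaling arithmetic itself.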

@oleg__chomp I think you could probably just compile it normally, following the examples from the tensorrt repo. Most of the speedup from tensorrt is a result of its own internal autotuning. The api for using that autotuning remains the same for most models.

@AlexanderRedde3 great result! maybe you can share some tips on connecting taesd to tensorrt pipeline?

@vibeke_udart Also make sure that the scheduler is using "timestep_spacing": "trailing". Beyond that it depends on the actual trt impl you're using: if you're just using the code in the trt repo as I did, you'll want to make a diffusers-compatible unet wrapper which can be used in a normal diffusers pipeline.
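Why "trailing" matters for distilled models: it places the first sampled timestep at the final training timestep (999 for a 1000-step model), which is where turbo-style models were trained to start, whereas "leading" starts lower. A sketch of the two spacings in the spirit of diffusers' `timestep_spacing` option (simplified; the real schedulers also apply a `steps_offset`, which is omitted here):

```python
def scheduler_timesteps(num_train_timesteps, num_inference_steps, spacing):
    """Compute the timesteps a scheduler would visit, highest first."""
    if spacing == "trailing":
        # start at the last training timestep and step backwards
        ratio = num_train_timesteps / num_inference_steps
        return [round(num_train_timesteps - i * ratio) - 1
                for i in range(num_inference_steps)]
    elif spacing == "leading":
        # evenly spaced from 0 upward, then reversed
        ratio = num_train_timesteps // num_inference_steps
        return [i * ratio for i in range(num_inference_steps)][::-1]
    raise ValueError(spacing)

print(scheduler_timesteps(1000, 4, "trailing"))  # [999, 749, 499, 249]
print(scheduler_timesteps(1000, 4, "leading"))   # [750, 500, 250, 0]
```

At 1 to 4 steps the difference between starting at 999 versus 750 is large, which is why the wrong spacing visibly degrades sdxl-turbo output.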

@vibeke_udart Well sdxl-turbo should use a maximum guidance scale of ~1.5, and it looks like this is using more. I've also found that sometimes the diffusers LCMScheduler works best for details, though it can make the image a bit smudgy. Either LCMScheduler or EulerAncestralDiscreteScheduler.
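The arithmetic behind that guidance-scale recommendation is classifier-free guidance: the final noise prediction extrapolates from the unconditional prediction toward the conditional one, so large scales over-amplify the difference. A minimal sketch with toy two-element "predictions":

```python
def cfg_mix(uncond, cond, guidance_scale):
    """Classifier-free guidance: uncond + g * (cond - uncond).
    g = 1 reduces to the pure conditional prediction; distilled
    turbo models expect g near 1.0-1.5, and the usual SDXL values
    around 7-8 over-drive them (the 'burned' look)."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]

uncond = [0.0, 0.0]
cond = [1.0, -1.0]
print(cfg_mix(uncond, cond, 1.0))  # [1.0, -1.0]: pure conditional
print(cfg_mix(uncond, cond, 7.5))  # [7.5, -7.5]: heavily over-amplified
```

In diffusers, `guidance_scale <= 1` additionally disables the unconditional pass entirely, which is why turbo pipelines are often run at exactly 1.0 (actual values here are a sketch, not from the tweet).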

@rudzinskimaciej Yes I am 😊 I have a serious case of programming addiction haha

@AlexanderRedde3 Then double wow for the effect 😁 thx
Are you doing it for yourself?

Finally figured out how to speed up my #sdxlturbo frontend! It's so fast that the only way to show the actual speed is to delete the prompt, since I can't type fast enough 😆 .. built with a next.js frontend & tensorrt backend.

@koltregaskes Essentially, yes. Though the lack of inter-frame coherency even with the same seed would make it seem very jittery, similar to the video.

@AlexanderRedde3 If you could type fast enough you could create animation/video on the go? 🤔

@rudzinskimaciej It's an amd 7950x cpu + 2x4090 machine, though this demo is only using 1 4090. Also thanks! 😊

@AlexanderRedde3 What's the machine it's doing inference on?
Great work 💪

@ZealotDKD Right now there isn't a way. It's just a web ui / api running on my workstation. 🥹

@AlexanderRedde3 How can I use this?

@vibeke_udart There's a demo on their github github.com/NVIDIA/TensorRT. It uses diffusers, so models can be pulled automatically through huggingface, though getting tensorrt to the state it's in in my demo requires a lot of tweaks. Also, I recommend using docker; tensorrt can be kinda finicky.

@JonathanSolder3 Not at the moment, it's just a little app running on my workstation at home. Not sure when or if it'll ever be a part of a public thing.

Made a neat GUI in react for fun & hooked it up to an optimized sdxl-turbo tensorrt api backend I built for image autocomplete! It generates so fast that my browser can't keep up 🥲 - so fun! #sdxlturbo #sdxl #AIart #stablediffusion

@EMostaque Your prediction of real time image generation came true 😄. It took a little while, but we're here!

@ScottieFoxTTV But my workstation is on the other side of my room. It's too far. 🥴
