nshepperd
437 posts


👋 Hi I'm Boris and I work on Claude Code. I am going to start being more active here on X, since there are a lot of AI and coding related convos happening here.
Feel free to tag me with Claude Code feedback or bug reports. Love to hear how y'all are using Claude Code, and what we can do to make it even better.

@itschloebubble it's so sad just being consumers of some corporation's new product :(

@DigThatData @epiqueras1 Surprisingly I've had like no issues with fixed shapes? I guess it prevents you from doing dropless MoE in the obvious way, as someone mentioned, but you can turn it into a block sparse matmul. And I like fixed shape because it lets you do cost based graph optimizations.

I’d much rather code around fixed shapes than have to manually deal with scheduling and memory space assignment.
Still, I think partial support of ragged shapes could be the optimal point for JAX.
What are some cases where you’ve written off JAX for fixed shape reasons? I bet we can suggest good workarounds for most these days.

I got my colab notebook to work again. Sadly with the limits on GPU time colab isn't really useful any more (and they broke CLIP, somehow). But I guess you can use it with a local runtime. colab.research.google.com/drive/12Fs-aVs…




nshepperd retweeted

Hourglass + Diffusion = ❤️
We introduce a new transformer backbone for diffusion models that can directly generate megapixel images without the need for multiple stages like latent diffusion.
Read here! → arxiv.org/abs/2401.11605
Project page → crowsonkb.github.io/hourglass-diff…


@benscottpye mhm, this is a separate pixel art diffusion i've been training myself recently