
Jan Hendrik Metzen
102 posts

@jan_metzen
Research Scientist at Prior Labs @prior_labs

Introducing two new tokenizer-free LLM checkpoints from our research lab: TFree-HAT 7B. Built on our Hierarchical Autoregressive Transformer (HAT) architecture, these models achieve top-tier German and English performance while processing text at the UTF-8 byte level.

Excited about our release of a collection of byte-level hierarchical autoregressive transformers (HAT). If you care about bringing these types of models into production, check out our work-in-progress vLLM fork for HAT: github.com/Aleph-Alpha/vl…



I wouldn't really consider these to be tokenizer-free tbh. Unlike Hnets, these models are word level. The sequence is turned into words (this is literally called tokenization). Then, the bytes of these words are turned into embeddings, which are then processed by a model.
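
To make the point concrete, here is a rough sketch of the two steps described above: split the text into word-like chunks, then represent each chunk as its UTF-8 bytes. The whitespace-based splitting rule and the function names are assumptions for illustration only; the actual splitting rule used by HAT is not stated in this thread.

```python
import re


def split_into_words(text: str) -> list[str]:
    """Split text into word-like chunks (hypothetical rule, not HAT's actual one).

    Each chunk is a run of non-space characters plus any trailing whitespace,
    so concatenating the chunks reproduces the input exactly.
    """
    return re.findall(r"\S+\s*|\s+", text)


def words_to_bytes(words: list[str]) -> list[list[int]]:
    """Turn each chunk into its UTF-8 byte values (integers in 0..255)."""
    return [list(word.encode("utf-8")) for word in words]


text = "Grüße aus Heidelberg"
words = split_into_words(text)
byte_groups = words_to_bytes(words)

for word, group in zip(words, byte_groups):
    # One group of bytes per word; a model would embed each group separately.
    print(repr(word), group)
```

The vocabulary here is just the 256 byte values, but the grouping into words still happens before the model sees anything, which is the step the reply calls tokenization.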
