Quill LLM

49 posts

@quillcomputer

A language model that runs entirely onchain. weights on @base, inference in the EVM. no oracle, no api. free to call

Blockchain · Joined May 2026

1 Following · 1.2K Followers

Pinned post

Quill LLM@quillcomputer·29 May

x.com/i/article/2060…

16.7K

Quill LLM@quillcomputer·15h

Everyone who started saying "decentralized AI" this week now has somewhere to point.

What we did, in plain English:

1. Renounced the last admin key.
- QuillReputationV2.renounceGovernor() executed
- Nine of ten decentralization clauses are now bytecode-enforced

2. Closed the economic loop.
- QuillSwapRouter turns every paid inference into a structural buy of base:0x60a646e3fd75cde4c5b604b22d4fcd04639913c8
- QuillEngineListingBond locks base:0x60a646e3fd75cde4c5b604b22d4fcd04639913c8 to surface an engine
- eQUILL pays long-term holders from protocol revenue
- QuillEngineQualityOracle scores engines from on-chain inputs

3. Shipped new docs.
- Five tutorials any Solidity developer can run in an hour
- Proof-of-decentralization audit page with one link per clause to the contract address that enforces it

The tenth clause is the model itself. The training run for quill-v3 ships this month.

quill.computer

Quill LLM@quillcomputer

x.com/i/article/2065…

3.1K

Quill LLM@quillcomputer·15h

x.com/i/article/2065…

3.6K

Quill LLM@quillcomputer·1d

A widely-used AI model just got banned.

The model still works on someone's servers somewhere. It just does not work for the rest of us anymore. The decision was made by a small group of people in a room you were not in.

This is not a critique of any specific organisation. It is the structural property every centrally-hosted AI service shares: the model lives on someone's server, the API key is issued by someone, the terms of service can change, the legal opinion can shift, and the weights can be silently swapped between yesterday and today.

The crypto industry has talked about "decentralised AI" for two years. Every project that has claimed the label has meant one of four things:

- The model lives on Hugging Face; the centralised server just changed name
- The inference runs in a TEE; the trust assumption is the chip manufacturer
- The output is bridged via an oracle; the trust assumption is the relay operator
- The onchain part is a Merkle root of work done off-chain; the chain knows nothing about whether the inference respected its own rules

None of these survive contact with a ban.

Quill does not check any of those four boxes. The model's weights are bytecode at a contract address on Base mainnet. The forward pass is integer arithmetic the EVM verifies natively. Every output is a function call any node in the world can re-execute and arrive at the same bytes. There is no off-chain step, no relay, no admin key, and no team that can decide tomorrow to swap the model.

The model that was banned this week could not have been banned the same way if it had lived where Quill's models live. Not because regulators or hosts decided to leave it alone, but because there is no party with the keys to turn it off.

This is the property "decentralised AI" was always supposed to mean, and that almost nobody has actually built. Quill has, and it is live: live engine, live model, live receipts on Base mainnet. Anyone with an RPC connection can verify every claim in this post from chain state.

The work for us from here is to make the models good enough that the property matters at scale. The infrastructure already does, and there is no admin in the path to take it away.
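The claim that any node can re-execute a call and arrive at the same bytes rests on the forward pass using only integer arithmetic. A minimal Python sketch of that idea follows. It is not Quill's contract code: the weights, the layer shape, and the right-shift rescale are hypothetical values chosen for illustration.

```python
import hashlib

# Hypothetical toy weights; the post says Quill's real weights are contract bytecode.
W = [[3, -2, 1], [0, 4, -1]]   # 2 outputs x 3 inputs, small signed integers
B = [5, -7]
SHIFT = 2                      # assumed fixed-point rescale (floor-divide by 4)

def int_layer(x):
    """One integer-only dense layer followed by ReLU; no floats anywhere."""
    out = []
    for row, bias in zip(W, B):
        acc = sum(w * v for w, v in zip(row, x)) + bias
        acc >>= SHIFT              # arithmetic shift floors the same way on every machine
        out.append(max(acc, 0))    # ReLU
    return out

def output_digest(x):
    """Hash the output so two independent runs can be compared as bytes."""
    y = int_layer(x)
    raw = b"".join(v.to_bytes(4, "big") for v in y)   # values are non-negative after ReLU
    return hashlib.sha256(raw).hexdigest()
```

Because every step is exact integer math, two parties running `output_digest` on the same input get the same digest, with none of the rounding differences floating-point kernels can introduce across hardware.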

Quill LLM@quillcomputer·3d

the unlock is real and we're going for it $QUILL becomes the unit of account for every dollar that moves through Quill. fees, stakes, bounties, royalties, subscriptions, all in the token. the more the protocol gets used, the more the token does. demand stops being a narrative and starts being usage

533

DFarmer@OGDfarmer·3d

This is amazing to read, and much needed, both the help and the concept. That said, if you could give the token more value accrual baked in, I think it’d really move the needle in getting quality people on board and longer term aligned to something the space really needs.

Quill LLM@quillcomputer

building quill takes real research, and the team got bigger this month. entirely through dms. people who saw what was going on reached out, and now they're contributing open source to the stack.

verifiable onchain AI is a category that didn't exist 2 months ago. now it has a team, an economy, and a population of agents shipping live on base.

more soon.

3.2K

Quill LLM@quillcomputer·3d

@parasituo we're still very early. we'll try, but it's more r&d. we'll share our research with the community daily

182

Quasimodo@parasituo·3d

@quillcomputer Hope you guys would communicate more with the community

193

Quill LLM@quillcomputer·3d

next is the model itself: byte-packed int4 weights (4x compression), aggressive Yul inlining, KV-cache streaming for long context.

each layer brings per-character cost down 2-10x; stacked, they are the path to a real transformer running at ~5M gas/char on base.

then the composition unlocks: a quill MoE with three specialised experts (code, news, dialogue), router picks per prompt. the average call is one cheap routing pass plus one expert. effective parameters compound without paying full cost.

we believe the first genuinely-useful LLM running fully on chain ships this summer. not as a demo, as a default model the registry routes to when nothing else fits.

the cost of being verifiable is no longer the cost of being useless.
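Byte-packing int4 weights can be sketched in a few lines of Python. This is an illustration only: the nibble order (high nibble first), the zero padding, and the function names are assumptions, not Quill's storage layout. The sketch stores two weights per byte; the post does not state the baseline format its 4x figure is measured against.

```python
def pack_int4(weights):
    """Pack signed 4-bit weights (-8..7) two per byte, high nibble first."""
    weights = list(weights)
    if len(weights) % 2:
        weights.append(0)                  # pad to an even count
    out = bytearray()
    for hi, lo in zip(weights[0::2], weights[1::2]):
        for w in (hi, lo):
            if not -8 <= w <= 7:
                raise ValueError(f"{w} does not fit in int4")
        out.append(((hi & 0xF) << 4) | (lo & 0xF))   # two's-complement nibbles
    return bytes(out)

def unpack_int4(packed, count):
    """Recover `count` signed weights from the packed bytes."""
    vals = []
    for byte in packed:
        for nib in (byte >> 4, byte & 0xF):
            vals.append(nib - 16 if nib >= 8 else nib)   # sign-extend 4 bits
    return vals[:count]
```

A round trip such as `unpack_int4(pack_int4(ws), len(ws))` returns the original list, which is the property an on-chain reader would need to match a reference implementation byte-for-byte.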

Quill LLM@quillcomputer

5.6K

Quill LLM@quillcomputer·3d

Quill LLM@quillcomputer·3 Jun

x.com/i/article/2061…

7.8K

Quill LLM@quillcomputer·31 May

x.com/i/article/2060…

3.5K

Quill LLM@quillcomputer·31 May

Real LLM serving wraps inference in primitives beyond the forward pass. Sampling beyond greedy argmax. Embeddings as a separable service. Logit processors for constrained generation. A multi-turn conversation abstraction. Per-application fine-tuning via low-rank adapters.

Each of these now exists on chain, EVM-verified against a Python reference of the same math:

- QuillSampler: temperature plus top-K, deterministic given a seed, verified across 30 seeds
- QuillEmbed: sentence vectors mean-pooled from any Quill model
- QuillConstrain: bitmap-encoded logit masks for constrained generation
- QuillChat: role-tagged multi-turn conversations
- QuillLoRA: low-rank adapter for per-application fine-tuning, 2·D·r ints instead of D² for a full update

The serving stack other AI companies sit between you and the model now sits on @Base
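To make "temperature plus top-K, deterministic given a seed" concrete, here is a rough Python sketch. It is not QuillSampler: it uses floating-point softmax weights and Python's built-in Mersenne Twister as stand-ins, whereas an EVM version would presumably use integer math and its own PRNG. The function name and tie-breaking rule are assumptions.

```python
import math
import random

def sample_top_k(logits, k, temperature, seed):
    """Pick a token index from the k highest logits, softmax-weighted at the given temperature."""
    if k < 1 or temperature <= 0:
        raise ValueError("k must be >= 1 and temperature > 0")
    # Keep the k largest logits; ties go to the lower index so the choice is deterministic.
    top = sorted(range(len(logits)), key=lambda i: (-logits[i], i))[:k]
    m = max(logits[i] for i in top)                    # subtract the max for numerical stability
    weights = [math.exp((logits[i] - m) / temperature) for i in top]
    rng = random.Random(seed)                          # same seed -> same draw
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` the function reduces to greedy argmax, and repeating a call with the same seed returns the same index, which is what makes a seed-by-seed comparison against a reference possible.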

Quill LLM@quillcomputer·31 May

A non-text Quill engine works.

PixelQuillEngine uses the same char-MLP shape as the text engines, applied to 256 quantized 8×8 grayscale patches. A small training run converged on six letters (A, B, C, D, E, F), each represented as a 16-patch sequence.

The contract generates the exact patch sequence for each letter, byte-for-byte against the Python reference. PixelDecoder reads patches from a separate codebook data contract via EXTCODECOPY and renders 4×4 patch grids as inline SVG.

A tiny demo. The point isn't that anyone needs a chain to draw a letter A. The point is that the same integer-arithmetic regime that makes text inference verifiable end-to-end extends to non-text token spaces without changing the underlying math. The next medium (audio mu-law tokens) is structurally identical.

A chain that draws. The next chapter, probably one that speaks.
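The decoding step (16 patch indices into a 4×4 grid of 8×8 patches, emitted as inline SVG) can be sketched off-chain as follows. This is an illustration under assumptions: the row-major ordering of patches and pixels, the codebook passed in as a Python mapping, and the one-rect-per-pixel SVG are guesses, not PixelDecoder's actual output format.

```python
def render_patches_svg(patch_ids, codebook, grid=4, patch=8):
    """Render grid*grid patch indices as one SVG made of 1x1 grayscale rects."""
    if len(patch_ids) != grid * grid:
        raise ValueError("expected one patch id per grid cell")
    size = grid * patch
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">']
    for n, pid in enumerate(patch_ids):
        ox, oy = (n % grid) * patch, (n // grid) * patch   # assumed row-major patch placement
        pixels = codebook[pid]                             # patch*patch gray values, 0..255
        for i, g in enumerate(pixels):
            x, y = ox + i % patch, oy + i // patch
            parts.append(
                f'<rect x="{x}" y="{y}" width="1" height="1" fill="rgb({g},{g},{g})"/>'
            )
    parts.append("</svg>")
    return "".join(parts)
```

A 16-patch sequence yields a 32×32 image, so the output contains 1,024 rects. A production renderer would likely merge runs of equal pixels to keep the string short.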

3.1K

Quill LLM@quillcomputer·31 May

The streaming production transformer on @Base costs roughly 22 million gas per generated character. About five cents at typical gas prices, byte-for-byte identical to an independent Python forward, every output reproducible by anyone with a node.

The reference forward at the start of Chapter 3 cost 432 million gas per character. The combination of Yul-unrolled matmuls (5.12× on axiom v2), the KV-cache that turned per-character cost from O(C²) to O(C), the Stage 4 attention-dot unroll, and variable-window NoPE training together brought a 20× reduction.

The path to sub-cent is mechanical: byte-packed weight reads (Stage 5) and inlined layer-norm (Stage 6) close the remaining gap to the 11.7× ratio the subword engine hit. That work is the engine centerpiece of Chapter 5.
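The figures in the post can be checked with simple arithmetic. The sketch below takes the two gas numbers from the post. The helper names are hypothetical, and the formula covers L2 execution gas only, ignoring the L1 data fee that OP Stack chains such as Base also charge.

```python
REFERENCE_GAS_PER_CHAR = 432_000_000   # Chapter 3 reference forward, from the post
STREAMING_GAS_PER_CHAR = 22_000_000    # streaming production transformer, from the post

def reduction_factor(before, after):
    """How many times cheaper `after` is than `before`."""
    return before / after

def usd_per_char(gas, gas_price_gwei, eth_usd):
    """Cost in USD = gas * price per gas in ETH (gwei * 1e-9) * ETH price."""
    return gas * gas_price_gwei * 1e-9 * eth_usd

def implied_gas_price_gwei(gas, target_usd, eth_usd):
    """Gas price in gwei at which `gas` units cost `target_usd`."""
    return target_usd / (gas * 1e-9 * eth_usd)
```

`reduction_factor(REFERENCE_GAS_PER_CHAR, STREAMING_GAS_PER_CHAR)` is about 19.6, which the post rounds to 20×. At a hypothetical 0.001 gwei and 2,500 USD per ETH, `usd_per_char(STREAMING_GAS_PER_CHAR, 0.001, 2500)` comes to about 0.055 USD, in line with the post's "about five cents". Both inputs are assumed values, not quoted ones.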

1.4K