
Inference
@inference_net
Inference Research & Development




Introducing our new Schematron benchmark. We compared the latest open-source models to see which one takes the crown. The benchmark measures how well an LLM can take raw HTML plus a JSON schema and fill out that schema. We measure recall and precision, hallucinations, and the ability to handle ambiguity. Outputs are graded by an ensemble of frontier models on a 5-point rubric.

GLM 5 is currently the best open-source model for schema extraction. Surprisingly, GPT-OSS 120B also does very well on these types of extraction tasks. Another interesting result: quality degraded with Qwen3.5 Plus on this task compared with the original Qwen3.5 397B MoE.

Inputs can be up to 120K tokens, so this is akin to a long-context benchmark with an additional reasoning layer. We will open-source this benchmark if it gains sufficient traction. Also, more benchmarks coming from our side!
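To make the recall/precision and hallucination ideas concrete, here is a minimal Python sketch of field-level scoring for a schema-filled JSON object against a hand-labeled reference. This is not the Schematron grader (which the post says uses an ensemble of frontier models and a 5-point rubric); the function names `flatten` and `score_extraction` and the exact-match rule are assumptions made purely for illustration.

```python
from typing import Any, Dict


def flatten(obj: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts/lists into {path: leaf_value}, skipping None leaves."""
    items: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(value, path))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    elif obj is not None:
        items[prefix] = obj
    return items


def score_extraction(predicted: dict, gold: dict) -> Dict[str, float]:
    """Hypothetical exact-match scoring of an extracted object against a reference."""
    pred = flatten(predicted)
    ref = flatten(gold)
    # A field is correct only if the path exists in the reference with the same value.
    correct = sum(1 for path, value in pred.items() if path in ref and ref[path] == value)
    # Treat any predicted path that the reference does not contain as a hallucination.
    hallucinated = sum(1 for path in pred if path not in ref)
    return {
        "precision": correct / len(pred) if pred else 1.0,
        "recall": correct / len(ref) if ref else 1.0,
        "hallucinated_fields": float(hallucinated),
    }


gold = {"title": "Widget", "price": 9.99, "tags": ["a", "b"]}
predicted = {"title": "Widget", "price": 10.0, "tags": ["a"], "sku": "X1"}
scores = score_extraction(predicted, gold)
print(scores)
```

Exact matching is the simplest possible rule; it cannot handle ambiguity or paraphrased values, which is presumably one reason to grade with models and a rubric instead.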













Office plants coming along nicely


We're introducing Project AELLA, in partnership with @laion_ai & @wyndlabs_ai.

AELLA is an open-science initiative to make scientific research accessible via structured summaries created by LLMs.

Available now:
- Dataset of 100K summaries
- 2 fine-tuned LLMs
- 3D visualizer 👇



Today, we release LOGIC: a novel method for verifying LLM inference in trustless environments.
- Detects model substitution, quantization, and decode-time attacks
- Works out of the box with @vllm_project, @sgl_project, @openrouter, and more (it only needs logprobs)
- Robust across GPU types and hardware configurations
- Low computational overhead (~1% of total cost)

Blog: inference.net/blog/logic
Code: github.com/context-labs/l…
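The post does not describe how LOGIC works, so the following is not the LOGIC algorithm. It is only a generic Python sketch of why logprobs are useful for verification: a verifier can re-score the same tokens with a trusted copy of the model and check that the provider's reported logprobs stay close. The function names and the `tolerance` value are hypothetical.

```python
from typing import Sequence


def mean_abs_logprob_gap(reported: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between reported and reference per-token logprobs."""
    if len(reported) != len(reference) or not reported:
        raise ValueError("need two non-empty logprob sequences of equal length")
    return sum(abs(a - b) for a, b in zip(reported, reference)) / len(reported)


def looks_consistent(
    reported: Sequence[float],
    reference: Sequence[float],
    tolerance: float = 0.05,  # hypothetical threshold, not taken from the post
) -> bool:
    """Return True when the provider's logprobs stay within the tolerance on average."""
    return mean_abs_logprob_gap(reported, reference) <= tolerance


# Small numerical drift, as one might expect across hardware.
honest = looks_consistent([-0.10, -1.20, -0.50], [-0.11, -1.18, -0.52])
# Large gaps, as one might expect from a different or heavily quantized model.
suspicious = looks_consistent([-0.10, -1.20, -0.50], [-0.60, -2.00, -0.10])
print(honest, suspicious)
```

A real scheme would need a principled threshold and a sampling strategy to keep overhead low; see the linked blog post for the actual method.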










