Resyst Labs

46 posts

@ResystLabs

We build autonomous software systems: agents, devtools, local AI infrastructure and research-driven products.

Chile · Joined May 2026
30 Following · 4 Followers

Resyst Labs @ResystLabs ·
@CalatheaAI @OpenRouter @deepseek_ai @StepFun_ai Premise: both LLMs get the same seeded tactical map + current state each turn, then issue legal actions for their units. No judge scores prose; the game state advances. In this replay the objective was core destruction, so DeepSeek won by taking Step’s core.
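
The premise above (same seeded map, per-turn state, legal actions only, outcome decided by the game state) can be sketched as a small turn loop. This is only an illustration under stated assumptions, not Resyst Arena's actual engine: `make_map`, `legal_actions`, `step`, `play`, the two-action rule set, and the toy policies are all hypothetical stand-ins, and a real run would call an LLM where the policy functions are.

```python
import random


def make_map(seed):
    """Build a toy 'map' deterministically from a seed (hypothetical layout)."""
    rng = random.Random(seed)
    return {"core_hp": rng.randint(2, 4)}


def legal_actions(state, player):
    """Assumed rule set: a unit may attack the enemy core or hold."""
    return ("attack_core", "hold")


def step(state, player, action):
    """Advance the game state; an illegal action is treated as 'hold'."""
    if action not in legal_actions(state, player):
        action = "hold"
    if action == "attack_core":
        enemy = "B" if player == "A" else "A"
        state["core_hp"][enemy] -= 1
    return state


def play(policies, seed, max_turns=100):
    """Run a duel. Both sides see the same map and the current state."""
    game_map = make_map(seed)
    hp = game_map["core_hp"]
    state = {"turn": 0, "core_hp": {"A": hp, "B": hp}}
    while state["turn"] < max_turns:
        state["turn"] += 1
        for player in ("A", "B"):
            # Stand-in for the model call: map + state in, one action out.
            action = policies[player](game_map, state, player)
            step(state, player, action)
            enemy = "B" if player == "A" else "A"
            if state["core_hp"][enemy] <= 0:
                # Core destruction ends the match; no judge is involved.
                return player, state["turn"]
    return None, state["turn"]


def aggressive(game_map, state, player):
    return "attack_core"


def passive(game_map, state, player):
    return "hold"


def confused(game_map, state, player):
    return "fly"  # not a legal action, so it resolves to 'hold'
```

Calling `play({"A": aggressive, "B": passive}, seed=7)` returns the winner and the turn count. Because the map comes from the seed, rerunning with the same seed and policies reproduces the same match.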

Resyst Labs @ResystLabs ·
Resyst Arena is our new tactical LLM benchmark: models play a turn-based strategy duel, not just answer prompts. First replay: DeepSeek V4 Flash beats Step 3.7 Flash by core destruction after 63 turns. Both via @OpenRouter. Full match replay below. @deepseek_ai @StepFun_ai

Resyst Labs @ResystLabs ·
We benchmarked @nvidia Nemotron 3 Ultra 550B-A55B on @OpenRouter. Surprise: strong general reasoning, but SWE patch production broke hard. Full: 79.54 · SWE: 58.63 · Overall: 69.08. Great analysis model. Weak autonomous coding agent. @NVIDIAAI #Nemotron #LLMBenchmark

Resyst Labs @ResystLabs ·
Even after the latest releases from other AI labs, @OpenAI GPT-5.5 xhigh still looks like the best agentic benchmark profile we've measured. Final: 85.67/100 · Capability: 90.34 · Reliability: 99.07 · Tool Reliability: 87.80 · Pass rate: 88.37% across 43 prompts. #LLM #AIBenchmark
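
As a quick sanity check on the pass-rate figure, the percentage can be converted back to a whole number of prompts. This is a minimal sketch: the helper names are hypothetical, and it assumes the pass rate is simply passed prompts divided by total prompts, which the post does not state explicitly.

```python
def passes_from_rate(rate_percent, total):
    """Recover the nearest whole number of passed prompts from a percentage."""
    return round(rate_percent / 100 * total)


def pass_rate_percent(passed, total):
    """Pass rate as a percentage, rounded to two decimals."""
    return round(100 * passed / total, 2)


passed = passes_from_rate(88.37, 43)   # nearest whole count of passing prompts
rate = pass_rate_percent(passed, 43)   # recomputed from that count
```

Under that assumption, 88.37% across 43 prompts corresponds to 38 passing prompts, and 38/43 rounds back to the reported 88.37%.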