
We’ve released hardware v1.1 of the Apex Compute Unified Engine. If you have one of our FPGAs, please update to the latest .bin file in the repo. We’ve also added many new models. The updates in this release are listed below; many more hardware optimizations and features are coming in upcoming releases.
RTL updates
• Unified activation pipeline — GELU, SiLU, sigmoid, tanh implemented via a single (a+x)*sigmoid(-b*x) hardware block with configurable a and b parameters. ReLU is included as well.
• Software reset added
• Add-reduce block optimization — switched to a multi-input FP adder; latency reduced from 21 to 12 cycles, with a >45% reduction in total LUT/FF usage.
• Argmax — top-4 selection for MoE support
• Instruction set improvements — replaced the instruction queue with a direct-mapped instruction cache and added support for absolute/relative jumps; this significantly reduces microcode size for matmul kernels and leaves almost no instruction DMA overhead.
• Timing — Kintex-7 frequency increased to 194 MHz
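To illustrate the unified activation pipeline above, here is a minimal software reference model of the (a+x)*sigmoid(-b*x) block. The specific (a, b) values shown for SiLU and the sigmoid-approximation of GELU are our own illustrative choices, not parameters taken from the release; the hardware presumably fixes them per activation mode.

```python
import math

def unified_act(x, a, b):
    """Software model of the unified activation block: (a + x) * sigmoid(-b * x)."""
    # sigmoid(-b*x) = 1 / (1 + exp(b*x))
    return (a + x) * (1.0 / (1.0 + math.exp(b * x)))

# Illustrative parameter choices (assumptions, not from the release notes):
def silu(x):
    # SiLU(x) = x * sigmoid(x)  ->  a = 0, b = -1
    return unified_act(x, a=0.0, b=-1.0)

def gelu_approx(x):
    # Sigmoid approximation of GELU: x * sigmoid(1.702 * x)  ->  a = 0, b = -1.702
    return unified_act(x, a=0.0, b=-1.702)
```

Tanh and plain sigmoid would additionally need an output scale/shift (e.g. tanh(x) = 2*sigmoid(2x) - 1), which we assume the hardware block handles separately.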
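The top-4 argmax for MoE can be sketched as a software reference: given router scores, select the indices of the four largest values (the hardware would implement this as a comparator network, and the tie-breaking rule here is our assumption).

```python
import heapq

def top4_argmax(scores):
    """Return indices of the 4 largest scores, highest first.

    Reference model for the hardware top-4 selection used in MoE expert
    routing; ties are broken in favor of the lower index (an assumption).
    """
    return heapq.nlargest(4, range(len(scores)), key=lambda i: (scores[i], -i))
```

For example, with router scores for six experts, the selected expert indices are the four highest-scoring ones in descending score order.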
Other FPGAs / Multi-Engine support
• Bittware (Kintex UltraScale 15P) — new project, 400 MHz engine
• Alveo U50 — stabilized build, HBM AXI tuning, 280 MHz engine and 450 MHz HBM, 8 engines synthesized
• Kintex-7 — dual-engine build, 194 MHz target, updated address map
Test infrastructure
• Fmax debug test
• Image patching test
New models
• Llama 3.2 1B
• GPT-2
• Qwen 3 1.7B
• SmolVLM2 500M
• Parakeet
• Swin Transformer
Apex Compute - Unified Engine Repo: github.com/apex-compute/u…
