Technology

Disaggregation: storage decoupled into an independently scalable all-flash pool, linked to compute over a lossless fabric.

Quick answer

What is ZK-Storage's core technology?

Core architecture: Disaggregation + KV-Cache tiered scheduling
Data path: NVMe-oF over RoCEv2 (lossless Ethernet) + GPUDirect
Patent portfolio: 8 (Filed / under examination)
3rd-party median reduction: ~90.9% across 7 metrics (S38)

DISAGGREGATION

Disaggregated architecture

Compute pool ⟷ lossless fabric ⟷ all-flash pool — each scaling independently.

Compute pool

GPU / NPU nodes

Ascend Atlas 910B

Training / inference frameworks (transparent)

Lossless fabric
NVMe-oF · RDMA / RoCE

All-flash pool

EBOF flash array

CPFS parallel file system

KV-cache acceleration layer

Data moves directly between storage and GPU memory; compute and capacity scale independently.
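As a rough illustration of what independent scaling means for sizing, the sketch below compares a coupled design, where flash ships inside each compute node, with separately sized compute and flash pools. All unit sizes and the demand figures are hypothetical values chosen for the example; they are not ZK-Storage specifications.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Demand:
    gpus: int            # accelerators the workload needs
    capacity_tb: float   # usable flash capacity the datasets need


# Hypothetical unit sizes, for illustration only.
GPUS_PER_NODE = 8
TB_PER_COUPLED_NODE = 30.0   # local flash bundled into each compute node
TB_PER_FLASH_SHELF = 300.0   # one shelf in a separate all-flash pool


def coupled_nodes(d: Demand) -> int:
    """Nodes needed when storage can only grow by adding compute nodes."""
    by_gpu = math.ceil(d.gpus / GPUS_PER_NODE)
    by_capacity = math.ceil(d.capacity_tb / TB_PER_COUPLED_NODE)
    return max(by_gpu, by_capacity)


def disaggregated(d: Demand) -> Tuple[int, int]:
    """(compute nodes, flash shelves) when each pool is sized on its own."""
    return (
        math.ceil(d.gpus / GPUS_PER_NODE),
        math.ceil(d.capacity_tb / TB_PER_FLASH_SHELF),
    )


demand = Demand(gpus=64, capacity_tb=1200.0)
print(coupled_nodes(demand))   # 40
print(disaggregated(demand))   # (8, 4)
```

With these assumed numbers, the coupled design needs 40 nodes although only 8 are required for compute; the disaggregated design buys 8 compute nodes and 4 flash shelves.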

FOUR PILLARS

Four core technologies

Each maps directly to a shortened data path.

NVMe-oF over RDMA / RoCE

Carries NVMe commands over remote direct memory access (RDMA), avoiding redundant memory copies so that remote flash approaches local-disk performance.

GPUDirect

Data moves directly between storage and GPU memory, shortening the path and cutting CPU and latency overhead.

All-flash EBOF

Controller-less, high-density flash pool; bandwidth and IOPS scale near-linearly with capacity, at lower power.

KV-cache scheduling

Offloads and reuses KV cache for long-context inference and frequent model switching, lifting effective GPU utilization.

Why KV cache is the key to cheaper inference

Long contexts and frequent model switching force the KV cache to be rebuilt repeatedly, consuming memory and time. Offloading it to fast storage and reusing it cuts online-workload cost by up to 73.7% in industry and internal tests.^S5
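The offload-and-reuse idea can be sketched as a two-tier cache. This is a minimal toy model, not the ZK-Storage scheduler: the class `TieredKVCache`, its fields, and the `build` callback are hypothetical names, and plain Python dictionaries stand in for GPU memory and the flash pool.

```python
from collections import OrderedDict
import hashlib


class TieredKVCache:
    """Toy two-tier cache: a small hot tier backed by a larger cold tier."""

    def __init__(self, hot_capacity: int):
        self.hot = OrderedDict()   # stands in for GPU memory, kept in LRU order
        self.cold = {}             # stands in for the shared flash pool
        self.hot_capacity = hot_capacity
        self.recomputes = 0        # how often the KV cache had to be rebuilt

    @staticmethod
    def key(model: str, prompt_prefix: str) -> str:
        # Identify an entry by the model plus the exact prefix it encodes.
        raw = "\x00".join((model, prompt_prefix)).encode()
        return hashlib.sha256(raw).hexdigest()

    def get(self, model: str, prompt_prefix: str, build):
        k = self.key(model, prompt_prefix)
        if k in self.hot:                  # hit in the hot tier
            self.hot.move_to_end(k)
            return self.hot[k]
        if k in self.cold:                 # hit in the cold tier: reload, no rebuild
            value = self.cold.pop(k)
        else:                              # miss in both tiers: rebuild
            self.recomputes += 1
            value = build(prompt_prefix)
        self._admit(k, value)
        return value

    def _admit(self, k, value):
        self.hot[k] = value
        self.hot.move_to_end(k)
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value   # offload instead of discarding


cache = TieredKVCache(hot_capacity=1)
fake_prefill = lambda prefix: f"kv({prefix})"
cache.get("model-a", "system prompt", fake_prefill)  # rebuilt
cache.get("model-b", "system prompt", fake_prefill)  # rebuilt; model-a offloaded
cache.get("model-a", "system prompt", fake_prefill)  # reloaded from the cold tier
print(cache.recomputes)  # 2
```

Switching back to `model-a` costs a reload rather than a rebuild, which is the effect the paragraph above describes. A real system would also have to bound the cold tier and handle partial prefix matches, which this sketch leaves out.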

VS. NFS

Versus the NFS baseline

Third-party results on the same Ascend platform and workload (excerpt).

Metric	NFS baseline	ZK-Storage WS5000	Gain
DeepSeek-32B model load	563.85 s	6.62 s	85.17×
Training checkpoint load	131.37 s	10.55 s	12.45×
Token throughput (40 switches/day)	21.7%	99.1%	+356.9%
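The Gain column can be reproduced from the other two columns. The short check below assumes the time rows report baseline divided by new time, and the throughput row reports relative increase over the baseline; the function names are illustrative.

```python
def speedup(baseline_s: float, new_s: float) -> float:
    """How many times shorter the new time is than the baseline time."""
    return baseline_s / new_s


def relative_gain_pct(baseline: float, new: float) -> float:
    """Relative increase over the baseline, in percent."""
    return (new - baseline) / baseline * 100.0


print(f"{speedup(563.85, 6.62):.2f}x")           # DeepSeek-32B model load
print(f"{speedup(131.37, 10.55):.2f}x")          # training checkpoint load
print(f"{relative_gain_pct(21.7, 99.1):+.1f}%")  # token throughput
```

This prints 85.17x, 12.45x and +356.7%. The two load-time rows match the table exactly; the throughput row comes out slightly below the listed +356.9%, presumably because the published figure was computed from unrounded measurements.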

Self-controlled, domestic-ready

Deeply optimized for Huawei Ascend and other domestic accelerators, with 90%+ coverage; adaptation for AMD and xFusion is in testing (subject to final reports). Meets the sovereignty needs of enterprises and AI centers.

INTELLECTUAL PROPERTY

Core-technology patent portfolio

8 invention patents filed across disaggregation, GPU-direct, lossless fabric and KV-cache scheduling.

Filed / under examination

Disaggregated all-flash storage expansion system and method

Disaggregation · all-flash scale-out

Filed / under examination

NVMe-based all-flash storage cluster construction method and system

NVMe all-flash cluster

Filed / under examination

RoCEv2-based lossless network transmission method, system and storage medium

RoCEv2 lossless fabric

Filed / under examination

AI-based storage-network link optimization method and computer device

AI storage-link optimization

Filed / under examination

GPU-memory data pass-through storage system and method

GPU-memory pass-through

Filed / under examination

Dynamic-routing storage-network load-balancing method and system

Dynamic-routing load balancing

Filed / under examination

Active-active controller storage fault-tolerance control system and method

Active-active fault tolerance

Filed / under examination

Inference-oriented KV-Cache tiered-scheduling method and system

KV-Cache tiered scheduling

IP statement (fact-based)

These 8 are the company's invention-patent filings covering ZK-Storage core technology; all are currently filed and under examination. We deliberately show no patent numbers here and will publish the certificate numbers once the patents are granted. Separately, our Chief Scientist has personally filed 67 invention patents, of which 34 have been granted (including 2 in the US); see the Chief Scientist section on the Company page.

Benchmark it on your own workload

2 live demo units are ready for immediate PoC. Let the data do the talking.


Last updated: 2026-06-24