shard — gpt-oss-120B over WAN

Ask anything. The 120B model is split across four consumer GPUs in different US states; a small draft model proposes tokens and the swarm verifies them in one round-trip.

tokens · tok/s · accept/round · traversals

The draft proposes 4 tokens → the 120B model verifies all 4 in one traversal of the chain → the longest matching prefix is committed. Decoding is greedy, so the output is identical to what the 120B model would produce on its own.
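
The propose/verify/commit loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code behind this page: `draft_next` and `target_next` are hypothetical toy stand-ins for the draft and the 120B model, and the verification that would be one batched pass over the GPU chain is simulated here with sequential calls. The sketch also assumes that the target's own token is committed at the first mismatch (a common choice that guarantees progress every round); the description above only states that the matching prefix is committed.

```python
from typing import Callable, List, Sequence, Tuple

Token = int
NextToken = Callable[[Sequence[Token]], Token]  # greedy "argmax" next-token function


def target_next(ctx: Sequence[Token]) -> Token:
    """Toy stand-in for the large model (hypothetical, deterministic)."""
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_next(ctx: Sequence[Token]) -> Token:
    """Toy stand-in for the draft: agrees with the target except at every 5th position."""
    guess = target_next(ctx)
    return (guess + 1) % 50 if len(ctx) % 5 == 0 else guess


def speculative_round(ctx: Sequence[Token], draft: NextToken,
                      target: NextToken, k: int = 4) -> List[Token]:
    """One round: draft proposes k tokens, target verifies, a prefix is committed."""
    proposed: List[Token] = []
    for _ in range(k):
        proposed.append(draft(list(ctx) + proposed))

    # In a real deployment this check is a single batched traversal of the chain;
    # here it is simulated one position at a time.
    committed: List[Token] = []
    for tok in proposed:
        expected = target(list(ctx) + committed)
        if expected != tok:
            # Assumption: commit the target's token at the first mismatch.
            committed.append(expected)
            break
        committed.append(tok)
    return committed


def generate(prompt: Sequence[Token], draft: NextToken, target: NextToken,
             max_new_tokens: int, k: int = 4) -> Tuple[List[Token], int]:
    """Run rounds until max_new_tokens are committed; also count traversals."""
    out = list(prompt)
    traversals = 0
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_round(out, draft, target, k))
        traversals += 1
    return out[:len(prompt) + max_new_tokens], traversals


def greedy_decode(prompt: Sequence[Token], target: NextToken,
                  max_new_tokens: int) -> List[Token]:
    """Baseline: plain greedy decoding, one target call per token."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(target(out))
    return out
```

Calling `generate([1, 2, 3], draft_next, target_next, max_new_tokens=12)` returns the same tokens as `greedy_decode` with the same arguments, while using fewer target traversals than tokens generated. Every committed token is one the target itself would have picked, which is why greedy speculative decoding does not change the output.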