a compact, self-describing serialization format inspired by bencode

the bendncode format specification

definitions

  • leb128: stores an arbitrary-size unsigned integer in 7-bit groups, one group per byte, with the 8th (high) bit of each byte as the continuation flag
  • sleb128: the signed counterpart of leb128, with the sign bit written at the end
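
the two definitions above can be sketched in python as follows. this is a minimal sketch, not the reference implementation: the function names are hypothetical, and the signed variant assumes the conventional sleb128 layout (two's complement, low 7-bit group first, sign carried by bit 6 of the final byte), which may differ from what "sign bit at the end" means in this format.

```python
def encode_leb128(n):
    """Unsigned LEB128: 7 payload bits per byte, low group first."""
    if n < 0:
        raise ValueError("leb128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more groups follow
        else:
            out.append(byte)  # high bit clear: last group
            return bytes(out)


def decode_leb128(data, pos=0):
    """Return (value, position just past the encoded integer)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def encode_sleb128(n):
    """Signed LEB128 (assumed conventional layout, see lead-in)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7  # arithmetic shift, so negative values converge to -1
        # stop once the remaining bits are pure sign extension of bit 6
        done = (n == 0 and not byte & 0x40) or (n == -1 and byte & 0x40)
        if done:
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)


def decode_sleb128(data, pos=0):
    """Return (value, position just past the encoded integer)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # bit 6 of the final byte is the sign
                result -= 1 << shift
            return result, pos
```

under these assumptions, `encode_leb128(624485)` yields the three bytes `e5 8e 26`, and `encode_sleb128(-123456)` yields `c0 bb 78`.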

decoding steps

because the format is self-describing, a reader can decode a value without knowing in advance what type it will be deserialized into.

the first step of any decode is to read the tag. no variant has a closing tag; each has only an opening tag.

  • \x0 (False) and \x1 (True) map straight to a boolean
  • i (Int) uses the sleb128 encoding as defined
  • u (Uint) uses the leb128 encoding
  • f (Float) simply encodes 4 bytes
  • d (Double) encodes 8 bytes
  • z (None) means no value; Some means read the next tag and decode the value it introduces
  • l (List) is encoded as leb128(len), followed by len tagged values, which are not necessarily homogeneous.
  • s (String) is encoded as leb128(len), followed by len bytes.
  • m (Map) is encoded as leb128(len), followed by len key/value pairs: kₙ, vₙ, ..
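
the tag dispatch described above might look like the following python sketch. several details are assumptions rather than part of the text: floats and doubles are read as little-endian ieee 754, sleb128 uses the conventional two's complement layout, map keys and values are each tagged, and strings are returned as raw bytes because no text encoding is stated. the Some case is left out because its tag byte is not given here, and the enum tags are not handled. all function names are hypothetical.

```python
import struct


def read_leb128(buf, pos):
    """Unsigned LEB128; returns (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(buf, pos):
    """Signed LEB128 (assumed conventional layout); returns (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:
                result -= 1 << shift
            return result, pos


def decode_any(buf, pos=0):
    """Decode one tagged value; returns (value, next position)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"\x00":
        return False, pos
    if tag == b"\x01":
        return True, pos
    if tag == b"i":
        return read_sleb128(buf, pos)
    if tag == b"u":
        return read_leb128(buf, pos)
    if tag == b"f":  # assumption: little-endian
        return struct.unpack_from("<f", buf, pos)[0], pos + 4
    if tag == b"d":  # assumption: little-endian
        return struct.unpack_from("<d", buf, pos)[0], pos + 8
    if tag == b"z":
        return None, pos
    if tag == b"s":
        n, pos = read_leb128(buf, pos)
        return bytes(buf[pos:pos + n]), pos + n
    if tag == b"l":
        n, pos = read_leb128(buf, pos)
        items = []
        for _ in range(n):
            item, pos = decode_any(buf, pos)
            items.append(item)
        return items, pos
    if tag == b"m":
        n, pos = read_leb128(buf, pos)
        pairs = []  # list of pairs, since keys may be unhashable (e.g. lists)
        for _ in range(n):
            key, pos = decode_any(buf, pos)
            value, pos = decode_any(buf, pos)
            pairs.append((key, value))
        return pairs, pos
    raise ValueError(f"unsupported tag {tag!r} at offset {pos - 1}")
```

for example, the bytes `l 03 u 2a s 02 h i 01` decode under these assumptions to a three-element list holding 42, the two-byte string "hi", and True. because no variant has a closing tag, the decoder only ever needs the returned position to know where the next value starts.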

enum tags

  • every enum tag is immediately followed by the variant index, encoded as leb128
  • v, newtype variant, is followed by the wrapped value, which is itself tagged.
  • x, unit variant, has no further steps.
  • y, struct variant, is encoded identically to Map.
  • n, tuple variant, is encoded identically to List.
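
a rough python sketch of the enum rules above follows. it assumes that "encoded identically to Map" and "encoded identically to List" refer to the body only (leb128(len) followed by the contents, with no extra m or l tag after the variant index); that reading is an interpretation, not something the text confirms. `decode_enum`, its result shape, and the `decode_value` hook (a caller-supplied decoder for nested tagged values) are hypothetical names introduced for illustration.

```python
def read_leb128(buf, pos):
    """Unsigned LEB128; returns (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def decode_enum(buf, pos, decode_value):
    """Decode one enum value starting at its tag.

    decode_value(buf, pos) must return (value, next position) for any
    nested tagged value. Returns ((kind, index, payload), next position).
    """
    tag = buf[pos:pos + 1]
    pos += 1
    if tag not in (b"v", b"x", b"y", b"n"):
        raise ValueError(f"not an enum tag: {tag!r}")
    # every enum tag is followed by the variant index
    index, pos = read_leb128(buf, pos)
    if tag == b"x":  # unit variant: nothing more to read
        return ("unit", index, None), pos
    if tag == b"v":  # newtype variant: one tagged value
        inner, pos = decode_value(buf, pos)
        return ("newtype", index, inner), pos
    n, pos = read_leb128(buf, pos)
    if tag == b"n":  # tuple variant: body laid out like a List
        items = []
        for _ in range(n):
            item, pos = decode_value(buf, pos)
            items.append(item)
        return ("tuple", index, items), pos
    fields = []  # struct variant: body laid out like a Map
    for _ in range(n):
        key, pos = decode_value(buf, pos)
        value, pos = decode_value(buf, pos)
        fields.append((key, value))
    return ("struct", index, fields), pos
```

passing the nested decoder in as a parameter keeps the sketch independent of how the non-enum tags are handled; a full decoder would typically fold these four branches into the same dispatch as the other tags.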

flowchart

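decode any → read the tag, then branch on it:

  • 0x0 → False
  • 0x1 → True
  • i → Int
  • u → Uint
  • f → Float
  • d → Double
  • z → None
  • Some → next
  • l → List → len → read N
  • s → String → len → bytes
  • m → Map → len → read pairs
  • one of the enum tags → variant index, then v → new, x → unit, y → struct, n → tuple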