Variable Length Quantities

Dynamically Encoding Unsigned Values

Assume an arbitrary sized buffer of arbitrary data, how should the
length be encoded? Is it small (<256)? Prepend with an 8-bit length
field! Will it always be below 16,000? A 16-bit field should suffice.
What about 4 billion? 32-bits then. What if we start combining
transmissions though? We better be safe and use 64-bits (18 billion
billion), but now we're using 8x more space per number. Someone also
noted the normal case would have a lot of 0-bytes in each size field
providing easier cryptanalysis, but you did remember to compress THEN
encrypt, right? Deciding how many bytes to use to describe a field is
always a tradeoff between flexibility (large numbers) and resource
constraints (small representations).
What if we made the length field arbitrary size too?

Smart folks have looked at this problem and found many ways to represent Variable-Length quantities, including 645,000 patents. elicarter (via retracile.net) provided a sample implementation and a good decription of the problem including the risk of non-unique representations. Taking a similar tack, we've developed simple C/C++ and python encoder/decoders to tackle the problem.

Python

C/C++

- Kelson (kelson@shysecurity.com)

Smart folks have looked at this problem and found many ways to represent Variable-Length quantities, including 645,000 patents. elicarter (via retracile.net) provided a sample implementation and a good decription of the problem including the risk of non-unique representations. Taking a similar tack, we've developed simple C/C++ and python encoder/decoders to tackle the problem.

Python

def read(vStream): rval = 0 for idx in range(len(vStream)): rval = (rval * 128) + ord(vStream[idx]) if not ord(vStream[idx]) & 128: return (idx+1,rval) rval = rval - 128 + 1 return (0,rval) def write(num): rval = [num%128] while num >= 128: num = (num/128)-1 last = (num%128)+128 rval.insert(0,last) return ''.join([chr(i) for i in rval])

C/C++

byte *encode_varlint(u64 num, u64 *elen=NULL) { u32 buflen = 1; for(u64 tmp=num/128;tmp > 0;tmp/=128) ++buflen; byte *data = new byte[buflen]; if(elen) *elen = buflen; for(int i=buflen;i>0;i--) { data[i-1] = (num % 128)+128; num = (num/128)-1; } data[buflen-1] -= 128; return data; } u64 decode_varlint(const byte *buf, u64 len, u64 *shift) { u32 i=0; u64 retval = 0; for(;i<len;i++) { retval = (retval * 128) + buf[i]; if(0 == (buf[i] & 128)) break; retval = retval - 128 + 1; } if(shift) *shift = i + 1; return retval; }

- Kelson (kelson@shysecurity.com)