Assume an arbitrary sized buffer of arbitrary data, how should the
length be encoded? Is it small (<256)? Prepend with an 8-bit length
field! Will it always be below 16,000? A 16-bit field should suffice.
What about 4 billion? 32-bits then. What if we start combining
transmissions though? We better be safe and use 64-bits (18 billion
billion), but now we're using 8x more space per number. Someone also
noted the normal case would have a lot of 0-bytes in each size field
providing easier cryptanalysis, but you did remember to compress THEN
encrypt, right? Deciding how many bytes to use to describe a field is
always a tradeoff between flexibility (large numbers) and resource
constraints (small representations).
What if we made the length field arbitrary size too?
Smart folks have looked at this problem and found many ways to represent
Variable-Length quantities, including
645,000 patents. elicarter (via
retracile.net)
provided a sample implementation and a good decription of the problem
including the risk of non-unique representations. Taking a similar
tack, we've developed simple C/C++ and python encoder/decoders to
tackle the problem.
Python
def read(vStream):
rval = 0
for idx in range(len(vStream)):
rval = (rval * 128) + ord(vStream[idx])
if not ord(vStream[idx]) & 128: return (idx+1,rval)
rval = rval - 128 + 1
return (0,rval)
def write(num):
rval = [num%128]
while num >= 128:
num = (num/128)-1
last = (num%128)+128
rval.insert(0,last)
return ''.join([chr(i) for i in rval])
C/C++
byte *encode_varlint(u64 num, u64 *elen=NULL)
{
u32 buflen = 1;
for(u64 tmp=num/128;tmp > 0;tmp/=128) ++buflen;
byte *data = new byte[buflen];
if(elen) *elen = buflen;
for(int i=buflen;i>0;i--)
{
data[i-1] = (num % 128)+128;
num = (num/128)-1;
}
data[buflen-1] -= 128;
return data;
}
u64 decode_varlint(const byte *buf, u64 len, u64 *shift)
{
u32 i=0;
u64 retval = 0;
for(;i<len;i++)
{
retval = (retval * 128) + buf[i];
if(0 == (buf[i] & 128)) break;
retval = retval - 128 + 1;
}
if(shift) *shift = i + 1;
return retval;
}
- Kelson (kelson
@shysecurity.com)