Storing integers in Redis

by joe

I’ve been looking into Redis, and I wondered whether storing integers as keys and values in binary form would beat plain old strings. After asking on Stack Overflow, I ran my own experiments.

It looks like it is possible to use any byte string as a key.

For my application it made little difference whether I stored the strings or the packed integers. I imagine that the structures in Redis undergo some kind of alignment, so some bytes may be wasted regardless. The key is hashed in any case.

I used Python for my testing, so I was able to create the values using the struct.pack function. A long long weighs in at 8 bytes, which is quite large if there are a lot of them. Given the distribution of my integer values, I discovered that it could actually be advantageous to store them as strings, especially when encoded in hex.
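To make the size comparison concrete, here is a minimal sketch that measures one integer under the three encodings discussed: a fixed 8-byte pack, a decimal string and a hex string. The helper name encoded_sizes and the sample values are only illustrative, not taken from my test data.

```python
import struct


def encoded_sizes(number):
    """Return the byte length of a non-negative integer under three encodings."""
    return {
        # Unsigned long long, network byte order: always 8 bytes.
        "packed_u64": len(struct.pack("!Q", number)),
        # Plain decimal string.
        "decimal": len(str(number)),
        # Lowercase hex string without the "0x" prefix.
        "hex": len(format(number, "x")),
    }


for n in (7, 1000, 200000, 2**32, 2**63):
    print(n, encoded_sizes(n))
```

The fixed-width pack only wins once the decimal form exceeds 8 digits, and against hex only once the number needs more than 8 hex digits.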

As Redis strings are ‘Pascal-style’:

struct sdshdr {
  long len;
  long free;
  char buf[];
};

and given that we can store anything in there, I wrote a bit of extra Python to pack each number into the shortest possible type:

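from struct import pack


def do_pack(prefix, number):
    """
    Pack a non-negative number into the shortest possible string,
    preceded by a one-character prefix (a length-1 bytes object on Python 3).
    """
    # "!" selects network byte order with no padding between fields.
    if number < 0:
        raise ValueError("number must be non-negative")

    # uchar (1 byte)
    if number < (1 << 8*1):
        return pack("!cB", prefix, number)

    # ushort (2 bytes)
    elif number < (1 << 8*2):
        return pack("!cH", prefix, number)

    # uint (4 bytes)
    elif number < (1 << 8*4):
        return pack("!cI", prefix, number)

    # ulonglong (8 bytes)
    elif number < (1 << 8*8):
        return pack("!cQ", prefix, number)

    # Too large for any of the supported types.
    raise ValueError("number does not fit in 8 bytes")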
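Reading such a value back means picking the format from the length of the stored string. The sketch below is one way to do the round trip. The names pack_shortest and unpack_shortest are hypothetical, and the packer is a compact restatement of do_pack so that the block runs on its own. It assumes Python 3, where the prefix must be a length-1 bytes object.

```python
from struct import pack, unpack, calcsize

# Prefix char followed by a 1, 2, 4 or 8 byte unsigned integer.
FORMATS = ("!cB", "!cH", "!cI", "!cQ")


def pack_shortest(prefix, number):
    """Pack prefix + number using the smallest format that fits."""
    for fmt in FORMATS:
        # calcsize counts the prefix byte too, hence the "- 1".
        width = calcsize(fmt) - 1
        if 0 <= number < (1 << 8 * width):
            return pack(fmt, prefix, number)
    raise ValueError("number must fit in an unsigned 8-byte integer")


def unpack_shortest(data):
    """Recover (prefix, number) by matching the total length to a format."""
    for fmt in FORMATS:
        if len(data) == calcsize(fmt):
            return unpack(fmt, data)
    raise ValueError("unexpected length: %d" % len(data))


packed = pack_shortest(b"u", 300)
print(len(packed), unpack_shortest(packed))
```

Because each format has a distinct total length (2, 3, 5 or 9 bytes), the length alone is enough to tell them apart.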

This appears to make an insignificant saving, or none at all, probably due to boundary alignment in Redis. It also drives Python's CPU usage through the roof, making it somewhat unattractive.
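A rough way to see why the packing barely helps is to add up the bytes per string, based on the sdshdr struct above. The sketch below assumes 8-byte longs (typical for 64-bit Linux), the trailing NUL byte that sds appends, and an 8-byte allocator granularity. That last figure is a guess rather than something I measured, and any per-key overhead outside the string itself is ignored.

```python
def sds_size(payload_len, long_size=8, align=8):
    """
    Estimate the bytes used by one string: two long header fields,
    the payload, a trailing NUL, rounded up to `align` (assumed).
    """
    raw = 2 * long_size + payload_len + 1
    return (raw + align - 1) // align * align


# Payload lengths: prefix + 1 byte, decimal "200000", prefix + 8 bytes.
for payload in (2, 6, 9):
    print(payload, sds_size(payload))
```

Under these assumptions a 2-byte payload and a 6-byte payload both land on 24 bytes, and even the 9-byte payload only reaches 32, so the header and rounding swamp most of what the packing saves.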

The data I was working with was 200,000 zsets of consecutive integer => {(weight, random integer) × 100}, plus an inverted index (based on random data). dbsize reports 1,200,001 keys.

Final memory use of the server: 1.28 GB RAM, 1.32 GB virtual. Various tweaks made a difference of no more than 10 megabytes either way.

So my conclusion:

Don’t bother encoding into fixed-size data types. Just store the integer as a string, in hex if you want. It won’t make all that much difference.
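For completeness, the hex route needs nothing beyond the built-ins. A minimal sketch, with to_hex and from_hex as illustrative names only:

```python
def to_hex(number):
    """Encode a non-negative integer as a lowercase hex string (no "0x")."""
    return format(number, "x")


def from_hex(text):
    """Decode a hex string produced by to_hex back into an integer."""
    return int(text, 16)


print(to_hex(200000), from_hex(to_hex(200000)))
```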

References:

http://docs.python.org/library/struct.html

http://redis.io/topics/internals-sds
