Redis for .NET Developers – Redis HyperLogLog Datatype

Redis HyperLogLog estimates the cardinality of a set; put less mathematically, it is a probabilistic data structure for counting unique values. The algorithm uses randomization to approximate the number of unique elements in a set in constant time per operation and with a small, fixed memory footprint. Each HyperLogLog key takes at most about 12 KB and has a standard error of 0.81%. There is no practical limit on the number of items you can count in the HyperLogLog datatype, unless you approach 2^64 items. HyperLogLog is a good fit if you are planning to use some form of unique count, e.g. unique IP addresses.
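To see where the 12 KB and 0.81% figures come from, here is a small back-of-the-envelope sketch. It assumes the dense Redis encoding of 16384 registers at 6 bits each, and the usual HyperLogLog error estimate of 1.04 / sqrt(registers). The class name HllSizing is made up for this illustration.

```csharp
using System;

// Illustrative arithmetic only; HllSizing is a hypothetical helper, not part of any library.
public static class HllSizing
{
    public const int Registers = 1 << 14;     // 16384 registers (assumed dense Redis encoding)
    public const int BitsPerRegister = 6;     // each register stores a small counter

    // Memory used by the registers themselves, in bytes.
    public static int DenseBytes()
    {
        return Registers * BitsPerRegister / 8;
    }

    // Standard error of the estimate for this register count.
    public static double StandardError()
    {
        return 1.04 / Math.Sqrt(Registers);
    }

    public static void Main()
    {
        Console.WriteLine(DenseBytes());              // 12288 bytes, i.e. 12 KB
        Console.WriteLine(StandardError() * 100.0);   // roughly 0.81 (percent)
    }
}
```

The register area is the same size whether the key has seen ten elements or ten billion, which is why the footprint is described as fixed. Redis adds a small header on top of these 12288 bytes.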

Redis HyperLogLog Datatype – Operations

  • PFADD: Adds one or more elements to a HyperLogLog. Returns 1 if at least one internal register was altered, 0 otherwise. O(1) per element added.
  • PFCOUNT: Returns the approximate cardinality computed by the HyperLogLog, or 0 if the key does not exist. O(1) for a single key.
  • PFMERGE: Merges multiple HyperLogLog values into a single value that approximates the cardinality of the union of the source sets. O(N), where N is the number of HyperLogLogs being merged.
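If you are wondering what "an internal register was altered" means, the toy class below may help. It is only a sketch of the general HyperLogLog idea and not how Redis implements it: Redis packs its registers into 6 bits, has a sparse encoding and uses a different hash function. ToyHyperLogLog and its method names are invented for this post, and SHA-256 is used as the hash purely for convenience.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Toy HyperLogLog for illustration only; NOT the Redis implementation.
public sealed class ToyHyperLogLog
{
    private const int IndexBits = 14;                   // 2^14 = 16384 registers
    private const int RegisterCount = 1 << IndexBits;
    private readonly byte[] registers = new byte[RegisterCount];

    // Analogue of PFADD: true if a register was altered, false otherwise.
    public bool Add(string item)
    {
        ulong hash = Hash64(item);
        int index = (int)(hash >> (64 - IndexBits));    // top 14 bits select a register
        ulong rest = hash << IndexBits;                 // remaining 50 bits, left-aligned

        // Rank = 1-based position of the first 1 bit in the remaining bits.
        int rank = 1;
        while (rank <= 64 - IndexBits && (rest & 0x8000000000000000UL) == 0)
        {
            rank++;
            rest <<= 1;
        }

        if (rank <= registers[index]) return false;     // nothing new learned
        registers[index] = (byte)rank;
        return true;
    }

    // Analogue of PFCOUNT: the estimate is derived from the registers alone.
    public long Count()
    {
        double sum = 0;
        int zeros = 0;
        foreach (byte r in registers)
        {
            sum += Math.Pow(2.0, -r);
            if (r == 0) zeros++;
        }

        double m = RegisterCount;
        double alpha = 0.7213 / (1.0 + 1.079 / m);
        double estimate = alpha * m * m / sum;

        // Small-range correction (linear counting) while many registers are still empty.
        if (estimate <= 2.5 * m && zeros > 0)
            estimate = m * Math.Log(m / zeros);

        return (long)Math.Round(estimate);
    }

    // Analogue of PFMERGE: keep the larger value of each register pair.
    public void MergeFrom(ToyHyperLogLog other)
    {
        for (int i = 0; i < RegisterCount; i++)
            registers[i] = Math.Max(registers[i], other.registers[i]);
    }

    private static ulong Hash64(string item)
    {
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(item));
            return BitConverter.ToUInt64(digest, 0);
        }
    }
}

public static class ToyDemo
{
    public static void Main()
    {
        var letters = new ToyHyperLogLog();
        Console.WriteLine(letters.Add("a"));   // True: an empty register was raised
        Console.WriteLine(letters.Add("a"));   // False: same hash, no register changes
        letters.Add("b");
        letters.Add("c");

        var mixed = new ToyHyperLogLog();
        foreach (string s in new[] { "1", "2", "3", "a" }) mixed.Add(s);

        letters.MergeFrom(mixed);
        Console.WriteLine(letters.Count());    // an estimate of the 6 distinct values
    }
}
```

Adding a value never stores the value itself, only a hint about its hash, so repeated values cost nothing extra. Merging is a register-by-register maximum, which is why a merged HyperLogLog counts shared elements once.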

C# code using Redis HyperLogLog Datatype

    using System;
    using StackExchange.Redis;

    class Program
    {
        static void Main(string[] args)
        {
            // RedisStore is the connection helper used in this series (see the repository linked below)
            var redis = RedisStore.RedisCache;

            RedisKey key = "setKey";
            RedisKey alphaKey = "alphaKey";
            RedisKey destinationKey = "destKey";

            // Start from a clean slate so earlier runs do not affect the counts
            redis.KeyDelete(key, CommandFlags.FireAndForget);
            redis.KeyDelete(alphaKey, CommandFlags.FireAndForget);
            redis.KeyDelete(destinationKey, CommandFlags.FireAndForget);

            // PFADD, one element per call
            redis.HyperLogLogAdd(key, "a");
            redis.HyperLogLogAdd(key, "b");
            redis.HyperLogLogAdd(key, "c");

            // PFCOUNT
            Console.WriteLine(redis.HyperLogLogLength(key)); //output 3

            redis.HyperLogLogAdd(alphaKey, "1");
            redis.HyperLogLogAdd(alphaKey, "2");
            redis.HyperLogLogAdd(alphaKey, "3");

            Console.WriteLine(redis.HyperLogLogLength(alphaKey)); //output 3

            // "a" now exists in both HyperLogLogs
            redis.HyperLogLogAdd(alphaKey, "a");

            Console.WriteLine(redis.HyperLogLogLength(alphaKey)); //output 4

            // PFMERGE: destKey holds the union of key and alphaKey
            redis.HyperLogLogMerge(destinationKey, key, alphaKey);

            // "a" is shared by both sources, so it is counted only once
            Console.WriteLine(redis.HyperLogLogLength(destinationKey)); //output 6 which is (a, b, c, 1, 2, 3)

            Console.ReadKey();
        }
    }
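The sample above adds one value per call and merges into a destination key before counting. As a variation, the sketch below uses the array overloads in StackExchange.Redis to add several values with one PFADD and to count the union of two keys with one PFCOUNT, without storing a merged key. The connection string localhost:6379, the key names and the class and method names are assumptions made for this example.

```csharp
using System;
using StackExchange.Redis;

public static class HyperLogLogBatchDemo
{
    // Returns the estimated number of unique visitors across two (hypothetical) daily keys.
    public static long EstimateUniqueVisitors(IDatabase redis)
    {
        RedisKey monday = "visitors:monday";     // hypothetical key names
        RedisKey tuesday = "visitors:tuesday";
        redis.KeyDelete(new RedisKey[] { monday, tuesday });

        // One PFADD with several members; true if at least one register changed.
        bool changed = redis.HyperLogLogAdd(
            monday, new RedisValue[] { "10.0.0.1", "10.0.0.2", "10.0.0.3" });
        Console.WriteLine(changed);

        // Re-adding a value that is already represented changes nothing.
        bool changedAgain = redis.HyperLogLogAdd(monday, "10.0.0.1");
        Console.WriteLine(changedAgain);

        redis.HyperLogLogAdd(tuesday, new RedisValue[] { "10.0.0.3", "10.0.0.4" });

        // PFCOUNT with several keys estimates the cardinality of their union.
        return redis.HyperLogLogLength(new RedisKey[] { monday, tuesday });
    }

    public static void Main()
    {
        // Assumption: a Redis server is listening on localhost:6379.
        using (ConnectionMultiplexer connection = ConnectionMultiplexer.Connect("localhost:6379"))
        {
            long unique = EstimateUniqueVisitors(connection.GetDatabase());
            Console.WriteLine(unique);   // an estimate of the 4 distinct addresses
        }
    }
}
```

Counting several keys at once is handy for ad hoc questions such as "unique visitors over these two days", while HyperLogLogMerge is the better choice when you want to keep the combined result around.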
  

So this covers the basic usage of the Redis HyperLogLog datatype. In the next blog post I will cover how to use Pub/Sub in Redis.

For the code please visit
https://github.com/taswar/RedisForNetDevelopers/tree/master/8.RedisHyperLogLog

For previous Redis topics

  1. Intro to Redis for .NET Developers
  2. Redis for .NET Developer – Connecting with C#
  3. Redis for .NET Developer – String Datatype
  4. Redis for .NET Developer – String Datatype part 2
  5. Redis for .NET Developer – Hash Datatype
  6. Redis for .NET Developer – List Datatype
  7. Redis for .NET Developer – Redis Sets Datatype
  8. Redis for .NET Developer – Redis Sorted Sets Datatype
Taswar Bhatti:

Comments

  • Hi Taswar,
    HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. [Wikipedia].
    Redis keeps its data in memory, and it's really fast. Why would we need an estimate?
    Thanks

    • One of the reasons Redis introduced HyperLogLog is that the memory footprint it needs for a count is minimal. If you wish to read more on Redis HyperLogLog, I would suggest looking at this blog post: http://antirez.com/news/75
      It gives a very good explanation of it.
