Snake_Byte #32: Default behaviors with dictionaries

Key/value collections are so useful that nearly all modern programming languages provide one. Python's flavor of this type is called a dict (dictionary), and you can use it in the usual ways that you would expect:
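A minimal sketch of that everyday usage (the config dict and its keys are illustrative assumptions, not taken from a real project):

```python
# Illustrative data only; the names and values are assumptions.
config = {"host": "localhost", "port": 8080}

config["debug"] = True    # add a new key
config["port"] = 9090     # update an existing key
port = config["port"]     # retrieve a value by key
del config["debug"]       # remove a key

print(port)         # 9090
print(len(config))  # 2
```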

As seen in the example above, retrieving a value for a key is trivial. However, you might be wondering what happens if you try to retrieve a value for a key that doesn't exist in your dict:
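A small sketch of that failure case, again with a made-up config dict. The lookup is wrapped in try/except only so the snippet runs to completion:

```python
config = {"host": "localhost"}  # illustrative data; note there is no "port" key

try:
    config["port"]
except KeyError as exc:
    # The missing key is available as the exception's first argument.
    missing_key = exc.args[0]
    print(f"KeyError: {missing_key!r}")  # KeyError: 'port'
```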

Python's default behavior is to raise a KeyError exception, so it is often useful to verify that a key exists before attempting to retrieve its value:
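One way to write that check uses the in operator. The fallback value of 8080 is purely an assumption for illustration:

```python
config = {"host": "localhost"}  # illustrative data

if "port" in config:
    port = config["port"]
else:
    port = 8080  # hypothetical fallback when the key is absent

print(port)  # 8080
```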

Python also provides a handy get method for dicts. Calling get on a dict returns the value for the specified key if it exists; otherwise it returns None, or the value passed as the optional second argument. For example:
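A short sketch with the same illustrative config dict. Note that get never adds the missing key to the dict:

```python
config = {"host": "localhost"}  # illustrative data

host = config.get("host")                   # key exists
port = config.get("port")                   # key missing, no default given
port_or_default = config.get("port", 8080)  # key missing, default given

print(host)              # localhost
print(port)              # None
print(port_or_default)   # 8080
print("port" in config)  # False
```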

Most of the time, the usage outlined above is enough to cover all of your needs with regard to setting, retrieving, and removing items in a dict. But what if you found yourself in a situation where you wanted more control over how a dict behaves when you reference a key that doesn't exist? That's where the special __missing__ method comes in: if a dict subclass defines it, a d[key] lookup on a missing key calls __missing__(key) instead of raising KeyError.

To demonstrate, let's imagine that you are working on a web service and you want to use a dict to keep a count of the number of requests per IP address. The traditional way to accomplish this could look something like:
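A sketch of that traditional approach. The record_request helper and the sample addresses are hypothetical stand-ins for real request handling:

```python
ip_counts = {}

def record_request(ip):
    """Increment the request count for ip (hypothetical helper)."""
    # The key must exist before it can be incremented.
    if ip not in ip_counts:
        ip_counts[ip] = 0
    ip_counts[ip] += 1

# Made-up addresses standing in for incoming requests.
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    record_request(ip)

print(ip_counts)  # {'10.0.0.1': 2, '10.0.0.2': 1}
```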

But by implementing our own custom dict class that utilizes the __missing__ method, we can simplify the counter increment portion of our code:
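A minimal sketch of such a class. The exact body of IPCounter is an assumption: this version simply returns 0 for unknown keys and lets the += assignment store the new count:

```python
class IPCounter(dict):
    """A dict that treats missing keys as having a count of 0."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when key is not present.
        # Returning 0 does not store the key; the += below does.
        return 0

ip_counts = IPCounter()

# Made-up addresses standing in for incoming requests.
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    ip_counts[ip] += 1  # no existence check needed

print(ip_counts)  # {'10.0.0.1': 2, '10.0.0.2': 1}
```

Note that __missing__ is only consulted for d[key] lookups; the get method and the in operator behave as they do on a plain dict.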

That looks a bit more straightforward, doesn't it?

Wouldn't it be nice, though, if we didn't have to build our own custom dict classes for this behavior? Well, Python agrees with you, and that's where collections.defaultdict() comes in.

collections.defaultdict() is simply a subclass of dict that has its own __missing__ behavior preconfigured for you. You pass in a default factory as the first argument; when a specified key is not found, the factory is called with no arguments, and its result is stored under that key and returned.
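To make the factory behavior concrete, here is a minimal sketch that uses int as the factory (int() returns 0), applied to the request-counting scenario with made-up addresses:

```python
from collections import defaultdict

ip_counts = defaultdict(int)  # missing keys start at int(), i.e. 0

for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    ip_counts[ip] += 1

print(dict(ip_counts))  # {'10.0.0.1': 2, '10.0.0.2': 1}

# Unlike dict.get, merely reading a missing key stores the default.
unseen = ip_counts["192.168.0.9"]
print(unseen)                       # 0
print("192.168.0.9" in ip_counts)   # True
```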

For example, in the above ip_counts code we could simply use collections.defaultdict(int) instead of creating our own IPCounter class, but let's try something slightly more interesting. Continuing with the web service example, let's assume that you want to use a dict to store a list of request timestamps for each IP address. The traditional way to do this could look something like:
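A sketch of the traditional version. The ip_requests name, the record_request helper, and the fixed timestamp values are all assumptions for illustration; a real service would likely use something like time.time():

```python
ip_requests = {}

def record_request(ip, timestamp):
    """Append timestamp to the list kept for ip (hypothetical helper)."""
    # Create the list the first time an address is seen.
    if ip not in ip_requests:
        ip_requests[ip] = []
    ip_requests[ip].append(timestamp)

# Fixed, made-up timestamps keep the example deterministic.
record_request("10.0.0.1", 1000.0)
record_request("10.0.0.2", 1001.5)
record_request("10.0.0.1", 1003.2)

print(ip_requests)  # {'10.0.0.1': [1000.0, 1003.2], '10.0.0.2': [1001.5]}
```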

With collections.defaultdict(), however, we can give our dict a list default factory for when it encounters a missing key, and simplify the above to:
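A sketch of the defaultdict version, using the same hypothetical ip_requests name and made-up timestamps:

```python
from collections import defaultdict

ip_requests = defaultdict(list)  # missing keys start as list(), i.e. []

# Fixed, made-up timestamps keep the example deterministic.
ip_requests["10.0.0.1"].append(1000.0)
ip_requests["10.0.0.2"].append(1001.5)
ip_requests["10.0.0.1"].append(1003.2)

print(dict(ip_requests))  # {'10.0.0.1': [1000.0, 1003.2], '10.0.0.2': [1001.5]}
```

Each missing key gets its own fresh list, because the factory is called anew for every miss.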

Now our implementation is more concise and readable, with no need for the overhead of a new custom class.

For more examples of its usage, check out the collections.defaultdict documentation.

About sethdavis

Seth is the Director of API development at PokitDok. He has been talking to computers in various languages since the early 90s, and has a passion for all things Linux, Python, and Charleston.
