Because key/value collections are so useful, nearly all modern programming languages provide one. Python's flavor of this type is called a dict (dictionary), and Python lets you use it in the usual ways that you would expect:
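A minimal sketch of these basic operations, using a hypothetical ages mapping as sample data:

```python
# Hypothetical sample data: names mapped to ages.
ages = {"alice": 30, "bob": 25}

# Retrieve the value stored under a key.
alice_age = ages["alice"]

# Add a new item, then update an existing one.
ages["carol"] = 41
ages["bob"] = 26

# Remove an item.
del ages["alice"]

print(alice_age)  # 30
print(ages)       # {'bob': 26, 'carol': 41}
```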
As seen in the example above, retrieving a value for a key is trivial. However, you might be wondering what happens if you try to retrieve a value for a key that doesn't exist in your dict. Python's default behavior is to raise a KeyError exception, so it is often useful to verify that a key exists before attempting to retrieve its value:
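As a rough illustration, again with a hypothetical ages mapping, the failure and the membership check might look like this:

```python
ages = {"alice": 30, "bob": 25}

# Indexing with a key that is not present raises KeyError.
try:
    ages["dave"]
except KeyError as exc:
    missing_key = exc.args[0]  # the key that could not be found
    print(f"no entry for {missing_key!r}")

# Testing membership with `in` first avoids the exception.
if "dave" in ages:
    dave_age = ages["dave"]
else:
    dave_age = None
```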
Python also provides a handy get method for dicts. Calling get on a dict returns the value for the specified key if it exists, or returns None (or the value passed as the second argument) if the key doesn't exist. For example:
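A short sketch of both forms of get, using the same hypothetical ages mapping:

```python
ages = {"alice": 30, "bob": 25}

found = ages.get("alice")       # key exists, so its value is returned
absent = ages.get("dave")       # key is missing, so None is returned
fallback = ages.get("dave", 0)  # key is missing, so the supplied default is returned

print(found, absent, fallback)  # 30 None 0

# get never adds the missing key to the dict.
print("dave" in ages)           # False
```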
Most of the time, the usage outlined above is enough to cover all of your needs with regard to setting, retrieving, and removing items in a dict. But what if you wanted more control over how a dict behaves when you reference a key that doesn't exist? That's where the special __missing__ method comes in. If a dict subclass defines __missing__, Python calls it whenever a d[key] lookup fails to find the key.
To demonstrate, let's imagine that you are working on a web service and you want to use a dict to keep a count of the number of requests per IP address. The traditional way to accomplish this could look something like:
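One way this traditional approach might be written is sketched below. The record_request helper and the sample addresses are hypothetical and only stand in for real request handling:

```python
ip_counts = {}

def record_request(ip):
    """Count one request for the given IP address (illustrative helper)."""
    # The key has to be initialised before it can be incremented.
    if ip not in ip_counts:
        ip_counts[ip] = 0
    ip_counts[ip] += 1

for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    record_request(ip)

print(ip_counts)  # {'10.0.0.1': 2, '10.0.0.2': 1}
```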
But by implementing our own custom dict class that utilizes the __missing__ method, we can simplify the counter increment portion of our code:
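A minimal sketch of such a class follows. The body shown here is an assumption about how IPCounter could be written, and the sample addresses are hypothetical:

```python
class IPCounter(dict):
    """A dict whose missing keys read as 0 (illustrative sketch)."""

    def __missing__(self, key):
        # dict.__getitem__ calls this when `key` is not present.
        # Returning 0 lets `counter[key] += 1` work on first use.
        return 0

ip_counts = IPCounter()
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    ip_counts[ip] += 1  # no explicit initialisation needed

print(ip_counts)  # {'10.0.0.1': 2, '10.0.0.2': 1}

# Reading an unseen key returns 0 but does not store it in this sketch.
unseen = ip_counts["192.0.2.9"]
print(unseen, "192.0.2.9" in ip_counts)  # 0 False
```

Note that __missing__ is only consulted for d[key] lookups; get and the in operator behave exactly as they do on a plain dict.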
That looks a bit more straightforward, doesn't it?
Wouldn't it be nice, though, if we didn't have to build our own custom dict classes for this behavior? Well, Python agrees with you, and that's where collections.defaultdict() comes in.
collections.defaultdict() is simply a subclass of dict that has its own __missing__ behavior preconfigured for you. By passing in a default factory as the first parameter, you can specify how the dict should behave if a specified key is not found.
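To make the role of the default factory concrete, here is a small sketch. The regions mapping and its labels are hypothetical, and the lambda is just one possible factory; any callable that takes no arguments works:

```python
from collections import defaultdict

# Hypothetical mapping of IP address -> region label.
regions = defaultdict(lambda: "unknown")
regions["10.0.0.1"] = "eu-west"

known = regions["10.0.0.1"]    # existing key: stored value is returned
guessed = regions["10.0.0.2"]  # missing key: factory is called with no arguments

print(known, guessed)          # eu-west unknown

# Unlike a bare __missing__ that only returns a value, defaultdict
# also stores the factory's result under the missing key.
print("10.0.0.2" in regions)   # True

# get() bypasses the factory entirely and inserts nothing.
print(regions.get("10.0.0.3")) # None
```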
For example, in the above ip_counts code we could simply use collections.defaultdict(int) instead of creating our own IPCounter class, but let's try something slightly more interesting. Continuing with the web service example, let's assume that you want to use a dict to store a list of request timestamps for each IP address. The traditional way to do this could look something like:
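A possible version of this traditional approach is sketched here. The names ip_timestamps and record_request, and the fixed timestamp values, are hypothetical and chosen only for illustration:

```python
ip_timestamps = {}

def record_request(ip, timestamp):
    """Remember when a request from `ip` arrived (illustrative helper)."""
    # An empty list has to be created before the first append.
    if ip not in ip_timestamps:
        ip_timestamps[ip] = []
    ip_timestamps[ip].append(timestamp)

record_request("10.0.0.1", 1700000000.0)
record_request("10.0.0.2", 1700000003.2)
record_request("10.0.0.1", 1700000007.5)

print(ip_timestamps["10.0.0.1"])  # [1700000000.0, 1700000007.5]
```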
Using collections.defaultdict(), however, we can give our dict a list default factory for when it encounters a missing key, and simplify the above to:
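A sketch of the simplified version, keeping the same hypothetical names and sample timestamps as before:

```python
from collections import defaultdict

# A missing key is filled in with the result of list(), i.e. an empty list.
ip_timestamps = defaultdict(list)

def record_request(ip, timestamp):
    """Remember when a request from `ip` arrived (illustrative helper)."""
    ip_timestamps[ip].append(timestamp)

record_request("10.0.0.1", 1700000000.0)
record_request("10.0.0.2", 1700000003.2)
record_request("10.0.0.1", 1700000007.5)

print(ip_timestamps["10.0.0.1"])  # [1700000000.0, 1700000007.5]
```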
Now our implementation is more concise and readable, with no need for the overhead of a new custom class.
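For comparison, the request counter mentioned earlier could be sketched with an int factory, since int() returns 0. The sample addresses are hypothetical:

```python
from collections import defaultdict

# int() returns 0, so every unseen IP address starts counting from zero.
ip_counts = defaultdict(int)

for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    ip_counts[ip] += 1

print(dict(ip_counts))  # {'10.0.0.1': 2, '10.0.0.2': 1}
```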
For more examples of its usage, check out the collections.defaultdict documentation.