Easily define JSON-convertible enum types in Python.
Working with real-world data, we often encounter values that are encoded as strings even though they are actually constrained to a fixed set of valid states. In these situations an enumerated type has the advantage of directly constraining the possible states of our value to the ones that are actually valid.
While this works great conceptually, as soon as we start working on more complex programs with data IO needs, enums can be quite difficult to work with, as we will need to convert them each time we interact with a storage medium.
If your needs are simple, storing data as JSON seems straightforward, but the following Python script will fail:
# example.py
import json
import enum

class Color(enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"

raw_a = "red"
a = Color(raw_a)
with open("test.json", "w") as f:
    json.dump(a, f)
$ python example.py
Traceback (most recent call last):
  File "example.py", line 12, in <module>
    json.dump(a, f)
  File "/usr/local/Cellar/python@2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 189, in dump
    for chunk in iterable:
  File "/usr/local/Cellar/python@2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 442, in _iterencode
    o = _default(o)
  File "/usr/local/Cellar/python@2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <Color.RED: 1> is not JSON serializable
Our custom enum type unfortunately is not JSON serializable! Now this is understandable, as the json module cannot know how we would like to encode the given object as a JSON representation.
Now we could either define a custom JSON encoder, which defines the needed conversion, or alternatively we could use multiple inheritance.
This multiple inheritance approach was first described by Justin Carter on Stack Overflow. In this post I attempt to provide a more in-depth explanation of why it works and whether this approach has limitations compared to doing it properly by defining an encoder.
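For comparison, the encoder route could look roughly like the following. This is a minimal sketch rather than a definitive implementation: the class name EnumEncoder is made up for illustration, and it assumes we want every enum member to be written as its value.

```python
import enum
import json


class Color(enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"


class EnumEncoder(json.JSONEncoder):
    """Hypothetical encoder that writes enum members as their values."""

    def default(self, o):
        # default() is only called for objects json cannot serialize itself.
        if isinstance(o, enum.Enum):
            return o.value
        # Let the base class raise the usual TypeError for anything else.
        return super().default(o)


encoded = json.dumps({"color": Color.RED}, cls=EnumEncoder)
print(encoded)
```

The drawback of this route is that every call to json.dump or json.dumps has to pass cls=EnumEncoder.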
Multiple inheritance fix
Instead of just inheriting from enum.Enum in our Color enum, we will first inherit from str.
# example_mi.py
import json
import enum

class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"

raw_a = "red"
a = Color(raw_a)
with open("test.json", "w") as f:
    json.dump(a, f)
$ python3 example_mi.py
$ cat test.json
"red"
Success! This works as intended: json.dump stores our enum member as its string value.
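Reading the value back is the other half of the round trip. The sketch below uses json.dumps and json.loads on in-memory strings instead of a file, an assumption made only to keep the example self-contained. JSON has no notion of enums, so loading yields a plain string that has to be passed to Color again.

```python
import enum
import json


class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"


encoded = json.dumps(Color.RED)   # serialize without a custom encoder
decoded = json.loads(encoded)     # comes back as a plain str
restored = Color(decoded)         # look the member up by its value again

print(encoded, type(decoded).__name__, restored is Color.RED)
```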
But if anything, this solution should surprise you. Is it safe to use multiple-inheritance here? Could this usage break something else we are doing?
First, notice that writing the bases in the reversed order enum.Enum, str instead of str, enum.Enum will produce an error:
$ python3 example_mi_wmro.py
Traceback (most recent call last):
  File "/Users/max/Code/arsbrevis/code-samples/python_enum_json/example_mi_wmro.py", line 5, in <module>
    class Color(enum.Enum, str):
  File "/usr/local/Cellar/python@3.9/3.9.0_3/Frameworks/Python.framework/Versions/3.9/lib/python3.9/enum.py", line 131, in __prepare__
    member_type, first_enum = metacls._get_mixins_(cls, bases)
  File "/usr/local/Cellar/python@3.9/3.9.0_3/Frameworks/Python.framework/Versions/3.9/lib/python3.9/enum.py", line 521, in _get_mixins_
    raise TypeError("new enumerations should be created as "
TypeError: new enumerations should be created as `EnumName([mixin_type, ...] [data_type,] enum_type)`
But why do we get an error here? This has everything to do with how multiple inheritance works in Python and how the Python enum class works in particular.
When we use multiple inheritance in Python, attributes are searched in the base classes from left to right. For example, with str, enum.Enum any attribute that exists in str is taken from there, and enum.Enum is not searched for it. So why does the converse not work?
Reading the Python 3 documentation on enums provides us with the answer: the enum class contains some special behavior for multiple inheritance, which is implemented via a metaclass. This metaclass expects the last class in our list of bases to be an enum class.
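Both points can be checked in a few lines. The following is a small sketch: the names BrokenColor, mro_names and reversed_error are only illustrative, and the exact error message may differ between Python versions.

```python
import enum


class Color(str, enum.Enum):
    RED = "red"


# Method resolution order: str comes before enum.Enum.
mro_names = [cls.__name__ for cls in Color.__mro__]
print(mro_names)

# Reversing the bases is rejected by the enum metaclass at class creation.
try:
    class BrokenColor(enum.Enum, str):
        RED = "red"
except TypeError as exc:
    reversed_error = exc
    print("TypeError:", exc)
else:
    reversed_error = None
```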
When we use str, enum.Enum we are actually creating a derived enum whose members also fully work as instances of the additional data type. This is an expected usage of the enum class and as such fully within the scope of the library. In fact, the documentation itself gives str, enum.Enum as a possible use case for derived enumerations.
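To make "works as the additional data type" concrete, here is a short sketch that checks a member against both types. The variable names are only for illustration.

```python
import enum


class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"


member = Color.RED

is_enum = isinstance(member, enum.Enum)   # still an enum member
is_str = isinstance(member, str)          # and also a string
equals_raw = member == "red"              # compares equal to the raw value
shouted = member.upper()                  # str methods are available

print(is_enum, is_str, equals_raw, shouted)
```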
JSON encoding
Now how does this mesh with the JSON encoding itself? The JSON encoder simply does an isinstance check against the str class, so our enum member is handled as a string. The encoder then writes out the underlying string data of the member rather than calling its str or repr methods (which, for a str, enum.Enum class, still return the enum-style representation such as Color.RED), so the actual encoding works as we would expect.
This can be seen in the pure-Python implementation of the encoder.
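Because the check happens for every value the encoder visits, members nested inside containers are handled the same way. A minimal sketch, in which PlainColor and the payload are made up for illustration, contrasts this with a plain enum that lacks the str base:

```python
import enum
import json


class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"


class PlainColor(enum.Enum):
    RED = "red"


# Members nested in dicts and lists pass the encoder's str check as well.
payload = {"favorite": Color.RED, "others": [Color.GREEN]}
encoded = json.dumps(payload)
print(encoded)

# Without the str base, the same call is rejected.
try:
    json.dumps({"favorite": PlainColor.RED})
except TypeError as exc:
    plain_error = exc
    print("TypeError:", exc)
else:
    plain_error = None
```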