When working with real-world data, we often encounter values that are encoded as strings even though they are actually constrained to a fixed set of valid states. In these situations, an enumeration type has the advantage of directly restricting the possible states of our value to the ones that are actually valid.
While this works great conceptually, enums can become difficult to work with as soon as we write more complex programs with data I/O needs, because we have to convert them every time we interact with a storage medium.
If your needs are simple, storing data as JSON might seem straightforward, but the following Python script will fail:
```python
# example.py
import json
import enum

class Color(enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"

raw_a = "red"
a = Color(raw_a)

with open("test.json", "w") as f:
    json.dump(a, f)
```
```
$ python example.py
Traceback (most recent call last):
  File "example.py", line 12, in <module>
    json.dump(a, f)
  File "/usr/local/Cellar/python@2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 189, in dump
    for chunk in iterable:
  File "/usr/local/Cellar/python@2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 442, in _iterencode
    o = _default(o)
  File "/usr/local/Cellar/python@2/2.7.17_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <Color.RED: 1> is not JSON serializable
```
Unfortunately, our custom enum type is not JSON serializable! This is understandable, as the json module cannot know how we would like to represent the given object in JSON.
We could now either define a custom JSON encoder that defines the needed conversion, or alternatively use multiple inheritance.
This multiple inheritance approach was first described by Justin Carter on Stack Overflow. In this post I attempt to provide a more in-depth explanation of why it works and whether this approach has limitations compared to doing it properly by defining an encoder.
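For comparison, the encoder route could look roughly like the following. This is only a minimal sketch: the class name `EnumEncoder` is made up for illustration, and it assumes that storing the member's `value` is the conversion we want.

```python
import enum
import json


class Color(enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"


class EnumEncoder(json.JSONEncoder):
    """Hypothetical encoder that stores enum members as their values."""

    def default(self, o):
        # default() is only called for objects json cannot handle itself.
        if isinstance(o, enum.Enum):
            return o.value
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


encoded = json.dumps({"color": Color.RED}, cls=EnumEncoder)
print(encoded)  # {"color": "red"}

# Reading the data back still needs an explicit conversion step.
decoded = Color(json.loads(encoded)["color"])
print(decoded is Color.RED)  # True
```

The downside is that every call to `json.dump` or `json.dumps` has to pass `cls=EnumEncoder`, which is easy to forget.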
Multiple inheritance fix
Instead of just inheriting from enum.Enum in our Color enum, we will first inherit from str:
```python
# example_mi.py
import json
import enum

class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"
    YELLOW = "yellow"

raw_a = "red"
a = Color(raw_a)

with open("test.json", "w") as f:
    json.dump(a, f)
```
```
$ python3 example_mi.py
$ cat test.json
"red"
```
Success! This seems to work properly: json.dump stores our enum as its string value.
But if anything, this solution should surprise you. Is it safe to use multiple inheritance here? Could this usage break something else we are doing?
First, you should notice that writing the reversed enum.Enum, str instead of str, enum.Enum will produce an error:
```
$ python3 example_mi_wmro.py
Traceback (most recent call last):
  File "/Users/max/Code/arsbrevis/code-samples/python_enum_json/example_mi_wmro.py", line 5, in <module>
    class Color(enum.Enum, str):
  File "/usr/local/Cellar/python@3.9/3.9.0_3/Frameworks/Python.framework/Versions/3.9/lib/python3.9/enum.py", line 131, in __prepare__
    member_type, first_enum = metacls._get_mixins_(cls, bases)
  File "/usr/local/Cellar/python@3.9/3.9.0_3/Frameworks/Python.framework/Versions/3.9/lib/python3.9/enum.py", line 521, in _get_mixins_
    raise TypeError("new enumerations should be created as "
TypeError: new enumerations should be created as `EnumName([mixin_type, ...] [data_type,] enum_type)`
```
But why do we get an error here? It all has to do with how multiple inheritance works in Python and how the Python Enum class works in particular.
When we use multiple inheritance in Python, attributes are searched from left to right. For example, by specifying str, enum.Enum, any attribute that exists in str is found there, and we do not search further in enum.Enum. So why does the converse not work?
Reading the Python 3 documentation on enums provides us with the answer: the enum class contains some special behavior for multiple inheritance, which is implemented via a metaclass. This metaclass expects the last class in our list of bases to be an enum class.
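Both points can be checked in a few lines. The sketch below inspects the method resolution order of a cut-down `Color` enum and then tries the reversed base order; `BadColor`, `mro_names` and `reversed_error` are names made up for this illustration.

```python
import enum


class Color(str, enum.Enum):
    RED = "red"


# Attribute lookup follows the method resolution order (MRO).
# str is listed before Enum, so str attributes are found first.
mro_names = [cls.__name__ for cls in Color.__mro__]
print(mro_names)

# With the bases reversed, the enum metaclass rejects the class
# while it is being created, before any member exists.
try:
    class BadColor(enum.Enum, str):
        RED = "red"
except TypeError as exc:
    reversed_error = str(exc)
else:
    reversed_error = None

print(reversed_error)
```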
When we use str, enum.Enum, we are actually creating a derived enum whose members also fully work as values of the additional data type. This is an expected usage of the enum class and as such fully within the scope of the library. In fact, the documentation itself gives str, enum.Enum as a possible use case for derived enums.
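As a small illustration of what "works as the additional data type" means in practice, the following sketch uses a shortened `Color` enum; the variable names are chosen only for this example.

```python
import enum


class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"


member = Color("red")            # lookup by value returns the existing member
print(member is Color.RED)       # True
print(isinstance(member, str))   # True
print(member == "red")           # True, compared like a plain string
print(member.upper())            # RED

# Values outside the enum are still rejected, so we keep the
# constraint that made us reach for an enum in the first place.
invalid_rejected = False
try:
    Color("blue")
except ValueError:
    invalid_rejected = True
print(invalid_rejected)          # True
```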
Now how does this mesh with the JSON encoding itself? The JSON encoder simply does an isinstance check against the str class, so our enum is handled as a string. As the derived enum type automatically uses the str methods of its mixed-in data type, the actual encoding works as we would expect.
This can be seen in the Python implementation of the encoder.
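The effect of that check is easy to reproduce. The sketch below compares a str-derived enum with a plain one (`PlainColor` is a name made up for the comparison) and then reads the stored values back into enum members.

```python
import enum
import json


class Color(str, enum.Enum):
    RED = "red"
    GREEN = "green"


class PlainColor(enum.Enum):
    RED = "red"


# Only the str-derived enum passes the check the encoder relies on.
print(isinstance(Color.RED, str))       # True
print(isinstance(PlainColor.RED, str))  # False

payload = json.dumps([Color.RED, Color.GREEN])
print(payload)  # ["red", "green"]

# Loading yields plain strings, so members are rebuilt by value.
restored = [Color(value) for value in json.loads(payload)]
print(restored[0] is Color.RED)  # True
```

Note that only the writing side becomes implicit: json.loads still returns ordinary strings, so the conversion back to `Color` remains an explicit step.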