Python dataclass, what's a pythonic way to validate initialization arguments?

Question

What's a pythonic way to validate the init arguments before instantiation without overriding the dataclass's built-in __init__?

I thought perhaps leveraging the __new__ dunder-method would be appropriate?

from dataclasses import dataclass

@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __new__(cls, *args, **kwargs):
        instance: cls = super(MyClass, cls).__new__(cls, *args, **kwargs)
        if instance.is_good:
            assert not instance.is_bad
        return instance

Have you considered simply preventing the caller from specifying values for both fields? For example, set is_bad: bool = field(init=False), then set self.is_bad = not self.is_good in __post_init__.
– chepner, Commented Feb 12, 2020 at 1:46
Thanks @chepner, field(init=False) is also a very cool idea.
– tgk, Commented Feb 12, 2020 at 4:22
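
A minimal sketch of the field(init=False) idea from the comment above. It assumes is_bad should always be the negation of is_good, which is chepner's example rather than something the question states:

```python
from dataclasses import dataclass, field


@dataclass
class MyClass:
    is_good: bool = False
    # init=False keeps is_bad out of the generated __init__ signature,
    # so callers cannot pass a conflicting value.
    is_bad: bool = field(init=False)

    def __post_init__(self):
        # Derive is_bad from is_good instead of validating it.
        self.is_bad = not self.is_good
```

With this layout, MyClass(is_good=True, is_bad=True) fails with a TypeError because is_bad is no longer an accepted argument.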

ShadowRanger · Accepted Answer · 2020-02-12 01:30:04Z

65

Define a __post_init__ method on the class; the generated __init__ will call it if defined:

from dataclasses import dataclass

@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __post_init__(self):
        if self.is_good:
            assert not self.is_bad

This also works when dataclasses.replace is used to make a new instance, because replace goes through __init__ and therefore calls __post_init__ again.
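
As a rough illustration of the replace behavior, here is a variant of the class above. It swaps the assert for an explicit ValueError (an assumption on my part, so the check still runs under python -O); the error message is made up for the example:

```python
from dataclasses import dataclass, replace


@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __post_init__(self):
        # Explicit exception instead of assert, so -O does not strip it.
        if self.is_good and self.is_bad:
            raise ValueError("is_good and is_bad cannot both be True")


ok = MyClass(is_good=True)

# replace() builds the new instance through __init__, so __post_init__
# runs again and rejects the invalid combination.
try:
    replace(ok, is_bad=True)
except ValueError as exc:
    print(f"rejected: {exc}")
```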

answered Feb 12, 2020 at 1:30

ShadowRanger


2 Comments

uetoyo Over a year ago

Don't use assert in production code. Assert statements are stripped when Python is run with the -O flag. Raise an exception instead.

ShadowRanger Over a year ago

@uetoyo: Depends on the circumstances. If the assert is purely for verifying a true "can't be false unless the code is wrong" condition (rather than "can be false when runtime data is wrong"), then using assert to validate during testing while avoiding the runtime overhead in production is reasonable (there's a reason assert exists, after all). The OP chose to use assert; I just mirrored it for lack of information to say it was definitely wrong, so your comment is really better directed at them. Of course, to actually avoid the runtime overhead, the if should be if __debug__ and self.is_good:
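
The __debug__ guard mentioned in the comment above could look roughly like this. The assertion message is added for illustration only:

```python
from dataclasses import dataclass


@dataclass
class MyClass:
    is_good: bool = False
    is_bad: bool = False

    def __post_init__(self):
        # __debug__ is True in a normal run and False under python -O.
        # Guarding on it skips the attribute lookup as well as the assert
        # when assertions are disabled.
        if __debug__ and self.is_good:
            assert not self.is_bad, "is_good and is_bad are mutually exclusive"
```

This only suits checks for programmer errors; if bad values can arrive from runtime data, raising an exception unconditionally is the safer choice.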

Arne · Accepted Answer · 2023-03-30 15:47:28Z

The author of the dataclasses module made a conscious decision not to implement the validators that are present in similar third-party projects like attrs, pydantic, or marshmallow. If your actual problem is within the scope of the one you posted, then doing the validation in __post_init__ is completely fine.

But if you have more complex validation procedures, or play with stuff like inheritance, you might want to use one of the more powerful libraries I mentioned instead of dataclass. Just to have something to look at, this is what your example could look like using pydantic (shown with the pydantic 1.x validator decorator; pydantic 2 deprecates it in favor of field_validator):

>>> from pydantic import BaseModel, validator
>>> class MyClass(BaseModel):
...     is_good: bool = False
...     is_bad: bool = False
...
...     @validator('is_bad')
...     def check_something(cls, v, values):
...         if values['is_good'] and v:
...             raise ValueError("Can not be both good and bad now, can it?")
...         return v
...     
>>> MyClass(is_good=True, is_bad=False)  # this would be a valid instance
MyClass(is_good=True, is_bad=False)
>>> MyClass(is_good=True, is_bad=True)   # this wouldn't
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "pydantic/main.py", line 283, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for MyClass
is_bad
  Can not be both good and bad now, can it? (type=value_error)

Evgeniy_Burdin · Accepted Answer · 2020-05-26 08:27:44Z

You can try ValidatedDC, which checks each field's value against its type annotation:

from dataclasses import dataclass

from validated_dc import ValidatedDC


@dataclass
class MyClass(ValidatedDC):
    is_good: bool = False
    is_bad: bool = False


instance = MyClass()
assert instance.get_errors() is None
assert instance == MyClass(is_good=False, is_bad=False)

data = {'is_good': True, 'is_bad': True}
instance = MyClass(**data)
assert instance.get_errors() is None

data = {'is_good': 'bad_value', 'is_bad': True}
instance = MyClass(**data)
assert instance.get_errors()
print(instance.get_errors())
# {'is_good': [BasicValidationError(value_repr='bad_value', value_type=<class 'str'>, annotation=<class 'bool'>, exception=None)]}

# fix
instance.is_good = True
assert instance.is_valid()
assert instance.get_errors() is None

ValidatedDC: https://github.com/EvgeniyBurdin/validated_dc
