I want to build a simple "DataSource" class that has two attributes, "data_source_type" and "data_source_path". "data_source_type" is an Enum and "data_source_path" is a string. Depending on "data_source_type", I want to apply the appropriate validation, such as "ValidFilePath" or "ValidHttpURL", to "data_source_path".

I don't want to write if-else chains and end up with a spaghetti data-source class. I want to leverage Python descriptors, or any other elegant Python construct that respects the SRP (Single Responsibility Principle) and supports a functional programming style.

data_source.py

1 import re
2 from enum import Enum
3 import os
4
5
6 class ValidFilePath(object):
7     def __set__(self, obj, val):
8         if not os.path.exists(val):
9             raise ValueError("Please enter a valid file path")
10         self.__url = val
11
12     def __get__(self, obj, objtype):
13         return self.__url
14
15
16 class ValidHttpURL(object):
17     def __set__(self, obj, val):
18         if (val is None or re.compile(
19                 r'^https?://'  # http:// or https://
20                 r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+[A-Z]{2,6}\.?|'  # domain...
21                 r'localhost|'  # localhost...
22                 r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'  # ...or ip
23                 r'(?::\d+)?'  # optional port
24                 r'(?:/?|[/?]\S+)$', re.IGNORECASE).search(val) is None):
25             raise ValueError("Please set an valid HTTP(S) URL")
26         self.__url = val
27
28     def __get__(self, obj, objtype):
29         return self.__url
30
31
32 class DataSourceType(Enum):
33     HTTP = 100
34     LOCAL_FILE = 200
35     HDFS_FILE = 300
36
37
38 class ValidDataSourceType(object):
39     def __set__(self, obj, val):
40         if val not in DataSourceType.__members__:
41             raise ValueError("Please set a valid Data Source Type Enum, "
42                              "possible values are -> {}".format([e.name for e in DataSourceType]))
43         self.__data_source_type = DataSourceType[val]
44
45     def __get__(self, obj, objtype):
46         return self.__data_source_type
47
48
49 class DataSource(object):
50     data_source_type = ValidDataSourceType()
51     data_source_path = ValidHttpURL()

On line 51 I have hard-coded "ValidHttpURL" for now; what I want instead is to apply the appropriate validation descriptor depending on "data_source_type".

Expected behavior

ds1 = DataSource()
ds1.data_source_type = 'HTTP'
ds1.data_source_path = 'http://www.google.com'
ds2 = DataSource()
ds2.data_source_type = 'LOCAL_FILE'
ds2.data_source_path = '/var/www/index.html'
print("All is well")

Actual behavior

ds1 = DataSource()
ds1.data_source_type = 'HTTP'
ds1.data_source_path = 'http://www.google.com'
ds2 = DataSource()
ds2.data_source_type = 'LOCAL_FILE'
ds2.data_source_path = '/var/www/index.html'

**ValueError: Please set an valid HTTP(S) URL**

**UPDATED ANSWER**

  1 import os
  2 import re
  3 from enum import Enum
  4 from weakref import WeakKeyDictionary
  5
  6
  7 def valid_file_path(value):
  8     if not os.path.exists(value):
  9         raise ValueError("{} is not present. Please make sure the file exists".format(value))
 10     return value
 11
 12
 13 def valid_http_url(value):
 14     if (value is None or re.compile(
 15             r'^https?://'  # http:// or https://
 16             r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+[A-Z]{2,6}\.?|'  # domain...
 17             r'localhost|'  # localhost...
 18             r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})'  # ...or ip
 19             r'(?::\d+)?'  # optional port
 20             r'(?:/?|[/?]\S+)$', re.IGNORECASE).search(value) is None):
 21         raise ValueError("Please set an valid HTTP(S) URL")
 22     return value
 23
 24
 25 class DataSourceType(Enum):
 26     NOT_DEFINED = (0, None)
 27     HTTP = (100, valid_http_url)
 28     LOCAL_FILE = (200, valid_file_path)
 29
 30     def __init__(self, enum_id, enum_validator):
 31         self._id = enum_id
 32         self._validator = enum_validator
 33
 34     @property
 35     def validator(self):
 36         return self._validator
 37
 38
 39 class ValidDataSourceType(object):
 40     def __init__(self):
 41         self.default = DataSourceType.NOT_DEFINED
 42         self.values = WeakKeyDictionary()
 43
 44     def __get__(self, instance, owner):
 45         return self.values.get(instance, self.default)
 46
 47     def __set__(self, instance, value):
 48         if value not in DataSourceType.__members__:
 49             raise ValueError("Please set a valid Data Source Type Enum, "
 50                              "possible values are -> {}".format([e.name for e in DataSourceType]))
 51         self.values[instance] = DataSourceType[value]
 52
 53     def __delete__(self, instance):
 54         del self.values[instance]
 55
 56
 57 class ValidDataSourcePath(object):
 58     def __init__(self, default_data_source_type_field='data_source_type'):
 59         self._default = ''
 60         self._default_data_source_type_field = default_data_source_type_field
 61         self.values = WeakKeyDictionary()
 62
 63     def __get__(self, instance, owner):
 64         return self.values.get(instance, self._default)
 65
 66     def __set__(self, instance, *value):
 67         data_source_type_field = self._default_data_source_type_field
 68         value_to_set = None
 69
 70         if value and len(value) == 1 and isinstance(value[0], str):  # user sent only the value
 71             value_to_set = value[0]
 72         if value and len(value) == 1 and isinstance(value[0], tuple):  # user sent the value , and the validation field
 73             value_to_set = value[0][0]
 74             data_source_type_field = value[0][1]
 75
 76         _data_source_type = getattr(instance, data_source_type_field, None)
 77         if getattr(_data_source_type, 'validator', None) is None:
 78             raise ValueError("A valid source path depends on the data source type; "
 79                              "please set the '{}' attribute first".format(data_source_type_field))
 80         _data_source_type.validator(value_to_set)
 81         self.values[instance] = value_to_set
 82
 83
 84 class DataSource(object):
 85     data_source_type = ValidDataSourceType()
 86     data_source_path = ValidDataSourcePath()
 87
 88
 89 class SomeOtherDomainModel(object):
 90     data_source_type_ext = ValidDataSourceType()
 91     data_source_path = ValidDataSourcePath()
 92
 93
 94 print(" **************** Scenario 1 - Start **************** ")
 95 ds1 = DataSource()
 96 ds1.data_source_type = 'HTTP'
 97 ds1.data_source_path = "http://www.google.com"
 98 print(ds1.data_source_path)
 99 print(" **************** Scenario 1 - End **************** ")
100
101 print(" **************** Scenario 2 - Start **************** ")
102 ds2 = SomeOtherDomainModel()
103 ds2.data_source_type_ext = 'HTTP'
104 ds2.data_source_path = ("http://www.yahoo.com", 'data_source_type_ext')
105 print(ds2.data_source_path)
106 print(" **************** Scenario 2 - End **************** ")
  • Basics of my answer (I'll post a full reply when I get the chance): 1) The source type object should contain the name and the validator in one. Enums do allow for this in Python (docs.python.org/3/library/enum.html#planet). 2) Make the source path a simple property that runs through the validator stored in the source type. Commented Dec 4, 2017 at 21:51
  • @jacob-zimmerman Updated as per your suggestion, but I am facing an issue at lines 62 and 63. I am not able to associate a descriptor from the Enum at line 71 because it is not created yet. Commented Dec 5, 2017 at 20:43
  • No, I would write an actual @property instead that uses the data_source_type. If you'd really like to use a custom descriptor, have ValidDataSourcePath assume obj has a data_source_type and use that instead of storing it on itself. It's hacky, but I can't think of anything else. Commented Dec 5, 2017 at 21:05
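
The custom-descriptor variant described in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: the names `ValidatedPath`, `starts_with_http` and `is_absolute_path_like` are hypothetical, the two predicates are deliberately simplistic stand-ins for real validation, and `__set_name__` requires Python 3.6 or later. The descriptor looks up the sibling type attribute on the instance and stores the path on the instance as well, so nothing is shared between instances.

```python
from enum import Enum


def starts_with_http(path):
    # Stand-in predicate (assumption): swap in real URL validation.
    return isinstance(path, str) and path.lower().startswith(("http://", "https://"))


def is_absolute_path_like(path):
    # Stand-in predicate (assumption): only checks the leading slash.
    return isinstance(path, str) and path.startswith("/")


class DataSourceType(Enum):
    # Each member carries an id and the predicate that validates its paths.
    HTTP = (100, starts_with_http)
    LOCAL_FILE = (200, is_absolute_path_like)

    def __init__(self, type_id, validator):
        self.type_id = type_id
        self.validator = validator


class ValidatedPath(object):
    """Descriptor that validates a path against a sibling type attribute."""

    def __init__(self, type_field="data_source_type"):
        self.type_field = type_field

    def __set_name__(self, owner, name):
        # Called once at class creation; the value lives on the instance.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage_name, None)

    def __set__(self, instance, path):
        source_type = getattr(instance, self.type_field, None)
        if not isinstance(source_type, DataSourceType):
            raise ValueError("Set {!r} before assigning a path".format(self.type_field))
        if not source_type.validator(path):
            raise ValueError("{!r} is not valid for {}".format(path, source_type.name))
        setattr(instance, self.storage_name, path)


class DataSource(object):
    data_source_path = ValidatedPath()

    def __init__(self, data_source_type, data_source_path):
        self.data_source_type = data_source_type  # a DataSourceType member
        self.data_source_path = data_source_path
```

The coupling the comment calls hacky is still there: the descriptor has to know the name of the attribute that holds the type. Passing `type_field` to the constructor at least makes that dependency explicit.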

1 Answer

So, building off my comments above, here's how DataSource would look using @property. The idea is that the validators should just be functions that return a boolean indicating whether the path is valid (or raise an error if they prefer), instead of being more descriptors:

class DataSource(object):
    data_source_type = ValidDataSourceType()

    @property
    def data_source_path(self):
        # can put in a check to make sure _data_source_path exists
        return self._data_source_path

    @data_source_path.setter
    def data_source_path(self, path):
        if self.data_source_type.validator(path):
            self._data_source_path = path

Descriptors can be difficult to work with and around (I should know; I literally wrote the book), and should be avoided when a simpler solution can be found, so that's why I turned your validators into predicate functions. There is also no shame in using @property instead of a custom descriptor.
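
For completeness, here is a minimal self-contained sketch of the same idea that runs on its own. It makes a few assumptions that are not in the code above: the type is also exposed through a plain @property (rather than the ValidDataSourceType descriptor), the predicate names `is_http_url` and `is_existing_path` are hypothetical, and the URL pattern is intentionally much looser than the regex in the question.

```python
import os
import re
from enum import Enum

# Simplified pattern (assumption); the regex from the question is stricter.
_HTTP_URL = re.compile(r"^https?://\S+$", re.IGNORECASE)


def is_http_url(path):
    """Predicate: True when path looks like an http(s) URL."""
    return isinstance(path, str) and _HTTP_URL.match(path) is not None


def is_existing_path(path):
    """Predicate: True when path exists on the local file system."""
    return isinstance(path, str) and os.path.exists(path)


class DataSourceType(Enum):
    # Each member bundles an id with the predicate for its paths.
    HTTP = (100, is_http_url)
    LOCAL_FILE = (200, is_existing_path)

    def __init__(self, type_id, validator):
        self.type_id = type_id
        self.validator = validator


class DataSource(object):
    def __init__(self, data_source_type, data_source_path):
        # Order matters: the path setter reads the type.
        self.data_source_type = data_source_type
        self.data_source_path = data_source_path

    @property
    def data_source_type(self):
        return self._data_source_type

    @data_source_type.setter
    def data_source_type(self, name):
        try:
            self._data_source_type = DataSourceType[name]
        except KeyError:
            valid = [member.name for member in DataSourceType]
            raise ValueError("Unknown data source type {!r}; expected one of {}".format(name, valid)) from None

    @property
    def data_source_path(self):
        return self._data_source_path

    @data_source_path.setter
    def data_source_path(self, path):
        if not self.data_source_type.validator(path):
            raise ValueError("{!r} is not valid for {}".format(path, self.data_source_type.name))
        self._data_source_path = path
```

With this shape, adding a new source type means adding one predicate function and one Enum member; the DataSource class itself contains no if-else on the type.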

2 Comments

I did not want to build a direct dependency on properties. The scenario was more that the validation can be provided by an instance member at runtime, through a property named validator. I got it working and updated the answer in the question; please let me know if you have any recommendations or suggestions.
Well, as you can see, there has to be some sort of connection between the two because they work together. Your solution is probably about the best you can do.
