In Python development, you may have come across the concept of a “descriptor”. Because we rarely use descriptors directly, many developers do not understand how they work. Once you are comfortable with Python and want to go further, it is worth learning how descriptors work, since this also helps you understand Python’s design at a deeper level.
Descriptors were introduced in Python 2.2. They are typically used to implement the low-level machinery of the object system, including bound and unbound methods, class methods, and static methods.
Python allows a class attribute to be managed by another class; an attribute managed this way is a descriptor.
For example:
class student:
    name = 'John'

print(f'Name:{student.name}')
We define a student class with a class attribute name. Besides assigning a value directly, we can also define a class attribute through another class.
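class student_info:
    def __get__(self, obj, objtype=None):
        # obj is the instance (None when accessed on the class);
        # objtype is the class that owns the attribute
        self.name = 'John'
        return self.name

class student:
    name = student_info()

print(f'Name:{student.name}')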
In the above example, name is no longer a plain value but an instance of a class. student_info defines a __get__ method, and that method returns the actual value. The attribute’s value is therefore produced by a method rather than stored directly, so descriptors make it easy to change how a class attribute is defined.
For a class attribute to be managed by a class, that class cannot define its methods arbitrarily; it must comply with the “descriptor protocol”.
The descriptor protocol specifies what happens when an attribute is referenced. Python translates attribute access operations into method calls, and the protocol determines how this translation is done. With the descriptor protocol we can, for example, implement something similar to private variables in Python.
The descriptor protocol includes several methods:
__get__(self, obj, type=None)
Used to read the attribute. It returns the attribute’s value. If the value is invalid, it can raise an appropriate exception such as ValueError; if the attribute does not exist, it should raise AttributeError.
__set__(self, obj, value)
Used to set the attribute’s value; returns None.
__delete__(self, obj)
Controls deletion of the attribute; returns None.
Any object that defines at least one of the above methods implements the descriptor protocol and is therefore a descriptor.
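As a rough illustration of the three protocol methods, the sketch below defines a descriptor that records each call. The names Logged, Point and calls are made up for this example, and storing the value under the key "_x" in the instance dictionary is just one possible choice.

```python
calls = []  # hypothetical log of which protocol method ran


class Logged:
    """Toy descriptor implementing all three protocol methods."""

    def __get__(self, obj, objtype=None):
        calls.append("get")
        if obj is None:  # accessed on the class, not on an instance
            return self
        try:
            return obj.__dict__["_x"]
        except KeyError:
            raise AttributeError("x has not been set") from None

    def __set__(self, obj, value):
        calls.append("set")
        obj.__dict__["_x"] = value

    def __delete__(self, obj):
        calls.append("delete")
        obj.__dict__.pop("_x", None)


class Point:
    x = Logged()


p = Point()
p.x = 3      # triggers Logged.__set__
value = p.x  # triggers Logged.__get__
del p.x      # triggers Logged.__delete__
print(calls, value)
```

Each statement on p.x is routed to the matching method of Logged, in the order the statements are written.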
We create a descriptor example that contains __get__ and __set__ methods.
class Age:
    def __init__(self, age=16):
        # Note: the value is stored on the descriptor itself,
        # so it is shared by every Student instance.
        self.age = age

    def __set__(self, obj, age):
        if not 10 <= age <= 20:
            raise ValueError('Valid age must be in [10, 20]')
        self.age = age

    def __get__(self, obj, type=None):
        return self.age

class Student:
    age = Age()

    def __init__(self, name, age):
        self.name = name
        self.age = age  # goes through Age.__set__
        print(f'Student Name: {self.name}, Age:{self.age}')

S1 = Student("John", 13)
S2 = Student("Mike", 23)  # raises ValueError: Valid age must be in [10, 20]
In the above example the class attribute age is a descriptor whose value is managed by the Age class. Whenever we read or modify the age attribute, the __get__ and __set__ methods of Age are called.
The Age class is a descriptor. When the age attribute is accessed from an instance of Student, these two special methods of Age are called. This also lets you validate the data (field type, numeric range, etc.).
When S1.age is read, __get__ is called and the parameter obj is the Student instance;
When Student.age is read, __get__ is called and the parameter obj is None;
When S1.age is set to 13, __set__ is called, the parameter obj is the Student instance, and the value is 13;
When S2.age is set to 23, __set__ fails validation and raises “ValueError: Valid age must be in [10, 20]”.
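One thing to watch in the Age example: the value is stored on the descriptor itself, and there is only one Age object per class, so all Student instances end up sharing it. A possible variant, assuming Python 3.6 or later for __set_name__, keeps the value in each instance’s __dict__ instead. The names PerInstanceAge and Pupil are hypothetical and only used for this sketch.

```python
class PerInstanceAge:
    def __set_name__(self, owner, name):
        # Called once when the owning class is created; remember a private key.
        self.storage_name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__[self.storage_name]

    def __set__(self, obj, age):
        if not 10 <= age <= 20:
            raise ValueError("Valid age must be in [10, 20]")
        obj.__dict__[self.storage_name] = age


class Pupil:
    age = PerInstanceAge()

    def __init__(self, name, age):
        self.name = name
        self.age = age  # validated by PerInstanceAge.__set__


a = Pupil("John", 13)
b = Pupil("Mike", 18)
print(a.age, b.age)  # each instance keeps its own value
```

The validation logic is unchanged; only the place where the value lives differs.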
Why does __get__ receive different arguments depending on whether the attribute is accessed through the class or through an instance? To answer this we need to understand how descriptors work.
The descriptor is triggered when we access an attribute. When we access the attribute a of the object obj, the process is roughly: Python looks up a in the dictionary of the class of obj (and its base classes); if the object found there defines a __get__() method, Python calls a.__get__(obj, type(obj)). In short, the attribute access is translated into a method call.
The exact translation depends on whether we access the attribute through an instance or through the class.
Access through an instance
The key to the translation lies in the __getattribute__ method of the base class object. The call is equivalent to:
type(obj).__dict__['a'].__get__(obj, type(obj))
Access through the class
The key to the translation lies in the __getattribute__ method of the metaclass type. The call is equivalent to:
cls.__dict__['a'].__get__(None, cls)
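The two translations can be checked by hand. In this small sketch (Echo and A are illustrative names), the descriptor simply returns the arguments it receives, so ordinary attribute access can be compared with the explicit call.

```python
class Echo:
    def __get__(self, obj, objtype=None):
        # Return both arguments so we can see what Python passed in.
        return (obj, objtype)


class A:
    a = Echo()


inst = A()

# Access through an instance
via_instance = inst.a
manual_instance = type(inst).__dict__["a"].__get__(inst, type(inst))

# Access through the class
via_class = A.a
manual_class = A.__dict__["a"].__get__(None, A)

print(via_instance == manual_instance)
print(via_class == manual_class)
```

Through the instance, __get__ receives the instance and its class; through the class, it receives None and the class.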
In either case, Python follows the same overall logic:
First call __getattribute__ to try to get the result;
If __getattribute__ raises AttributeError and the class defines __getattr__, call __getattr__.
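A minimal sketch of this fallback, using a made-up Fallback class: __getattr__ is only consulted for names that the normal lookup cannot find.

```python
class Fallback:
    present = "found normally"

    def __getattr__(self, name):
        # Only runs after __getattribute__ has raised AttributeError.
        return f"fallback for {name}"


f = Fallback()
print(f.present)  # found by __getattribute__, so __getattr__ is not called
print(f.missing)  # not found anywhere, so __getattr__ supplies the value
```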
This brings us to descriptor priority, which is introduced next.
Descriptors can be divided into data descriptors and non-data descriptors.
Data descriptor: defines __set__() or __delete__(), usually together with __get__().
Non-data descriptor: defines only the __get__() method.
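The standard library’s inspect.isdatadescriptor can be used to check which kind an object is. The classes OnlyGet and GetAndSet below are hypothetical; GetAndSet defines both __get__ and __set__ so that the check behaves the same way on older Python 3 versions.

```python
import inspect


class OnlyGet:
    def __get__(self, obj, objtype=None):
        return "non-data"


class GetAndSet:
    def __get__(self, obj, objtype=None):
        return "data"

    def __set__(self, obj, value):
        pass


only_get_is_data = inspect.isdatadescriptor(OnlyGet())
get_and_set_is_data = inspect.isdatadescriptor(GetAndSet())
property_is_data = inspect.isdatadescriptor(property())

print(only_get_is_data, get_and_set_is_data, property_is_data)
```

Note that the built-in property counts as a data descriptor, which is why a property cannot be shadowed by an instance attribute of the same name.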
Let’s walk through an example to illustrate the difference between them. First, an example of a non-data descriptor.
# Non-data descriptor
class Non_Data_Des:
    def __init__(self, default='student'):
        self.name = default

    def __get__(self, obj, type=None):
        print("call non-data descriptor")
        return self.name

class Student:
    name = Non_Data_Des()

    def __init__(self, name):
        # No __set__ on the descriptor, so this goes into the instance __dict__
        self.name = name

    def __getattribute__(self, item):
        print("call __getattribute__")
        return super(Student, self).__getattribute__(item)

S1 = Student("Mike")
print(S1.name)
The output is:
call __getattribute__
Mike
As we can see, every attribute lookup on an object starts with __getattribute__. Inside __getattribute__, Python checks whether the class attribute is a descriptor and, where the lookup rules call for it, calls its __get__ method.
In the above example, the descriptor class defines only __get__, so it is a non-data descriptor. When an instance attribute and a non-data descriptor have the same name, the instance attribute is accessed first. Only if the name cannot be found in the instance is the value obtained from the non-data descriptor.
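Plain functions are a handy real-world case of a non-data descriptor: they define __get__ but not __set__, which is why an instance attribute with the same name shadows a method. A minimal sketch (the Greeter class is made up for illustration):

```python
class Greeter:
    def hello(self):
        return "hello from " + type(self).__name__


g = Greeter()
raw_function = Greeter.__dict__["hello"]  # the plain function stored in the class

# Functions define __get__ but not __set__, so they are non-data descriptors.
print(hasattr(raw_function, "__get__"), hasattr(raw_function, "__set__"))

# g.hello is roughly raw_function.__get__(g, Greeter): a bound method.
bound = raw_function.__get__(g, Greeter)
print(bound())

# An instance attribute with the same name takes priority over the function.
g.hello = lambda: "shadowed by the instance"
print(g.hello())
```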
Data descriptor:
# Data descriptor
class Data_Des:
    def __init__(self, default='student'):
        self.name = default

    def __set__(self, obj, name):
        self.name = name

    def __get__(self, obj, type=None):
        print("call data descriptor")
        return self.name

class Student:
    name = Data_Des()

    def __init__(self, name):
        # The descriptor defines __set__, so this calls Data_Des.__set__
        self.name = name

    def __getattribute__(self, item):
        print("call __getattribute__")
        return super(Student, self).__getattribute__(item)

S2 = Student("John")
print(S2.name)
The output is:
call __getattribute__
call data descriptor
John
When the instance attribute and the data descriptor have the same name, the data descriptor will be accessed first.
In short, the difference between data descriptors and non-data descriptors is that they have different priorities relative to the dictionary of instances.
After talking about data descriptors and non-data descriptors, we also need to understand the search rules of object attributes. When we access an instance attribute, Python will search in a certain order.
__getattribute__ is the entry point for all attribute search, and the order of attribute search internally implemented is as follows:
1) First check whether the attribute you want to find (obj.x) exists in the class (type(obj) or one of its base classes) and whether it is a descriptor.
2) If it is a data descriptor, call the __get__ of the data descriptor.
3) Otherwise, look the name up in the instance __dict__. If found, return the result.
4) If it cannot be found in the instance __dict__, fall back to what was found in the class:
4.1) If it is a non-data descriptor, call the __get__ of the non-data descriptor;
4.2) If it is an ordinary class attribute, return it directly;
4.3) If there is no such attribute in the class, an AttributeError is raised (which triggers __getattr__() if it is defined).
The order of priority is: the data descriptor comes first, then the instance __dict__, then the non-data descriptor or plain class attribute, and finally __getattr__().
That is, if the class of obj defines a data descriptor x and the instance also has an attribute x with the same name, then when we access obj.x, Python calls type(obj).__dict__['x'].__get__(obj, type(obj)) instead of returning obj.__dict__['x'], because the data descriptor has higher priority. If the descriptor is a non-data descriptor, Python returns obj.__dict__['x'] instead.
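The priority order can be observed directly by writing into the instance __dict__ and seeing which value wins. This is only a sketch; DataDesc, NonDataDesc and Demo are invented names.

```python
class DataDesc:
    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc:
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Demo:
    x = DataDesc()
    y = NonDataDesc()

    def __getattr__(self, name):
        return "from __getattr__"


d = Demo()
# Write to the instance dictionary directly, bypassing __set__.
d.__dict__["x"] = "from instance dict"
d.__dict__["y"] = "from instance dict"

print(d.x)  # the data descriptor beats the instance dict
print(d.y)  # the instance dict beats the non-data descriptor
del d.__dict__["y"]
print(d.y)  # with no instance entry, the non-data descriptor is used
print(d.z)  # found nowhere, so __getattr__ is the last resort
```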
In development, although we rarely write descriptors directly, they are used all the time under the hood, for example by functions and by the property and staticmethod decorators. Below we use descriptors to implement such a decorator ourselves, taking property as an example. First, a class that uses the built-in property:
class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age  # goes through the age setter below

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        if not 10 <= age <= 20:
            raise ValueError('Valid age must be in [10, 20]')
        self._age = age

    def __repr__(self):
        return 'Student name:%s, Age:%d' % (self.name, self.age)
A function decorated with property, such as age in the example, becomes a property of Student instances. Assigning a value to the age property runs the function decorated with age.setter.
Next, we directly use the characteristics of Python descriptors to implement property.
class Desc_Property:
    def __init__(self, func_get=None, func_set=None, func_del=None):
        self.func_get = func_get
        self.func_set = func_set
        self.func_del = func_del

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class: return the descriptor itself
        if self.func_get is None:
            raise AttributeError("unreadable attribute")
        return self.func_get(obj)

    def __set__(self, obj, value):
        if self.func_set is None:
            raise AttributeError("can't set attribute")
        self.func_set(obj, value)

    def __delete__(self, obj):
        if self.func_del is None:
            raise AttributeError("can't delete attribute")
        self.func_del(obj)

    # Each helper returns a new descriptor with one function replaced
    def setter(self, func_set):
        return type(self)(self.func_get, func_set, self.func_del)

    def getter(self, func_get):
        return type(self)(func_get, self.func_set, self.func_del)

    def deleter(self, func_del):
        return type(self)(self.func_get, self.func_set, func_del)

class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age  # goes through Desc_Property.__set__

    @Desc_Property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        if not 10 <= age <= 20:
            raise ValueError("Valid age must be in [10, 20]")
        self._age = age

    def __repr__(self):
        return 'Student name:%s, Age:%d' % (self.name, self.age)
After decorating with Desc_Property, age is no longer a function but an instance of the Desc_Property class. That is why the second age function can be decorated with age.setter: this calls Desc_Property.setter, which creates a new Desc_Property instance, and that new instance is bound to the name age.
The first age and the second age are two different instances of Desc_Property, but both belong to the same descriptor class (Desc_Property). Assigning a value to age enters Desc_Property.__set__, and reading the value of age enters Desc_Property.__get__. In both cases, what is finally accessed is the _age attribute of the Student instance.
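The built-in property behaves the same way in this respect, which can be checked with a short sketch. The lambdas and the Box class are placeholders; fget and fset are the attributes the built-in property uses to hold its getter and setter.

```python
first = property(lambda self: self._v)
second = first.setter(lambda self, v: setattr(self, "_v", v))

print(first is second)             # False: setter() returned a new property object
print(second.fget is first.fget)   # True: the getter was carried over
print(first.fset is None)          # True: the original still has no setter


class Box:
    v = second  # only the second object supports both reading and writing


b = Box()
b.v = 5
print(b.v)
```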
This article introduced the definition, creation, types and working principle of descriptors. It also analyzed how an attribute is looked up: the __getattribute__ method defines the search order. Finally, it looked at how the property decorator is used and how the same behavior can be implemented with a descriptor.
Descriptors give us powerful and flexible attribute management, and combining them well leads to cleaner, more elegant code.