
Descriptor guide

Overview


In this article I describe descriptors and the descriptor protocol, and show how descriptors are invoked. I walk through writing a custom descriptor and examine several built-in ones: functions, properties, static methods and class methods. For each, a simple application shows how it works, along with a pure Python equivalent of its internal implementation.

Understanding how descriptors work opens up a larger set of tools, gives a deeper understanding of how Python works, and an appreciation for the elegance of its design.


Introduction and Definitions


Generally speaking, a descriptor is an object attribute with "binding behavior", that is, one whose attribute access is overridden by methods of the descriptor protocol. These methods are __get__, __set__ and __delete__. If any of them is defined for an object, that object is a descriptor.

The default behavior for attribute access is to get, set or delete the attribute in the object's dictionary. For example, a.x has a lookup chain that starts with a.__dict__['x'], continues with type(a).__dict__['x'], and then goes through the base classes of type(a), excluding metaclasses. If the value found is an object that defines one of the descriptor methods, Python may override the default lookup and call the descriptor method instead. Where this happens in the precedence chain depends on which descriptor methods are defined. Descriptors are invoked only for new-style objects and classes (a class is new-style if it inherits from object or type).
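
The default lookup chain can be observed directly. The following is a minimal sketch written for Python 3; the class names Base and A are made up for illustration, and no descriptors are involved yet:

```python
class Base(object):
    x = "from Base"

class A(Base):
    pass

a = A()

# Neither a.__dict__ nor A.__dict__ has an entry, so the base class supplies it.
first = a.x

# An entry in type(a).__dict__ is found before the base classes are searched.
A.x = "from A"
second = a.x

# An entry in a.__dict__ is found first of all.
a.x = "from instance"
third = a.x

print(first)   # from Base
print(second)  # from A
print(third)   # from instance
```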

Descriptors are a powerful, general-purpose protocol. They are the mechanism behind properties, methods, static methods, class methods and super(). Python itself uses them to implement the new-style classes introduced in version 2.2. Descriptors simplify the underlying C code and offer a flexible set of new tools for everyday Python programs.

Descriptor Protocol


    descr.__get__(self, obj, type=None) --> value
    descr.__set__(self, obj, value) --> None
    descr.__delete__(self, obj) --> None

That is all there is to it. Define any of these methods and an object is considered a descriptor, able to override the default behavior when it is looked up as an attribute.

If an object defines both __get__ and __set__, it is considered a data descriptor. Descriptors that define only __get__ are called non-data descriptors. The name reflects their typical use for methods, although other uses are possible.

Data and non-data descriptors differ in how the lookup is overridden when the object's dictionary already holds an entry with the same name as the descriptor. A data descriptor takes precedence over the dictionary entry. For a non-data descriptor in the same situation, the dictionary entry takes precedence over the descriptor.
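
The difference in precedence can be checked with a small experiment. This is an illustrative sketch for Python 3; DataDesc, NonDataDesc and Demo are hypothetical names, and the instance dictionary is written to directly so that __set__ is bypassed:

```python
class DataDesc(object):
    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDesc(object):
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"

class Demo(object):
    d = DataDesc()
    n = NonDataDesc()

demo = Demo()

# Put same-named entries straight into the instance dictionary.
demo.__dict__["d"] = "instance value"
demo.__dict__["n"] = "instance value"

print(demo.d)  # data descriptor  (the descriptor wins)
print(demo.n)  # instance value   (the dictionary entry wins)
```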

To make a read-only data descriptor, define both __get__ and __set__, and have __set__ raise an AttributeError. Defining __set__ with an exception-raising body is enough to make the object a data descriptor.
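
A read-only descriptor along these lines might look as follows. This is a sketch for Python 3; the names ReadOnly and Config are invented for the example:

```python
class ReadOnly(object):
    """Data descriptor whose __set__ always refuses the assignment."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        raise AttributeError("attribute is read-only")

class Config(object):
    version = ReadOnly("1.0")

cfg = Config()

try:
    cfg.version = "2.0"
    assigned = True
except AttributeError:
    assigned = False

print(assigned)     # False
print(cfg.version)  # 1.0
```

Because __set__ exists, the assignment never reaches the instance dictionary, so the descriptor keeps answering every read.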

Invoking Descriptors


A descriptor can be called directly by its method name, for example d.__get__(obj).

More commonly, a descriptor is invoked automatically on attribute access. For example, obj.d looks up d in the dictionary of obj. If d defines the __get__ method, then d.__get__(obj) is invoked according to the precedence rules described below.
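
Both ways of invoking a descriptor give the same result. In this Python 3 sketch, Ten and Holder are hypothetical names; the raw descriptor is fetched from the class dictionary because plain attribute access would already trigger __get__:

```python
class Ten(object):
    def __get__(self, obj, objtype=None):
        return 10

class Holder(object):
    d = Ten()

obj = Holder()

# Fetch the descriptor object itself, without invoking it.
desc = Holder.__dict__["d"]

direct = desc.__get__(obj)   # explicit call through the method
automatic = obj.d            # implicit call on attribute access

print(direct, automatic)     # 10 10
```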

The details of invocation depend on whether obj is an object or a class. Either way, descriptors work only for new-style objects and classes. A class is new-style if it is a subclass of object.

For objects, the machinery is in object.__getattribute__, which transforms b.x into type(b).__dict__['x'].__get__(b, type(b)). The implementation works through a precedence chain in which data descriptors take priority over instance variables, instance variables take priority over non-data descriptors, and __getattr__, if defined, has the lowest priority. The full C implementation can be found in PyObject_GenericGetAttr() in Objects/object.c.

For classes, the machinery is in type.__getattribute__, which transforms B.x into B.__dict__['x'].__get__(None, B). In pure Python, it looks like this:

    def __getattribute__(self, key):
        "Emulate type_getattro() in Objects/typeobject.c"
        v = object.__getattribute__(self, key)
        if hasattr(v, '__get__'):
            return v.__get__(None, self)
        return v
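
The two transformations can be made visible with a descriptor that simply reports the arguments it receives. A small sketch for Python 3, with Desc and B as illustrative names:

```python
class Desc(object):
    def __get__(self, obj, objtype=None):
        # Return the arguments so the caller can inspect them.
        return (obj, objtype)

class B(object):
    x = Desc()

b = B()

from_instance = b.x   # type(b).__dict__['x'].__get__(b, type(b))
from_class = B.x      # B.__dict__['x'].__get__(None, B)

print(from_instance == (b, B))  # True
print(from_class == (None, B))  # True
```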

An important point to remember: the object returned by super() also has a custom __getattribute__ method for invoking descriptors. The call super(B, obj).m() searches obj.__class__.__mro__ for the base class A immediately following B, and then returns A.__dict__['m'].__get__(obj, A). If m is not a descriptor, it is returned unchanged. If m is not in the dictionary, the lookup falls back to a search using object.__getattribute__.

Note: in Python 2.2, super(B, obj).m() invoked __get__ only if m was a data descriptor. In Python 2.3, non-data descriptors are invoked as well, unless an old-style class is involved. The implementation details are in super_getattro() in Objects/typeobject.c, and a pure Python equivalent can be found in Guido's Tutorial.
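
The lookup performed by super() can be compared with the manual form given above. This sketch assumes Python 3, where functions are non-data descriptors that return bound methods; A and B follow the naming in the text:

```python
class A(object):
    def m(self):
        return "A.m"

class B(A):
    def m(self):
        return "B.m"

obj = B()

via_super = super(B, obj).m
by_hand = A.__dict__["m"].__get__(obj, A)

print(via_super())           # A.m
print(via_super == by_hand)  # True: same function bound to the same object
```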

The details above show that the mechanism for invoking descriptors is embedded in the __getattribute__() methods of object, type and super. Classes inherit this machinery when they derive from object or when they have a metaclass that provides similar functionality. Likewise, a class can turn off descriptor invocation by overriding __getattribute__().
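
To illustrate the last point, here is a sketch (Python 3, hypothetical class names) of a class whose __getattribute__ returns whatever is stored in the class dictionary and never calls __get__:

```python
class Ten(object):
    def __get__(self, obj, objtype=None):
        return 10

class Normal(object):
    x = Ten()

class Raw(object):
    x = Ten()

    def __getattribute__(self, key):
        # Look only in the class dictionary and skip the descriptor call.
        return type(self).__dict__[key]

print(Normal().x)                # 10
print(isinstance(Raw().x, Ten))  # True: the descriptor itself comes back
```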

Descriptor example


The following code creates a class whose objects are data descriptors that print a message on every get or set. Overriding __getattribute__ is an alternative approach that could do this for every attribute. A descriptor, however, is the simpler choice when only a few chosen attributes need monitoring.
    class RevealAccess(object):
        """A data descriptor that sets and returns values
           normally and prints a message logging their access.
        """

        def __init__(self, initval=None, name='var'):
            self.val = initval
            self.name = name

        def __get__(self, obj, objtype):
            print 'Retrieving', self.name
            return self.val

        def __set__(self, obj, val):
            print 'Updating', self.name
            self.val = val

    >>> class MyClass(object):
    ...     x = RevealAccess(10, 'var "x"')
    ...     y = 5
    ...
    >>> m = MyClass()
    >>> m.x
    Retrieving var "x"
    10
    >>> m.x = 20
    Updating var "x"
    >>> m.x
    Retrieving var "x"
    20
    >>> m.y
    5

This simple protocol enables remarkable capabilities. Several use cases are so common that they have been packaged into individual function calls. Properties, bound and unbound methods, static methods and class methods are all based on the descriptor protocol.

Properties


Calling property() is a succinct way to build a data descriptor that triggers function calls on attribute access. Its signature is:

    property(fget=None, fset=None, fdel=None, doc=None) -> property attribute

The documentation shows a typical use of property() to create a managed attribute x :
    class C(object):
        def getx(self): return self.__x
        def setx(self, value): self.__x = value
        def delx(self): del self.__x
        x = property(getx, setx, delx, "I'm the 'x' property.")

To show how property() is implemented in terms of the descriptor protocol, here is a pure Python equivalent:

    class Property(object):
        "Emulate PyProperty_Type() in Objects/descrobject.c"

        def __init__(self, fget=None, fset=None, fdel=None, doc=None):
            self.fget = fget
            self.fset = fset
            self.fdel = fdel
            self.__doc__ = doc

        def __get__(self, obj, objtype=None):
            if obj is None:
                return self
            if self.fget is None:
                raise AttributeError, "unreadable attribute"
            return self.fget(obj)

        def __set__(self, obj, value):
            if self.fset is None:
                raise AttributeError, "can't set attribute"
            self.fset(obj, value)

        def __delete__(self, obj):
            if self.fdel is None:
                raise AttributeError, "can't delete attribute"
            self.fdel(obj)
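
The behavior that the emulation spells out can be checked against the built-in property(). The following is a sketch for Python 3; the class C here is a cut-down illustration with only a getter:

```python
class C(object):
    def __init__(self):
        self._x = 1

    def getx(self):
        return self._x

    x = property(getx)   # no fset and no fdel

c = C()

# Access through an instance calls fget(obj).
value = c.x

# Access through the class passes obj=None, so the property returns itself.
on_class = C.x

# With no fset, __set__ raises AttributeError.
try:
    c.x = 2
    set_ok = True
except AttributeError:
    set_ok = False

print(value, isinstance(on_class, property), set_ok)  # 1 True False
```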

The property() builtin helps whenever an interface has already granted attribute access and a later change requires a method to intervene.

For instance, a spreadsheet class may grant access to a cell value through Cell('b10').value. Subsequent improvements to the program require the cell to be recalculated on every access, but the programmer does not want to affect existing client code that accesses the attribute directly. The solution is to wrap access to the value attribute in a data descriptor created with property():
    class Cell(object):
        . . .
        def getvalue(self):
            "Recalculate the cell before returning value"
            self.recalc()
            return self._value
        value = property(getvalue)
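
A runnable variant of the spreadsheet idea might look like this. It is only a sketch for Python 3: the formula argument (a zero-argument callable) and the recalc_count counter are assumptions made for the example and are not part of the original Cell class:

```python
class Cell(object):
    def __init__(self, formula):
        self.formula = formula      # hypothetical: callable producing the value
        self._value = None
        self.recalc_count = 0       # hypothetical: counts recalculations

    def recalc(self):
        self.recalc_count += 1
        self._value = self.formula()

    def getvalue(self):
        "Recalculate the cell before returning value"
        self.recalc()
        return self._value

    value = property(getvalue)

source = [1, 2, 3]
cell = Cell(lambda: sum(source))

first = cell.value      # plain attribute syntax, yet recalc() runs
source.append(4)
second = cell.value

print(first, second, cell.recalc_count)  # 6 10 2
```

Client code keeps writing cell.value, and the recalculation happens behind the attribute access.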

Functions and methods


Python's object-oriented features are built on a function-based environment. Non-data descriptors merge the two seamlessly.

Class dictionaries store methods as functions. In a class definition, methods are written using def and lambda, the usual tools for creating functions. The only difference from regular functions is that the first argument is reserved for the object instance. By convention this argument is called self, but it could be called this or any other valid variable name.

To support method calls, functions include the __get__ method, which makes them non-data descriptors during attribute lookup. They return bound or unbound methods depending on whether they are accessed through an object or through a class. In pure Python, it works like this:

    import types

    class Function(object):
        . . .
        def __get__(self, obj, objtype=None):
            "Simulate func_descr_get() in Objects/funcobject.c"
            return types.MethodType(self, obj, objtype)

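Running the interpreter shows how the function descriptor works in practice:

    >>> class D(object):
    ...     def f(self, x):
    ...         return x
    ...
    >>> d = D()
    >>> D.__dict__['f']  # Stored internally as a function
    <function f at 0x00C45070>
    >>> D.f              # Get from a class becomes an unbound method
    <unbound method D.f>
    >>> d.f              # Get from an instance becomes a bound method
    <bound method D.f of <__main__.D object at 0x00B18C90>>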
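
The same experiment can be repeated on Python 3 by calling the function's __get__ by hand. Note that this sketch relies on Python 3 behavior, where access through the class returns the plain function rather than an unbound method:

```python
class D(object):
    def f(self, x):
        return x

d = D()

raw = D.__dict__["f"]      # stored internally as a function
bound = raw.__get__(d, D)  # what d.f does behind the scenes

print(bound(5))      # 5
print(d.f == bound)  # True
print(D.f is raw)    # True on Python 3: no unbound method object
```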

The output suggests that bound and unbound methods are two different types. While they could have been implemented that way, the actual C implementation of PyMethod_Type in Objects/classobject.c is a single object with two different representations, depending on whether the im_self field is set or is NULL (the C equivalent of None).

Likewise, the effect of calling a method object depends on the im_self field. If it is set (the method is bound), the original function (stored in the im_func field) is called as expected, with the first argument set to the instance. If the method is unbound, all arguments are passed unchanged to the original function. The actual C implementation of instancemethod_call() is only slightly more complex because it includes some type checking.
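
On Python 3 the two fields are exposed on bound methods as __self__ and __func__ (the im_self and im_func spellings belong to Python 2). A short sketch of what calling a bound method amounts to, reusing the class D from the earlier example:

```python
class D(object):
    def f(self, x):
        return x

d = D()
bound = d.f

# The bound method remembers the instance and the original function.
print(bound.__self__ is d)                 # True
print(bound.__func__ is D.__dict__["f"])   # True

# Calling it inserts the instance as the first argument.
print(bound(3) == bound.__func__(bound.__self__, 3))  # True
```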

Static methods and class methods


Non-data descriptors provide a simple mechanism for variations on the usual patterns of binding functions into methods.

To recap, functions have a __get__ method so that they can be converted into methods when accessed as attributes. The non-data descriptor transforms an obj.f(*args) call into f(obj, *args), and a klass.f(*args) call into f(*args).

This table summarizes the binding and its two most useful variants:

    Transformation    Called from an object    Called from a class
    function          f(obj, *args)            f(*args)
    staticmethod      f(*args)                 f(*args)
    classmethod       f(type(obj), *args)      f(klass, *args)
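
Each row of the table can be reproduced with functions that just return their arguments. This is a sketch for Python 3 (in Python 2 the class-level call of a plain function would go through an unbound method and its type check); T and the method names are made up:

```python
class T(object):
    def f_plain(*args):
        return args

    def f_static(*args):
        return args
    f_static = staticmethod(f_static)

    def f_class(*args):
        return args
    f_class = classmethod(f_class)

t = T()

print(t.f_plain(1) == (t, 1))   # True: f(obj, *args)
print(T.f_plain(1) == (1,))     # True: f(*args)
print(t.f_static(1) == (1,))    # True: f(*args)
print(T.f_static(1) == (1,))    # True: f(*args)
print(t.f_class(1) == (T, 1))   # True: f(type(obj), *args)
print(T.f_class(1) == (T, 1))   # True: f(klass, *args)
```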

Static methods return the underlying function without changes. Calling either c.f or C.f is equivalent to a direct lookup with object.__getattribute__(c, "f") or object.__getattribute__(C, "f"). As a result, the function is identically accessible from either an object or a class.

Good candidates for static methods are methods that do not need a reference to the self variable.

For instance, a statistics package may include a container class for experimental data. The class provides the usual methods for computing the mean, expectation, median and other descriptive statistics that depend on the data. However, there may be useful functions that are conceptually related but do not depend on the data. For example, erf(x) is a handy conversion routine that comes up in statistical work but does not depend on a particular data set. It can be called either from an object or from the class: s.erf(1.5) --> 0.9661 or Sample.erf(1.5) --> 0.9661.
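
A minimal sketch of such a class, for Python 3: the Sample class, its data attribute and its mean() method are illustrative assumptions, and erf simply delegates to math.erf from the standard library:

```python
import math

class Sample(object):
    def __init__(self, data):
        self.data = list(data)

    def mean(self):
        # Depends on the instance data, so it stays a regular method.
        return sum(self.data) / float(len(self.data))

    def erf(x):
        # Needs no instance data, so it is a good static method candidate.
        return math.erf(x)
    erf = staticmethod(erf)

s = Sample([1, 2, 3])

print(s.mean())                        # 2.0
print(s.erf(1.5) == Sample.erf(1.5))   # True
```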

Since staticmethod() returns the underlying function with no changes, the example calls are unexciting:

    >>> class E(object):
    ...     def f(x):
    ...         return x
    ...     f = staticmethod(f)
    ...
    >>> print E.f(3)
    3
    >>> print E().f(3)
    3

Using the non-data descriptor protocol, a pure Python version of staticmethod() would look like this:

    class StaticMethod(object):
        "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, objtype=None):
            return self.f
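
The emulation can be tried out in place of the builtin. The sketch below repeats the StaticMethod class so that it runs on its own (Python 3), with E as a throwaway test class:

```python
class StaticMethod(object):
    "Pure Python stand-in for staticmethod(), as described above"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        # Ignore both the instance and the class: return the raw function.
        return self.f

class E(object):
    def f(x):
        return x
    f = StaticMethod(f)

print(E.f(3))    # 3
print(E().f(3))  # 3
```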

Unlike static methods, class methods prepend the class reference to the argument list before calling the function. The format is the same whether the caller is an object or a class.

    >>> class E(object):
    ...     def f(klass, x):
    ...         return klass.__name__, x
    ...     f = classmethod(f)
    ...
    >>> print E.f(3)
    ('E', 3)
    >>> print E().f(3)
    ('E', 3)

This behavior is useful whenever the function needs only a class reference and does not care about any underlying data. One use for class methods is to create alternate class constructors. In Python 2.3, the class method dict.fromkeys() creates a new dictionary from a list of keys. The pure Python equivalent is:

    class Dict:
        . . .
        def fromkeys(klass, iterable, value=None):
            "Emulate dict_fromkeys() in Objects/dictobject.c"
            d = klass()
            for key in iterable:
                d[key] = value
            return d
        fromkeys = classmethod(fromkeys)

Now a new dictionary of unique keys can be constructed like this:

    >>> Dict.fromkeys('abracadabra')
    {'a': None, 'r': None, 'b': None, 'c': None, 'd': None}
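
To make the fragment runnable, the elided class body can be filled in by deriving from dict. That base class is an assumption made for this Python 3 sketch, as is the SubDict subclass, which shows why the constructor receives klass:

```python
class Dict(dict):
    def fromkeys(klass, iterable, value=None):
        "Alternate constructor: build an instance of klass from the keys"
        d = klass()
        for key in iterable:
            d[key] = value
        return d
    fromkeys = classmethod(fromkeys)

class SubDict(Dict):
    pass

d = Dict.fromkeys("abracadabra")
s = SubDict.fromkeys("ab", 0)

print(sorted(d))         # ['a', 'b', 'c', 'd', 'r']
print(type(s).__name__)  # SubDict
```

Because the class is passed in, the subclass gets instances of its own type without overriding fromkeys.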

Using the non-data descriptor protocol, a pure Python version of classmethod() would look like this:

    class ClassMethod(object):
        "Emulate PyClassMethod_Type() in Objects/funcobject.c"

        def __init__(self, f):
            self.f = f

        def __get__(self, obj, klass=None):
            if klass is None:
                klass = type(obj)
            def newfunc(*args):
                return self.f(klass, *args)
            return newfunc
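
As with the static method emulation, this one can be exercised directly. The sketch repeats the ClassMethod class so that it is self-contained (Python 3) and also calls __get__ without a class argument to reach the type(obj) fallback:

```python
class ClassMethod(object):
    "Pure Python stand-in for classmethod(), as described above"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)

        def newfunc(*args):
            # Prepend the class to whatever the caller passed.
            return self.f(klass, *args)
        return newfunc

class E(object):
    def f(klass, x):
        return klass.__name__, x
    f = ClassMethod(f)

print(E.f(3))    # ('E', 3)
print(E().f(3))  # ('E', 3)

# Direct call with only an instance: klass is derived from type(obj).
print(E.__dict__["f"].__get__(E())(3))  # ('E', 3)
```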

Source: https://habr.com/ru/post/122082/

