
User Attributes in Python

Have you ever wondered what happens when you type a dot in Python? What is str("\u002E") hiding? What secrets does it keep? Mysticism aside, do you know how Python finds and sets the values of user-defined attributes? Would you like to know? Then ... welcome!
To make the reading easy, pleasant, and useful, it helps to know a few basic concepts of the language. In particular, an understanding of types and objects will be extremely useful, along with a few examples of each.
A few words about the terminology I use, before we get down to business:

Oh yes, all the examples in this article are written in Python 3! Keep that in mind.
If none of the above has dampened your desire to find out what comes next, let's get started!

__dict__


An object's attributes can be divided into two groups: those defined by Python (such as __class__ and __bases__) and user-defined ones, which are the ones I am going to talk about. Under this classification, __dict__ belongs to the "system" (Python-defined) attributes. Its job is to store user attributes. It is a dictionary in which the key is the name of an attribute and the value is the value of that attribute.
To find an attribute of an object o, Python searches:
  1. The object itself (o.__dict__ and its system attributes).
  2. The object's class (o.__class__.__dict__). Only the class __dict__ is searched, not its system attributes.
  3. The classes that the object's class inherits from (the __dict__ of each class in o.__class__.__bases__).
Thus, using __dict__, an attribute can be defined either for a specific instance or for a class (that is, for all objects that are instances of that class).

    class StuffHolder:
        stuff = "class stuff"

    a = StuffHolder()
    b = StuffHolder()
    a.stuff      # "class stuff"
    b.stuff      # "class stuff"

    b.b_stuff = "b stuff"
    b.b_stuff    # "b stuff"
    a.b_stuff    # AttributeError

The example defines a class StuffHolder with a single attribute, stuff, which both of its instances inherit. Adding the attribute b_stuff to object b does not affect a.
Let's look at the __dict__ of everyone involved:

    StuffHolder.__dict__    # {... 'stuff': 'class stuff' ...}
    a.__dict__              # {}
    b.__dict__              # {'b_stuff': 'b stuff'}

    a.__class__             # <class '__main__.StuffHolder'>
    b.__class__             # <class '__main__.StuffHolder'>

(The __dict__ of the class StuffHolder is an object of class mappingproxy, called dict_proxy in older versions, holding a bunch of assorted entries that you don't need to pay attention to yet.)

Neither a nor b has a stuff entry in its __dict__. Having failed to find it there, the lookup mechanism searches the __dict__ of the class (StuffHolder), finds it, and returns the value assigned to it in the class. The reference to the class is stored in the object's __class__ attribute.
Attribute lookup happens at run time, so any change to the class __dict__ is reflected in instances even if they were created earlier:

    a.new_stuff             # AttributeError
    b.new_stuff             # AttributeError

    StuffHolder.new_stuff = "new"
    StuffHolder.__dict__    # {... 'stuff': 'class stuff', 'new_stuff': 'new' ...}
    a.new_stuff             # "new"
    b.new_stuff             # "new"

When a value is assigned to an attribute through an instance, only the instance's __dict__ changes; the value in the class __dict__ stays the same (provided the class attribute is not a data descriptor):

    StuffHolder.__dict__    # {... 'stuff': 'class stuff' ...}
    c = StuffHolder()
    c.__dict__              # {}

    c.stuff = "more c stuff"
    c.__dict__              # {'stuff': 'more c stuff'}
    StuffHolder.__dict__    # {... 'stuff': 'class stuff' ...}

If the class and the instance both have an attribute with the same name, the interpreter gives preference to the instance when looking up the value (provided the class attribute is not a data descriptor):

    StuffHolder.__dict__    # {... 'stuff': 'class stuff' ...}
    d = StuffHolder()
    d.stuff                 # "class stuff"

    d.stuff = "d stuff"
    d.stuff                 # "d stuff"
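The examples so far only involve an instance and its own class. As a small sketch of the third lookup step (base classes), the snippet below uses two hypothetical classes, Base and Child, which are not part of the original article:

```python
class Base:
    inherited = "from Base"

class Child(Base):
    own = "from Child"

c = Child()

# Neither name lives in the instance's own __dict__ ...
print(c.__dict__)                         # {}
# ... 'own' is found in the class, 'inherited' only in the base class.
print("own" in Child.__dict__)            # True
print("inherited" in Child.__dict__)      # False
print("inherited" in Base.__dict__)       # True
print(c.own, "/", c.inherited)            # from Child / from Base

# Classes are searched in method resolution order.
print([k.__name__ for k in type(c).__mro__])   # ['Child', 'Base', 'object']
```

The last line shows the order in which the class and its bases are consulted.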

By and large, that is all there is to say about __dict__. It is the store for user-defined attributes. Lookup in it happens at run time, and the lookup also takes into account the __dict__ of the object's class and of its base classes. It is also important to know that there are several ways to override this behavior. One of them is the great and mighty Descriptor!

Descriptors


With simple types as attribute values, everything is clear. Let's see how a function behaves under the same conditions:

    class FuncHolder:
        def func(self):
            pass

    fh = FuncHolder()

    FuncHolder.func         # <function func at 0x8f806ac>
    FuncHolder.__dict__     # {...'func': <function func at 0x8f806ac>...}
    fh.func                 # <bound method FuncHolder.func of <__main__.FuncHolder object at 0x900f08c>>

WTF!? you ask ... maybe. I would. How does a function differ here from what we have already seen? The answer is simple: it has a __get__ method.

    FuncHolder.func.__class__.__get__    # <slot wrapper '__get__' of 'function' objects>

This method overrides the way the value of the func attribute is obtained from the fh instance, and an object that implements this method is called a non-data descriptor.
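To make the role of __get__ a little more tangible, here is a minimal sketch that calls it by hand. It reuses the FuncHolder name from above, but func returns a string here (a change made only for this illustration) so that there is something to compare:

```python
class FuncHolder:
    def func(self):
        return "called on " + type(self).__name__

fh = FuncHolder()

# The plain function object stored in the class __dict__.
plain = FuncHolder.__dict__["func"]

# Calling the descriptor by hand gives the same kind of object as fh.func.
bound = plain.__get__(fh, FuncHolder)

print(type(plain).__name__)     # function
print(type(bound).__name__)     # method
print(bound.__self__ is fh)     # True
print(bound.__func__ is plain)  # True
print(bound() == fh.func())     # True
```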

From the Descriptor HowTo:
A descriptor is an object attribute whose access has been overridden by methods in the descriptor protocol:
    descr.__get__(self, obj, type=None) -> value (overrides how the attribute value is obtained)
    descr.__set__(self, obj, value) -> None (overrides how a value is assigned to the attribute)
    descr.__delete__(self, obj) -> None (overrides how the attribute is deleted)

There are two kinds of descriptors:
  1. Data descriptor: an object that implements __get__() together with __set__() (or __delete__()).
  2. Non-data descriptor: an object that implements only __get__().
They differ in how they interact with entries in the instance's __dict__. If __dict__ contains an entry with the same name as a data descriptor, the descriptor takes precedence. If the entry has the same name as a non-data descriptor, the entry in __dict__ takes precedence.

Data descriptors

Let's take a closer look at a data descriptor:

    class DataDesc:
        def __get__(self, obj, cls):
            print("Trying to access from {0} class {1}".format(obj, cls))

        def __set__(self, obj, val):
            print("Trying to set {0} for {1}".format(val, obj))

        def __delete__(self, obj):
            print("Trying to delete from {0}".format(obj))

    class DataHolder:
        data = DataDesc()

    d = DataHolder()

    DataHolder.data    # Trying to access from None class <class '__main__.DataHolder'>
    d.data             # Trying to access from <__main__.DataHolder object at ...> class <class '__main__.DataHolder'>
    d.data = 1         # Trying to set 1 for <__main__.DataHolder object at ...>
    del d.data         # Trying to delete from <__main__.DataHolder object at ...>

Note that accessing DataHolder.data passes None to the __get__ method in place of a class instance.
Let's check the claim that data descriptors take precedence over entries in the instance's __dict__:

    d.__dict__["data"] = "override!"
    d.__dict__    # {'data': 'override!'}
    d.data        # Trying to access from <__main__.DataHolder object at ...> class <class '__main__.DataHolder'>

Indeed, an entry in the instance's __dict__ is ignored if the __dict__ of the instance's class (or of one of its base classes) has an entry with the same name whose value is a data descriptor.

Another important point: if you change the value of a descriptor-backed attribute through the class, none of the descriptor's methods are called. The value in the class __dict__ is simply replaced, as if it were a regular attribute:

    DataHolder.__dict__    # {...'data': <__main__.DataDesc object at ...>...}
    DataHolder.data = "kick descriptor out"
    DataHolder.__dict__    # {...'data': 'kick descriptor out'...}
    DataHolder.data        # "kick descriptor out"


Non-data descriptors

An example of a non-data descriptor:

    class NonDataDesc:
        def __get__(self, obj, cls):
            print("Trying to access from {0} class {1}".format(obj, cls))

    class NonDataHolder:
        non_data = NonDataDesc()

    n = NonDataHolder()

    NonDataHolder.non_data    # Trying to access from None class <class '__main__.NonDataHolder'>
    n.non_data                # Trying to access from <__main__.NonDataHolder object at ...> class <class '__main__.NonDataHolder'>
    n.non_data = 1
    n.non_data                # 1
    n.__dict__                # {'non_data': 1}

Its behavior differs slightly from that of the data descriptor. When a value was assigned to the non_data attribute, it was written to the instance's __dict__, thereby hiding the descriptor stored in the class __dict__.
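This shadowing is sometimes exploited on purpose. The sketch below shows one possible use, a compute-once attribute. LazyAttribute and Report are hypothetical names invented for this illustration, not something defined by the article or by Python:

```python
class LazyAttribute:
    """Hypothetical non-data descriptor: compute once, then step aside."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, obj, cls):
        if obj is None:          # accessed through the class
            return self
        value = self.func(obj)
        # Store the result in the instance __dict__; from now on this entry
        # shadows the descriptor, so __get__ is not called again.
        obj.__dict__[self.name] = value
        return value


class Report:
    calls = 0                    # counts how often total() really runs

    @LazyAttribute
    def total(self):
        Report.calls += 1
        return sum(range(10))


r = Report()
print(r.total, r.total)   # 45 45
print(Report.calls)       # 1
print(r.__dict__)         # {'total': 45}
```

The same trick would not work with a data descriptor, because its __get__ would keep winning over the instance entry.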

Usage examples

Descriptors are a powerful tool for controlling access to the attributes of a class instance. One example of their use is functions: when accessed through an instance, they become methods (see the example above). Another common use of descriptors is creating a property. By a property I mean a value that characterizes the state of an object and whose access is controlled by special methods (getters and setters). Creating a property with a descriptor is simple:

    class Descriptor:
        def __get__(self, obj, type):
            print("getter used")

        def __set__(self, obj, val):
            print("setter used")

        def __delete__(self, obj):
            print("deleter used")

    class MyClass:
        prop = Descriptor()

Alternatively, you can use the built-in property class, which is a data descriptor. The code above can be rewritten as follows:

    class MyClass:
        def _getter(self):
            print("getter used")

        def _setter(self, val):
            print("setter used")

        def _deleter(self):
            print("deleter used")

        prop = property(_getter, _setter, _deleter, "doc string")

In both cases, we get the same behavior:

    m = MyClass()
    m.prop        # getter used
    m.prop = 1    # setter used
    del m.prop    # deleter used

It is important to know that a property is always a data descriptor. If one of the functions (getter, setter, or deleter) is not passed to its constructor, attempting the corresponding operation on the attribute raises AttributeError.

    class MySecondClass:
        prop = property()

    m2 = MySecondClass()
    m2.prop        # AttributeError: unreadable attribute
    m2.prop = 1    # AttributeError: can't set attribute
    del m2.prop    # AttributeError: can't delete attribute
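A more everyday variant of the same idea is a read-only property created with the decorator form of property. The sketch below uses a hypothetical Circle class, and it also checks that a property, being a data descriptor, is not shadowed by an entry in the instance __dict__:

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):          # getter only: no setter, no deleter
        return 2 * self.radius


c = Circle(3)
print(c.diameter)                # 6

try:
    c.diameter = 10              # no setter was supplied
except AttributeError:
    print("cannot assign to diameter")

# Because a property is a data descriptor, even an entry written directly
# into the instance __dict__ does not shadow it.
c.__dict__["diameter"] = 100
print(c.diameter)                # 6
```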

The built-in descriptors also include staticmethod and classmethod:

    class StaticAndClassMethodHolder:
        def _method(*args):
            print("_method called with ", args)

        static = staticmethod(_method)
        cls = classmethod(_method)

    s = StaticAndClassMethodHolder()

    s._method()    # _method called with (<__main__.StaticAndClassMethodHolder object at ...>,)
    s.static()     # _method called with ()
    s.cls()        # _method called with (<class '__main__.StaticAndClassMethodHolder'>,)
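To see that these two really are descriptors, one can pull the raw objects out of the class __dict__ and call __get__ by hand. This is only a sketch; Holder is a hypothetical class used for the demonstration:

```python
class Holder:
    @staticmethod
    def static(*args):
        return args

    @classmethod
    def cls(*args):
        return args


# Going through __dict__ avoids triggering the descriptors.
raw_static = Holder.__dict__["static"]
raw_cls = Holder.__dict__["cls"]

print(type(raw_static).__name__)   # staticmethod
print(type(raw_cls).__name__)      # classmethod

h = Holder()
# staticmethod.__get__ hands back the underlying function untouched.
print(raw_static.__get__(h, Holder) is raw_static.__func__)   # True
# classmethod.__get__ binds the function to the class, not the instance.
print(raw_cls.__get__(h, Holder).__self__ is Holder)          # True

print(h.static() == ())            # True
print(h.cls() == (Holder,))        # True
```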


__getattr__(), __setattr__(), __delattr__() and __getattribute__()


If you need to define how an object behaves when it is used as an attribute, use descriptors (for example, property). The same goes for a whole family of objects (for example, functions). Another way to influence attribute access is through the methods __getattr__(), __setattr__(), __delattr__() and __getattribute__(). Unlike descriptors, they are defined on the object that contains the attributes, and they are called when any attribute of that object is accessed.

__getattr__(self, name) is called if the requested attribute is not found by the usual mechanism (in the __dict__ of the instance, the class, and so on):

    class SmartyPants:
        def __getattr__(self, attr):
            print("Yep, I know", attr)

        tellme = "It's a secret"

    smarty = SmartyPants()
    smarty.name = "Smartinius Smart"

    smarty.quicksort    # Yep, I know quicksort
    smarty.python       # Yep, I know python
    smarty.tellme       # "It's a secret"
    smarty.name         # "Smartinius Smart"

__getattribute__(self, name) is called on every attempt to get the value of an attribute. If this method is overridden, the standard attribute lookup mechanism is not used. Keep in mind that special methods (for example, __len__() or __str__()) invoked through built-in functions or implicitly through language syntax bypass __getattribute__().

    class Optimist:
        attr = "class attribute"

        def __getattribute__(self, name):
            print("{0} is great!".format(name))

        def __len__(self):
            print("__len__ is special")
            return 0

    o = Optimist()
    o.instance_attr = "instance"

    o.attr             # attr is great!
    o.dark_beer        # dark_beer is great!
    o.instance_attr    # instance_attr is great!
    o.__len__          # __len__ is great!
    len(o)             # __len__ is special
                       # 0

__setattr__(self, name, value) is called on every attempt to set the value of an instance attribute. As with __getattribute__(), if this method is overridden, the standard mechanism for setting the value is not used:

    class NoSetters:
        attr = "class attribute"

        def __setattr__(self, name, val):
            print("not setting {0}={1}".format(name, val))

    no_setters = NoSetters()
    no_setters.a = 1        # not setting a=1
    no_setters.attr = 1     # not setting attr=1
    no_setters.__dict__     # {}
    no_setters.attr         # "class attribute"
    no_setters.a            # AttributeError

__delattr__(self, name) is similar to __setattr__(), but is called when an attribute is deleted.
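By analogy with NoSetters, here is a small sketch of a class that swallows deletions. NoDeleters is a hypothetical name invented for this illustration:

```python
class NoDeleters:
    def __delattr__(self, name):
        # The standard deletion mechanism is never invoked.
        print("not deleting {0}".format(name))


nd = NoDeleters()
nd.a = 1
del nd.a            # not deleting a
print(nd.a)         # 1
del nd.missing      # not deleting missing
```

Note that even deleting a name that does not exist raises no error here, since the overridden method never consults __dict__.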

When overriding __getattribute__(), __setattr__() and __delattr__(), keep in mind that the standard way of accessing attributes can still be invoked through object:

    class GentleGuy:
        def __getattribute__(self, name):
            if name.endswith("_please"):
                return object.__getattribute__(self, name.replace("_please", ""))
            raise AttributeError("And the magic word!?")

    gentle = GentleGuy()

    gentle.coffee = "some coffee"
    gentle.coffee           # AttributeError
    gentle.coffee_please    # "some coffee"
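The same delegation works for setting and deleting. The sketch below shows one way it might look; PositiveOnly is a hypothetical class, and the validation rule is chosen purely for illustration:

```python
class PositiveOnly:
    """Hypothetical class that accepts only positive numbers as attributes."""

    def __setattr__(self, name, value):
        if not isinstance(value, (int, float)) or value <= 0:
            raise ValueError("{0} must be a positive number".format(name))
        # Fall back to the standard mechanism to actually store the value.
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        print("removing", name)
        object.__delattr__(self, name)


p = PositiveOnly()
p.width = 5
print(p.__dict__)        # {'width': 5}

try:
    p.width = -1
except ValueError as exc:
    print(exc)           # width must be a positive number

del p.width              # removing width
print(p.__dict__)        # {}
```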


The gist


So, to get the value of the attribute attrname of an instance a, Python does the following:
  1. If the method a.__class__.__getattribute__() is defined, it is called and the value it returns is the result.
  2. If attrname is a special (Python-defined) attribute, such as __class__ or __doc__, its value is returned.
  3. a.__class__.__dict__ is checked for an entry named attrname. If it exists and its value is a data descriptor, the result of calling the descriptor's __get__() method is returned. All base classes are checked as well.
  4. If an entry named attrname exists in a.__dict__, the value of that entry is returned. If a is a class, the attribute is also searched for among its base classes, and if a descriptor is found there or in __dict__, the result of the descriptor's __get__() is returned.
  5. a.__class__.__dict__ is checked. If it has an entry named attrname and that entry is a non-data descriptor, the result of the descriptor's __get__() is returned; if the entry exists and is not a descriptor, the value of the entry is returned. Base classes are searched as well.
  6. If the method a.__class__.__getattr__() exists, it is called and its result is returned. If there is no such method, AttributeError is raised.
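The priority order described in these steps can be observed with a small experiment. This is only a sketch; the class Demo and its attribute names are hypothetical, while DataDesc and NonDataDesc are simplified versions of the descriptors used earlier that return a label instead of printing:

```python
class DataDesc:
    def __get__(self, obj, cls):
        return "data descriptor"

    def __set__(self, obj, val):
        pass                      # having __set__ makes this a data descriptor


class NonDataDesc:
    def __get__(self, obj, cls):
        return "non-data descriptor"


class Demo:
    first = DataDesc()
    second = NonDataDesc()
    third = "class attribute"

    def __getattr__(self, name):
        return "__getattr__ fallback"


d = Demo()
# Write directly into the instance __dict__ to bypass __set__.
d.__dict__.update(first="instance", second="instance", third="instance")

print(d.first)    # data descriptor
print(d.second)   # instance
print(d.third)    # instance
print(d.fourth)   # __getattr__ fallback

# Without the instance entry, the non-data descriptor is used again.
del d.__dict__["second"]
print(d.second)   # non-data descriptor
```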

To set the value of the attribute attrname of an instance a:
  1. If the method a.__class__.__setattr__() exists, it is called.
  2. a.__class__.__dict__ is checked. If it has an entry named attrname and that entry is a data descriptor, the descriptor's __set__() method is called. Base classes are checked as well.
  3. Otherwise, an entry with the key attrname and the given value is added to a.__dict__.
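The assignment steps can be sketched the same way. Recorder and Target below are hypothetical names introduced only for this example:

```python
class Recorder:
    """Hypothetical data descriptor that records every assignment."""

    def __init__(self):
        self.log = []

    def __get__(self, obj, cls):
        # Through the class return the descriptor, otherwise the last value.
        return self if obj is None else self.log[-1]

    def __set__(self, obj, val):
        self.log.append(val)


class Target:
    tracked = Recorder()
    plain = "class value"


t = Target()
t.tracked = 1        # goes to Recorder.__set__, not to t.__dict__
t.plain = 2          # no data descriptor: stored in t.__dict__
t.brand_new = 3      # unknown name: also stored in t.__dict__

print(t.__dict__)                          # {'plain': 2, 'brand_new': 3}
print(Target.__dict__["tracked"].log)      # [1]
print(Target.plain)                        # class value
```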


__slots__


As Guido writes in his history of Python about how new-style classes were invented:
... I was afraid that the changes to the class system would hurt performance. In particular, for data descriptors to work correctly, every manipulation of an object's attributes had to start by checking the class __dict__ to see whether the attribute is a data descriptor ...

In case users were disappointed by the performance hit, the caring Python developers came up with __slots__.
The presence of __slots__ restricts the possible attribute names of an object to those listed there. And since all attribute names are now known in advance, there is no longer any need to create a __dict__ for each instance.

    class Slotter:
        __slots__ = ["a", "b"]

    s = Slotter()
    s.__dict__    # AttributeError
    s.c = 1       # AttributeError
    s.a = 1
    s.a           # 1
    s.b = 1
    s.b           # 1
    dir(s)        # [ ... 'a', 'b' ... ]

It turned out that Guido's fears were unfounded, but by the time that became clear it was already too late. Besides, using __slots__ really can improve performance, especially by reducing the amount of memory used when many small objects are created.
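As a quick sketch tying __slots__ back to descriptors: in CPython, each slot name shows up in the class __dict__ as a data descriptor of its own. Plain is a hypothetical comparison class added for this example:

```python
class Slotter:
    __slots__ = ["a", "b"]


class Plain:
    pass


s = Slotter()
s.a = 1

print(hasattr(s, "__dict__"))          # False
print(hasattr(Plain(), "__dict__"))    # True

# Each slot name becomes a data descriptor stored in the class __dict__.
slot = Slotter.__dict__["a"]
print(type(slot).__name__)             # member_descriptor
print(hasattr(slot, "__set__"))        # True
print(slot.__get__(s, Slotter))        # 1

try:
    s.c = 1                            # 'c' is not listed in __slots__
except AttributeError:
    print("no attribute c allowed")
```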

Conclusion


Attribute access in Python can be controlled in a huge number of ways. Each of them solves its own problem, and together they cover almost every conceivable scenario of using an object. These mechanisms underpin the flexibility of the language, along with multiple inheritance, metaclasses, and other goodies. It took me some time to figure out, understand and, most importantly, accept all these variations on how attributes work. At first glance it all seemed somewhat redundant and not particularly logical, and although it rarely comes in handy in everyday programming, it is nice to have such powerful tools in your arsenal.
I hope this article has also cleared up a couple of points that you never got around to figuring out. And now, with fire in your eyes and confidence in the dot, you will write a huge amount of the cleanest, most readable, and most change-resistant code! Well, or a comment.

Thank you for your time.

Links

  1. Shalabh Chaturvedi. Python Attributes and Methods
  2. Guido Van Rossum. The Inside Story on New-Style Classes
  3. Python documentation
UPD: A useful link from user leron: Python Data Model

Source: https://habr.com/ru/post/137415/

