Hello! I had the following task: parse a bunch of data, organize it into classes, and later load it into a database. Nothing complicated, it would seem, but that day I even forgot to eat. Why? Read on, because in the end I did get it done.

There was, of course, a lot of data, but that by itself did not complicate the task. What complicated it was that the same element could appear in different parts of the site. You can compare this data to accounts on a social network. One and the same account can leave its traces everywhere: it likes different pages, writes comments all over the place, posts things on other people's walls. And we need all of this to map to a single object in our program, with no duplication whatsoever. It seems simple enough: just check whether this element has already been seen, and that's it. But that is ugly, and it goes against the philosophy of Python. I wanted an elegant solution: something that would simply forbid creating an element that already exists, or rather would not create it at all, would ignore the initialization, and would have the constructor return the already existing element.
Let me give an example. Suppose I have an entity:
class Animal:
    def __init__(self, id):
        self.id = id
And each such entity has its own unique id.
As a result, when we find two identical entities in different places, we create two absolutely identical objects. The first thing to do is to add some sort of object storage:
class Animal:
    __cache__ = dict()

    def __init__(self, id):
        self.id = id
A new object in Python is created in the class method __new__. This method must return the newly created object, and it is exactly the place to dig into if we want to override the object creation behavior.
class Animal:
    __cache__ = dict()

    def __new__(cls, id):
        if id not in Animal.__cache__:
            Animal.__cache__[id] = super().__new__(cls)
        return Animal.__cache__[id]

    def __init__(self, id):
        self.id = id
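To see the caching in action, here is a quick check (my own illustration, not from the original text):

a = Animal(1)
b = Animal(1)
print(a is b)  # True: both names refer to the same cached object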
And that, it seemed, was it: problem solved. At least that is what I thought for the first 20 minutes. As the program grew and the number of classes increased, I started getting an error like:
__init__() missing N required positional arguments

The error sent me to Google to find out whether maybe I was doing everything completely against the rules. It turned out that yes, I was. The advice is not to touch the __new__ method without real need, and the Factory pattern is offered as an alternative.
In short, the Factory pattern means that we single out one place that controls the creation of objects. For Python, the following example was offered:
class Factory:
    def register(self, methodName, constructor, *args, **kargs):
        """register a constructor"""
        _args = [constructor]
        _args.extend(args)
        setattr(self, methodName, Functor(*_args, **kargs))

    def unregister(self, methodName):
        """unregister a constructor"""
        delattr(self, methodName)


class Functor:
    def __init__(self, function, *args, **kargs):
        assert callable(function), "function should be a callable obj"
        self._function = function
        self._args = args
        self._kargs = kargs

    def __call__(self, *args, **kargs):
        """call function"""
        _args = list(self._args)
        _args.extend(args)
        _kargs = self._kargs.copy()
        _kargs.update(kargs)
        return self._function(*_args, **_kargs)
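Just to illustrate how such a factory is meant to be used (this snippet is mine, not from the original example; the method name new_animal is arbitrary):

factory = Factory()
factory.register("new_animal", Animal)  # register Animal's constructor under a name
an = factory.new_animal(1)              # objects are created only through the factory
factory.unregister("new_animal")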
We are supposed to create objects only through the methods of the Factory class. The catch is that nothing stops us from ignoring it and creating objects directly. This solution may well be correct in general, but I did not like it, so I decided to look for a solution within my own code.
A little study of the creation process gave me the answer. In brief, creating an object works as follows: first the __new__ method is called, receiving the class and all the constructor arguments; this method creates the object and returns it. After that, the __init__ method of the class the object belongs to is called.
Abstracted code:
def __new__(cls, id, b, k, zz):
    return super().__new__(cls)

def __init__(self, id, b, k, zz):
    ...
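A tiny illustration of this order (my own sketch, not part of the original code):

class Demo:
    def __new__(cls, *args, **kwargs):
        print("__new__ runs first and returns the object")
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        print("__init__ runs second, on the object returned by __new__")

Demo(1)  # prints the two lines in exactly that order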
The problem showed up with the next step. Say I add a Cat class:
class Cat(Animal):
    data = "data"

    def __init__(self, id, b, k, zz, variable, one_more_variable):
        ...
As you can see, the constructors of the two classes are different. Imagine that we have already created an Animal with id = 1, and later we create a Cat with id = 1.
An object of class Animal with id = 1 already exists, so by the logic of things an object of class Cat should not be created. And indeed it is not created, but the call ends with an error saying that the wrong number of arguments was passed to __init__.
As you can guess, Python tries to create an element of the Cat class, but then calls the constructor of the Animal class. Not only is the wrong constructor called, the really bad part is that even if we simply created Animal with id = 1 again, the constructor would run once more on the same cached object. It could overwrite all its data and do other unwanted things.
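That second problem is easy to reproduce even without the Cat class. With the cached __new__ from above (a sketch of mine, not from the article):

a = Animal(1)
a.id = 42        # pretend we stored some useful data on the cached object
b = Animal(1)    # __new__ returns the same object, but __init__ runs again...
print(a.id)      # 1: the data was silently overwritten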
Not good. At this point it would be tempting to give up and build a factory for producing objects after all.
But we write in Python, the most flexible and beautiful language; why should we have to make concessions?
As it turned out, the solution is:
class Animal:
    __cache__ = dict()
    __tmp__ = None  # temporarily holds the real __init__ while it is swapped out

    def __fake_init__(self, *args, **kwargs):
        # put the real constructor back and do nothing else
        self.__class__.__init__ = Animal.__tmp__
        Animal.__tmp__ = None

    def __new__(cls, id):
        if id not in Animal.__cache__:
            Animal.__cache__[id] = super().__new__(cls)
        else:
            # the object already exists: stash its real __init__ and
            # substitute the fake one so the data is not overwritten
            Animal.__tmp__ = Animal.__cache__[id].__class__.__init__
            Animal.__cache__[id].__class__.__init__ = Animal.__fake_init__
        return Animal.__cache__[id]

    def __init__(self, id):
        self.id = id
It turned out to be impossible to disable the constructor call: after __new__ completes, Python calls __init__ on the class of the object that was created (or, as in our case, not created) no matter what. The only way out was to replace __init__ on the class of that object. To avoid losing the real constructor, I save it in a temporary variable and slip a fake constructor in its place, which is then called when the object is "created". The fake constructor is not entirely empty: its only job is to put the old constructor back in its place.
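A rough check that this behaves as intended (my own sketch, assuming the Animal class above):

a = Animal(1)
a.id = 42            # store something on the cached object
b = Animal(1)        # the fake __init__ runs instead of the real one
print(a is b, b.id)  # True 42: nothing was overwritten this time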
I'll finish by saying that maybe I'm completely wrong. I do realize that my code goes against the warnings: even in the official Python developer communities they say you should only touch __new__ when inheriting from iterable built-in types such as lists or tuples. But, as it seems to me, sometimes it is worth stepping slightly outside the bounds of decency just so that later you can calmly write:
an1 = Animal(1)
an2 = Animal(1)
cat1 = Cat(1)
and not worry about any problems.
Thanks for your attention!