How closures work (under the hood) in JavaScript

Hi, Habr!

At Hexlet we use JavaScript not only for the obvious frontend tasks but also, for example, to implement a browser-based development environment (our open-source hexlet-ide) on React. We have a hands-on JavaScript course, and one of its lessons is devoted to closures. The topic matters not only within JS but in programming in general, and we cover it in other courses as well.

There are many articles and tutorials about using closures in JS, but few explain how everything works inside. Today's translation is dedicated to exactly that: how and why closures work in JS, when they are created and destroyed, and why every function in JS is a closure.
I have been using closures for quite some time. I learned how to use them, but I did not fully understand how they actually work or what happens “under the hood”. What exactly is a closure? Wikipedia doesn't really help. When is a closure created and destroyed? What does the implementation look like?

    "use strict";

    var myClosure = (function outerFunction() {
      var hidden = 1;

      return {
        inc: function innerFunction() {
          return hidden++;
        }
      };
    }());

    myClosure.inc(); // returns 1
    myClosure.inc(); // returns 2
    myClosure.inc(); // returns 3

    // OK, nice. But how is it implemented?
    // What exactly happens under the hood?

When I finally figured it out, I wanted to share it with everyone. If nothing else, writing it down means I will not forget it. After all:

Tell me - and I will forget, show me - and I will remember, let me do it - and I will understand.

- Confucius and Benjamin Franklin

While learning, I tried to visualize how the entities interact: how objects refer to each other, how one inherits from another, and so on. I could not find suitable illustrations, so I drew my own.

I assume that the reader is familiar with JavaScript, knows about the global object, knows that functions in JS are first-class objects, and so on.

Scope chain

When JS code runs, it needs somewhere to store local variables. Let's call this place a scope object (also known as a LexicalEnvironment). For example, when you call a function and it defines a local variable, that variable is stored in the scope object. You can think of it as a regular JavaScript object with one important difference: you cannot get a reference to it. You can change its properties (by assigning to variables), but you cannot access the object itself.

This concept is very different from, say, C or C++, where local variables are stored on the stack. In JavaScript, scope objects are allocated on the heap, and they can stay in memory even after the function has returned. We will talk about this later.

As you might expect, a scope object may have a parent. When code tries to access a variable, the interpreter looks for a property with that name on the current scope object. If the property does not exist, the interpreter moves up to the parent scope object and continues searching, and so on until the property is found or there are no more parents. Let's call this sequence of scope objects the “scope chain”.

This mechanism is very similar to prototype inheritance, but again there is one important difference: if you access a non-existent property of a regular object and the property is not found anywhere in the prototype chain, you simply get undefined. But if you access a non-existent property in the scope chain (that is, refer to a variable that does not exist), a ReferenceError is thrown.
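The difference between the two kinds of lookup can be seen in a few lines. This is only a small sketch; the names parent, child and notDeclaredAnywhere are made up for the illustration.

```javascript
"use strict";

// A property missing from the whole prototype chain reads as undefined.
var parent = { inherited: 1 };
var child = Object.create(parent);
var missingProperty = child.doesNotExist; // undefined, no error

// A variable missing from the whole scope chain throws a ReferenceError.
var errorName;
try {
  // hypothetical identifier that is not declared anywhere
  notDeclaredAnywhere;
} catch (e) {
  errorName = e.name;
}

console.log(child.inherited); // 1, found on the prototype
console.log(missingProperty); // undefined
console.log(errorName);       // "ReferenceError"
```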

The last element of the scope chain is always the global object. In top-level JavaScript code, the scope chain consists of just one element: the global object. So when you create variables in top-level code, they are defined on the global object. When a function is called, the scope chain contains more than one object. You might think that if the function is called from the top level, the chain contains exactly two objects, but this is not always the case. There may be two or more, depending on the function. More on that later.

Top-level code

Enough theory, here is an example:

my_script.js

    "use strict";

    var foo = 1;
    var bar = 2;

We simply created two variables at the top level. As explained above, the scope object in this case is the global object:

Here we have the execution context (the top-level code of my_script.js) and the scope object it references. Of course, in reality the global object also contains a lot of standard and host-specific properties, but we will not show them here.

Non-nested functions

Take a look at this script:

my_script.js

    "use strict";

    var foo = 1;
    var bar = 2;

    function myFunc() {
      //-- define local-to-function variables
      var a = 1;
      var b = 2;
      var foo = 3;

      console.log("inside myFunc");
    }

    console.log("outside");

    //-- and then call it
    myFunc();

When the function myFunc is defined, the identifier myFunc is added to the current scope object (in this case, the global object), and this identifier refers to the function. As you know, a function is an object, so from here on “function object” simply means an object that is a function.

The function object contains the function's code and other properties. One property we are interested in is the internal property [[scope]]: it refers to the current scope object, that is, the scope object that is active at the time the function is defined (again, in this case, the global object).

By the time console.log("outside") is called, we have the following scheme:

The function object referenced by the variable myFunc stores the function's code and refers to the scope object that was current when the function was defined. This is very important.

When the function is called, a new scope object is created to store the local variables of myFunc (and the values of its arguments), and this new scope object inherits from the scope object that the function being called refers to.

So, when calling myFunc, the scheme looks like this:

This is a chain of scope objects. If you access a variable inside myFunc, JavaScript first looks for it in the first object of the chain, the scope object of myFunc(). If the variable is not there, the search moves up the chain (in this case, to the global object). If nothing is found in the whole chain, a ReferenceError is thrown.

For example, if we access a inside myFunc, we get 1 from the first object, the scope object of myFunc(). If we access foo, we get 3 from the same object: it effectively hides the foo property of the global object. If we access bar, we get 2 from the global object. This works almost like prototype inheritance.
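The lookup just described can be imitated with ordinary objects. The sketch below is only a model, not how an engine is implemented: real scope objects cannot be touched from code, and globalScope, myFuncScope and lookup are hypothetical names invented for the illustration.

```javascript
"use strict";

// Model of the global scope object (no prototype, so the chain ends here).
var globalScope = Object.create(null);
globalScope.foo = 1;
globalScope.bar = 2;

// Model of "calling myFunc": a new scope object whose parent is the scope
// object that was current when myFunc was defined (the global one).
var myFuncScope = Object.create(globalScope);
myFuncScope.a = 1;
myFuncScope.b = 2;
myFuncScope.foo = 3; // hides globalScope.foo

// Walk the chain from the innermost scope outwards.
// Unlike a plain property read, a miss is an error.
function lookup(scope, name) {
  for (var s = scope; s !== null; s = Object.getPrototypeOf(s)) {
    if (Object.prototype.hasOwnProperty.call(s, name)) {
      return s[name];
    }
  }
  throw new ReferenceError(name + " is not defined");
}

console.log(lookup(myFuncScope, "a"));   // 1, from the scope of myFunc
console.log(lookup(myFuncScope, "foo")); // 3, the global foo is hidden
console.log(lookup(myFuncScope, "bar")); // 2, from the global scope
```

Looking up a name that exists nowhere in the model, such as lookup(myFuncScope, "baz"), throws a ReferenceError, which mirrors what happens with a real undeclared variable.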

It is important to remember that scope objects continue to exist as long as something references them. When the last reference to a scope object disappears, the object becomes eligible for garbage collection.

After myFunc() returns, nothing references the scope object of myFunc() anymore, the garbage collector does its job, and we end up with:

From here on I will not include function objects in the diagrams, to avoid overloading them. As you already know, the chain looks like this: reference to a function → function object → scope object.

Do not forget about it.

Nested functions

Once a function returns, nothing refers to its scope object anymore, so the garbage collector reclaims it. But what if you define a nested function and return it (or store it somewhere outside the current scope object)? You already know the answer: a function object always refers to the scope object in which it was created. So when we define a nested function, it gets a reference to the current scope object of the outer function. And if we store the nested function somewhere else, the scope object will not be garbage collected even after the outer function returns: there is still a reference to it! Take a look at this code:

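my_script.js

    "use strict";

    function createCounter(initial) {
      //-- define local-to-function variables
      var counter = initial;

      //-- define nested functions. Each of them will have
      //   a reference to the current scope object

      /**
       * Increments the internal counter by the given value.
       * If the value is not a finite number or is less than 1, 1 is used.
       */
      function increment(value) {
        if (!isFinite(value) || value < 1) {
          value = 1;
        }
        counter += value;
      }

      /**
       * Returns the current counter value.
       */
      function get() {
        return counter;
      }

      //-- return an object containing references
      //   to the nested functions
      return {
        increment: increment,
        get: get
      };
    }

    //-- create a counter object
    var myCounter = createCounter(100);

    console.log(myCounter.get()); //-- prints "100"

    myCounter.increment(5);
    console.log(myCounter.get()); //-- prints "105"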

When createCounter(100) is called, we get the following scheme:

Note that the createCounter(100) scope object is referenced by the nested functions increment and get. If createCounter() returned nothing, these references would of course not count, since the functions themselves would be unreachable, and the scope object would be garbage collected. But since createCounter() returns an object that contains references to these functions, we get this:

So createCounter(100) has already returned, but its scope object still exists, and it is accessible from the inner functions and only from them. There is no way to access the createCounter(100) scope object directly; you can only call myCounter.increment() or myCounter.get(). These functions have unique, private access to the createCounter scope.

Let's try calling myCounter.get(). Remember: when a function is called, a new scope object is created and added to the scope chain that the function refers to. We get this:

The first scope object in the chain for get() is the empty scope object of the function itself. When get() accesses counter, JavaScript cannot find it in the first object of the chain, moves to the next object, and uses counter from the createCounter(100) scope object. The get() function then simply returns it.

You may notice that the myCounter object is also available to myCounter.get() as this (the red arrow in the diagram). this is not part of the scope chain, but you need to keep it in mind. More on that later.

Calling increment(5) is a bit more interesting, because this function takes an argument:

The value of the argument is stored in the scope object created for this call. When the function accesses value, JavaScript finds it immediately in the first object of the chain. However, when the function accesses counter, JavaScript cannot find it in the first object of the scope chain, moves up, and finds it there. So increment() changes the value in the createCounter(100) scope object, and practically nothing else can change this value. This is why closures are so powerful: the myCounter object cannot be compromised from outside. Closures are well suited for storing private or sensitive data.
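A quick way to see this privacy in action is to try to reach counter from outside. The sketch below uses a shortened createCounter (without the argument validation from the listing above) so that it is self-contained; exposedKeys is a name introduced only for the illustration.

```javascript
"use strict";

// Shortened version of createCounter, enough to show the point.
function createCounter(initial) {
  var counter = initial;
  return {
    increment: function (value) { counter += value; },
    get: function () { return counter; }
  };
}

var myCounter = createCounter(100);

// The returned object exposes only the two functions, not the variable.
var exposedKeys = Object.keys(myCounter);

// Assigning a property named "counter" creates an unrelated property on
// the returned object; the variable in the scope object is untouched.
myCounter.counter = -1;
myCounter.increment(5);

console.log(exposedKeys);     // [ 'increment', 'get' ]
console.log(myCounter.get()); // 105
```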

It is important to understand that scope objects are “live”. When a function is called, the current chain is not copied for the function; instead, it is extended with a new object. And when any object in the chain changes, the change is immediately visible to every function whose scope chain includes this object. After increment() changes the value of counter, the next get() call returns the updated value.
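That nothing is copied when a function is created can be checked by changing a variable after the nested function has already been defined. A minimal sketch, with makeReader and message as made-up names:

```javascript
"use strict";

function makeReader() {
  var message = "first";

  // read refers to the scope object, not to a snapshot of message.
  function read() {
    return message;
  }

  // Changed after read was created, before anyone calls it.
  message = "second";

  return read;
}

var read = makeReader();
var result = read();

console.log(result); // "second"
```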

This is why the following well-known example does not work as expected:

    "use strict";

    var elems = document.getElementsByClassName("myClass"), i;

    for (i = 0; i < elems.length; i++) {
      elems[i].addEventListener("click", function () {
        this.innerHTML = i;
      });
    }

Several functions are created in the loop, and all of them refer to the same scope object. Therefore they all use the same variable i rather than a private copy each. By the time any click handler runs, the loop has finished and i equals elems.length. You can read more about this example in “Don't make functions within a loop”.
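The same effect can be reproduced without the DOM, together with two common ways around it. This is a sketch rather than the article's own fix: makeCallback is a hypothetical helper, and the second variant assumes an ES2015+ environment where let is available.

```javascript
"use strict";

// The problem: all callbacks share the single variable i.
var shared = [];
var i;
for (i = 0; i < 3; i++) {
  shared.push(function () { return i; });
}
var sharedResults = shared.map(function (f) { return f(); });

// Fix 1: a function call creates a new scope object per iteration,
// and each callback refers to its own index.
function makeCallback(index) {
  return function () { return index; };
}
var separate = [];
for (i = 0; i < 3; i++) {
  separate.push(makeCallback(i));
}
var separateResults = separate.map(function (f) { return f(); });

// Fix 2 (ES2015+): let creates a fresh binding for every iteration.
var withLet = [];
for (let j = 0; j < 3; j++) {
  withLet.push(function () { return j; });
}
var letResults = withLet.map(function (f) { return f(); });

console.log(sharedResults);   // [ 3, 3, 3 ]
console.log(separateResults); // [ 0, 1, 2 ]
console.log(letResults);      // [ 0, 1, 2 ]
```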

Similar function objects, different scope objects

Now let's extend our example a little and have some fun (yes, I do find this fun - translator's note). What if we create several counter objects?

my_script.js

    "use strict";

    function createCounter(initial) {
      /* ... see the code above ... */
    }

    //-- create counter objects
    var myCounter1 = createCounter(100);
    var myCounter2 = createCounter(200);

After creating myCounter1 and myCounter2, we get the following scheme:

Don't forget: each function object contains a reference to a scope object. In this example, myCounter1.increment and myCounter2.increment refer to function objects that contain the same code and the same property values (name, length, and others), but their [[scope]] properties refer to different scope objects.

There are no separate function objects in the diagram (to simplify visualization), but they still exist.

Examples:

    var a, b;

    a = myCounter1.get(); // a == 100
    b = myCounter2.get(); // b == 200

    myCounter1.increment(1);
    myCounter1.increment(2);

    myCounter2.increment(5);

    a = myCounter1.get(); // a == 103
    b = myCounter2.get(); // b == 205
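The statement that the two counters hold distinct function objects with identical code and properties can also be checked directly. The sketch repeats a shortened createCounter so that it runs on its own; the variables sameObject, sameName, sameLength and sameCode are introduced only for this check.

```javascript
"use strict";

function createCounter(initial) {
  var counter = initial;
  function increment(value) { counter += value; }
  function get() { return counter; }
  return { increment: increment, get: get };
}

var myCounter1 = createCounter(100);
var myCounter2 = createCounter(200);

// Each call to createCounter creates new function objects...
var sameObject = myCounter1.increment === myCounter2.increment;

// ...that have the same name, length and source text.
var sameName = myCounter1.increment.name === myCounter2.increment.name;
var sameLength = myCounter1.increment.length === myCounter2.increment.length;
var sameCode = String(myCounter1.increment) === String(myCounter2.increment);

console.log(sameObject);                     // false
console.log(sameName, sameLength, sameCode); // true true true
```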

That is how it works. Closures are a powerful concept.

Chain of scope objects and this

Like it or not, this is not part of a chain of scope objects. The value of this depends on the function call pattern. That is, you can call the same function, but have different values for this inside.

Call patterns

This topic deserves a separate article, so here I will only go over it briefly. There are four invocation patterns:

Method invocation pattern

    "use strict";

    var myObj = {
      myProp: 100,
      myFunc: function myFunc() {
        return this.myProp;
      }
    };

    myObj.myFunc(); //-- returns 100

If the call contains a dot or [subscript], then the function is called as a method. In the example above, this refers to myObj.

Function invocation pattern (function call)

    "use strict";

    function myFunc() {
      return this;
    }

    myFunc(); //-- returns undefined

In this case, the value of this depends on whether the code runs in strict mode.

In strict mode, this is undefined.
In non-strict mode, this refers to the global object.

The example above runs in strict mode, so myFunc() returns undefined.
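The first two patterns can be compared on one and the same function. In this sketch the "use strict" directive is placed inside the function body, so the function is strict no matter how the surrounding code is loaded; detached and the via... variables are names made up for the illustration.

```javascript
var myObj = {
  myProp: 100,
  myFunc: function myFunc() {
    "use strict";
    return this;
  }
};

// Method invocation: a dot or a [subscript] in the call expression.
var viaDot = myObj.myFunc() === myObj;
var viaSubscript = myObj["myFunc"]() === myObj;

// Function invocation: the same function object, called without a receiver.
var detached = myObj.myFunc;
var viaPlainCall = detached();

console.log(viaDot, viaSubscript); // true true
console.log(viaPlainCall);         // undefined
```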

Constructor invocation pattern (constructor call)

    "use strict";

    function MyObj() {
      this.a = 'a';
      this.b = 'b';
    }

    var myObj = new MyObj();

When a function is called with the new prefix, JavaScript creates a new object that inherits from the object stored in the function's prototype property. This newly created object is then passed to the function as this.
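The two steps described above can be spelled out by hand. The sketch below is a simplification: it ignores the case where a constructor explicitly returns an object, and manual is a name introduced only for the comparison.

```javascript
"use strict";

function MyObj() {
  this.a = "a";
  this.b = "b";
}

// Constructor invocation.
var myObj = new MyObj();
var inheritsFromPrototype = Object.getPrototypeOf(myObj) === MyObj.prototype;

// Roughly the same thing done manually:
// 1) create an object that inherits from MyObj.prototype,
// 2) call the function with that object as this.
var manual = Object.create(MyObj.prototype);
MyObj.call(manual);

console.log(inheritsFromPrototype);      // true
console.log(manual.a, manual.b);         // a b
console.log(manual instanceof MyObj);    // true
```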

Apply invocation pattern

    "use strict";

    function myFunc(myArg) {
      return this.myProp + " " + myArg;
    }

    var result = myFunc.apply(
      { myProp: "prop" },
      [ "arg" ]
    ); //-- result is "prop arg"

With this pattern you can pass any value as this. The example uses Function.prototype.apply(). Other options:

Function.prototype.call()
Function.prototype.bind()
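For comparison, here are all three methods applied to the same function. This is a small sketch; target, bound and other are names made up for the illustration.

```javascript
"use strict";

function myFunc(myArg) {
  return this.myProp + " " + myArg;
}

var target = { myProp: "prop" };

// apply takes the arguments as an array, call takes them one by one.
var viaApply = myFunc.apply(target, ["arg"]);
var viaCall = myFunc.call(target, "arg");

// bind does not call the function; it returns a new function
// whose this is fixed to target.
var bound = myFunc.bind(target);
var viaBind = bound("arg");

// Even when the bound function is called as a method of another object,
// this stays fixed.
var other = { myProp: "other", f: bound };
var stillBound = other.f("arg");

console.log(viaApply, "|", viaCall, "|", viaBind, "|", stillBound);
// prop arg | prop arg | prop arg | prop arg
```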

The following examples mainly use the Method invocation pattern.

Using this in nested functions

    "use strict";

    var myObj = {
      myProp: "outer-value",

      createInnerObj: function createInnerObj() {
        var hidden = "value-in-closure";

        return {
          myProp: "inner-value",

          innerFunc: function innerFunc() {
            return "hidden: '" + hidden + "', myProp: '" + this.myProp + "'";
          }
        };
      }
    };

    var myInnerObj = myObj.createInnerObj();
    console.log( myInnerObj.innerFunc() );

Output: hidden: 'value-in-closure', myProp: 'inner-value'

While myObj.createInnerObj() is being called, we have the following structure:

And while myInnerObj.innerFunc() is being called:

You can see that this in myObj.createInnerObj() refers to myObj, while this in myInnerObj.innerFunc() refers to myInnerObj: both functions are called as methods. That is why this.myProp inside innerFunc() returns the inner value, not the outer one.

You can trick innerFunc() into using a different myProp like this:

    var myInnerObj = myObj.createInnerObj();

    var fakeObject = {
      myProp: "fake-inner-value",
      innerFunc: myInnerObj.innerFunc
    };

    console.log( fakeObject.innerFunc() );

Output: hidden: 'value-in-closure', myProp: 'fake-inner-value'

Or with apply() or call():

    var myInnerObj = myObj.createInnerObj();

    console.log(
      myInnerObj.innerFunc.call({
        myProp: "fake-inner-value-2"
      })
    );

Output: hidden: 'value-in-closure', myProp: 'fake-inner-value-2'

However, sometimes the inner function really needs access to the this of the outer function, regardless of how the inner function is called. To achieve that, save the desired value explicitly in the closure (that is, in the current scope object), for example var self = this;, and use self in the inner function instead of this.

    "use strict";

    var myObj = {
      myProp: "outer-value",

      createInnerObj: function createInnerObj() {
        var self = this;
        var hidden = "value-in-closure";

        return {
          myProp: "inner-value",

          innerFunc: function innerFunc() {
            return "hidden: '" + hidden + "', myProp: '" + self.myProp + "'";
          }
        };
      }
    };

    var myInnerObj = myObj.createInnerObj();
    console.log( myInnerObj.innerFunc() );

Output: hidden: 'value-in-closure', myProp: 'outer-value'

We get this:

Now it is clear that innerFunc() has access to the outer function's this value through self, which is stored in the closure.
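As a side note that is not part of the original article: in ES2015 and later, an arrow function has no this of its own and resolves this through the enclosing scope, much like an ordinary variable. Assuming such an environment, the same result can be sketched without self:

```javascript
"use strict";

var myObj = {
  myProp: "outer-value",

  createInnerObj: function createInnerObj() {
    var hidden = "value-in-closure";

    return {
      myProp: "inner-value",

      // Arrow function: this here is the this of createInnerObj,
      // no matter how innerFunc is called later.
      innerFunc: () =>
        "hidden: '" + hidden + "', myProp: '" + this.myProp + "'"
    };
  }
};

var myInnerObj = myObj.createInnerObj();
var output = myInnerObj.innerFunc();

console.log(output);
// hidden: 'value-in-closure', myProp: 'outer-value'
```

Whether to prefer this form over var self = this is mostly a matter of the environments you need to support.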

Conclusion

Now we can answer those questions from the first paragraph.

What is a closure? It is an object that refers to both a function object and a scope object. In fact, all functions in JavaScript are closures: it is impossible to have a reference to a function object without a scope object.

When is a closure created? Since all functions in JavaScript are closures, the answer is obvious: a closure is created when a function is defined. But it is important to distinguish between creating a closure and creating a new scope object: a closure (a function plus a reference to the current scope chain) is created when the function is defined, while a new scope object is created (and added to the closure's scope chain) each time the function is called.

When is a closure destroyed? Like any other object in JavaScript, a closure is garbage collected when there are no more references to it.

Source: https://habr.com/ru/post/266443/
