Maniac Minimization (chasing the byte)

Hello World,

This post is about how to prepare code by hand so that it minimizes better. Before a recent release I minimized the Helios Kernel library (which I wrote about the day before yesterday). The library source weighs 28112 bytes, much of it generous comments, so the YUI Compressor easily squeezes it down to 7083 bytes. Not that 7 kilobytes seemed too fat to me, but a quick look through the minimized code with my own eyes was enough to spot a bunch of places where more could be saved.

Let's see what can be done with the code to turn 7083 bytes into ~~4009~~ 3937.

But before we begin, two caveats:

We will not use dirty tricks (like var a = this or var f = false ) that could, in theory, slow the code down. Speed is still assumed to matter more than file size.
I ran the code through a set of tests after every step. A change often breaks everything, so if you do not test during manual optimization (or have no tests at all), the code you end up with will most likely not work.

Minimizer selection

This article is not a comparison of minimizers, but along the way I noticed that the YUI Compressor has a ~~bug~~ feature: it does not remove the curly braces from blocks consisting of a single statement. Worse, it adds braces even where the original had none (marked with a WTF tag in the first picture). I took that as rudeness and promptly switched to the online minimizer http://jscompress.com/ . The rest of the reasoning, however, applies to any minimizer of your choice.

Big Anonymous Function

To begin with, let's wrap all the code in one large anonymous function that is called immediately (if that has not been done already). We can then use the local scope of this function; how that saves bytes is shown below. The most compact way to wrap code in an anonymous function is this:

Before	After
`/* ... */`	`!function(){ /* ... */ }()`
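
A minimal sketch of what the wrapper buys us. The names `api`, `nextId` and `internalCounter` are made up for illustration, and `api` merely stands in for the global object so that the snippet runs anywhere:

```javascript
// Hypothetical stand-in for the global object.
var api = {};

!function () {
    // Local to the wrapper: a minimizer is free to rename it.
    var internalCounter = 0;

    // Public: attached to `api`, so this name has to survive as is.
    api.nextId = function () {
        internalCounter += 1;
        return internalCounter;
    };
}();

var firstId = api.nextId();
var secondId = api.nextId();

// The local variable is not visible outside the wrapper.
var counterIsHidden = typeof internalCounter === "undefined";
```

Everything declared with var inside the wrapper becomes fair game for renaming, which is what the following sections rely on.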

"Private" objects

Surely the code has many auxiliary objects that are not part of the public API. Since JavaScript has no native way to mark an object as private, some convention is usually used; most often such names start with an underscore: " _ ". A minimizer usually replaces the names of local variables with single-letter ones but leaves the names of "private" objects unchanged, because it makes no bold assumptions about how we mark "private" objects. We, however, do not care what these objects are called in the minimized code, so we can rename them by hand:

Before	After
`myObject._somethingPrivate = { /* ... */ }`	`myObject.a = { /* ... */ }`
`MyObj = function() { this.somePublicProperty = ...; this._somePrivateProperty = ...; this._anotherPrivateProperty = ...; }`	`MyObj = function() { this.somePublicProperty = ...; this.a = ...; this.b = ...; }`
`MyObj.prototype._privateMethod = function() { /* ... */ }`	`MyObj.prototype.c = function() { /* ... */ }`

Here you need to be careful. First, remember to replace the names of private functions and variables not only in their declarations but also everywhere they are used. Second, keep track of the logic of the code and avoid name collisions. For example, if a type already has a function a declared in its prototype, you cannot give a private property of the same object that name. This is obvious, but easy to miss if you do not pay special attention to it.
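
Both warnings can be sketched in a few lines. The `Counter` and `Broken` types are invented for this illustration; the comments show which names they would have had before the manual renaming:

```javascript
// Hypothetical type after renaming: _count became a, _bump became b.
function Counter() {
    this.a = 0;                            // was this._count
}
Counter.prototype.increment = function () { // public name, left alone
    this.b();                              // the call site is renamed too
    return this.a;
};
Counter.prototype.b = function () {        // was _bump
    this.a += 1;
};

// The collision: a private property reuses a name from the prototype.
function Broken() {
    this.b = 0;                            // own property called "b" ...
}
Broken.prototype.b = function () {};       // ... hides this method

var counter = new Counter();
var afterOneIncrement = counter.increment();
var shadowedType = typeof new Broken().b;  // the number wins over the method
```

The own property always shadows the prototype member of the same name, so a collision does not raise an error at the point of renaming; it only shows up later, when the method is called.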

In addition, private members are often declared outside constructors and initializers, since JavaScript lets you add properties on the fly. In principle, every private identifier in the code can be carefully replaced with a single-letter one:

Before	After
`MyObj.prototype.getSomething = function() { if ( typeof this._prop == "undefined" ) { this._prop = 0; } return this._prop; }`	`MyObj.prototype.getSomething = function() { if ( typeof this.x == "undefined" ) { this.x = 0; } return this.x; }`

"Public" objects

“Public” objects are those that are part of the API, and we need them to be called exactly as they were originally called. But if a “public” object is used too often inside the code (well, let's say, at least once), and its name is too long (well, let's say, more than two bytes), then it makes sense to give it an alias:

Before	After
`myObject = { ... }`	`var a = myObject = { ... }`

In this example, after the change the variable a is declared as local and the variable myObject as global (assuming that the identifier myObject is used here for the first time).
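
Here is a runnable sketch of the same aliasing. One assumption differs from the table: instead of creating myObject as an implicit global (an assignment to an undeclared name throws a ReferenceError in strict mode), the public name is a property of a made-up `root` object that stands in for the global object. The `items` and `add` members are invented too:

```javascript
// Hypothetical stand-in for the global object.
var root = {};

!function () {
    // One statement creates the public name and a short local alias.
    var a = root.myObject = { items: [] };

    // Inside the wrapper all further uses go through the alias.
    a.add = function (item) {
        a.items.push(item);
        return a.items.length;
    };
}();

var sizeAfterAdd = root.myObject.add("x");
var aliasLeaked = typeof a !== "undefined";  // the alias stays local
```

The alias costs `var a=` once and then saves the difference between the long name and one byte on every later use.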

Now you can go through the code, find all the objects that are not only declared but also used, and give each of them an alias:

Before	After
`MyObj = function() { this.somePublicProperty = ...; this.a = ...; this.b = ...; }`	`var b = MyObj = function() { this.somePublicProperty = ...; this.a = ...; this.b = ...; }`
`MyObj.prototype.someMethod = function() { /* ... */ }`	`b.prototype.d = b.prototype.someMethod = function() { /* ... */ }`
`someStorage.someMethod = function() { /* ... */ }`	`var c = someStorage.someMethod = function() { /* ... */ }`

And again, the main thing is not to get confused about scopes and not to give two variables in the same scope the same name. In the examples above, objects of type MyObj already have a private property b and a private method c, while the new local variables b and c live in the scope of the Big Anonymous Function that we wrapped all the code in at the very beginning, so there is no conflict (we did wrap it, didn't we?).

In addition, we can make aliases to some public properties, but only to those that contain complex objects:

Before	After
`AnotherObj = function() { this.someProperty = [ 0, 0, 0 ]; this.secondProperty = { a: 1 }; this.thirdProperty = 0; this.fourthProperty = true; this.fifthProperty = "hello"; }`	`AnotherObj = function() { this.a = this.someProperty = [ 0, 0, 0 ]; this.b = this.secondProperty = { a: 1 }; this.thirdProperty = 0; this.fourthProperty = true; this.fifthProperty = "hello"; }`

If we alias a simple (primitive) value, the value itself is copied, so the alias and the original property are independent and go out of sync as soon as either one is changed.
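
The difference is easy to check. In this sketch the alias `c` for the numeric property is added on purpose, to show what goes wrong; it is not part of the table above:

```javascript
function AnotherObj() {
    this.a = this.someProperty = [0, 0, 0];  // two names, one array
    this.c = this.thirdProperty = 0;         // two independent copies of 0
}

var obj = new AnotherObj();

obj.a[0] = 7;                                // change the array via the alias
var seenViaPublicName = obj.someProperty[0]; // the public name sees it

obj.c = 5;                                   // change the number via the alias
var publicNumber = obj.thirdProperty;        // the public name does not

obj.a = [];                                  // reassigning an alias breaks the link
var stillSameArray = obj.a === obj.someProperty;
```

As the last two lines suggest, even an object alias is only safe while the code mutates the object and never assigns a new one to either name.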

Pulling out var's

Now we exploit the fact that several variables can be declared in one statement, separated by commas, using the word var only once. In the simplest case, it looks like this:

Before	After
`someFunction = function() { var a = 0; var b = something(); /* ... */ }`	`someFunction = function() { var a = 0, b = something(); /* ... */ }`
`anotherFunction = function() { var c; /* ... */ var d = something(); /* ... */ for ( var i = 0; i < ... /* ... */ }`	`anotherFunction = function() { var c, d = something(), i = 0; /* ... */ for ( ; i < ... /* ... */ }`

In general, pull all declarations up to the beginning of the function and write them under a single var . I will cover optimizing the for () loop below. The same must be done with all local declarations inside our Large Hadron Function: gather them under one var at the beginning. These are the aliases we created in the previous section. The whole code should transform like this:

Before	After
`!function(){ /* ... */ var b = MyObj = function() { this.somePublicProperty = ...; this.a = ...; this.b = ...; }; /* ... */ var c = b.prototype.someMethod = function() { /* ... */ }; /* ... */ }()`	`!function(){ var b = MyObj = function() { this.somePublicProperty = ...; this.a = ...; this.b = ...; }, c = b.prototype.someMethod = function() { /* ... */ }, /* ... */ }()`

Note that in this example, the variables b , c and the like remain declared as local to the Big Function. Thus, we will save as many vars as there were in the function (well, except for one).

You also need to make sure that the logic of the code does not change. We are reordering lines, after all, so it can happen that some object ends up being used before it is initialized. That must not be allowed.
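
A small sketch of both the safe case and the dangerous one. All four functions are invented for illustration:

```javascript
function averageBefore(list) {
    var total = 0;
    for (var i = 0; i < list.length; i++) {
        total += list[i];
    }
    var count = list.length;
    return total / count;
}

// Same logic with every declaration pulled under one var at the top.
function averageAfter(list) {
    var total = 0, i = 0, count = list.length;
    for (; i < list.length; i++) {
        total += list[i];
    }
    return total / count;
}

// The trap: an initializer that depends on a statement it used to follow.
function drainBefore(queue) {
    var first = queue.shift();
    var remaining = queue.length;   // read after shift()
    return [first, remaining];
}

function drainReordered(queue) {
    var remaining = queue.length, first = queue.shift();  // order swapped
    return [first, remaining];
}
```

Here averageAfter is safe because list.length does not change inside the loop, while drainReordered reads the length one step too early and reports one element too many. Only a test run will tell the two situations apart reliably.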

Prototypes

For each declared type and its constructor, a lot can be saved on the word prototype , which is too long. To do this, we describe the entire prototype for future objects of this type as a single hash:

Before	After
`MyObj = function() { /* ... */ }; MyObj.prototype.someMethod = function() { /* ... */ }; MyObj.prototype.anotherMethod = function() { /* ... */ }; MyObj.prototype.thirdMethod = function() { /* ... */ }`	`MyObj = function() { /* ... */ }; MyObj.prototype = { someMethod : function() { /* ... */ }, anotherMethod : function() { /* ... */ }, thirdMethod : function() { /* ... */ } }`

As you can see, you need to remember to replace each " = " with " : " and to separate the method declarations with commas. This approach does not work when you need to add methods to the prototype of a constructor declared somewhere else, because this notation replaces the prototype entirely.
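
What "replaces the prototype entirely" means in practice can be seen in this sketch. The `Point` type and its methods are invented for the example:

```javascript
function Point(x, y) {
    this.x = x;
    this.y = y;
}

// A method added the long way, before the prototype is replaced.
Point.prototype.early = function () { return "early"; };

// The whole prototype as one hash: the word is written only once.
Point.prototype = {
    sum: function () { return this.x + this.y; },
    scale: function (k) { return new Point(this.x * k, this.y * k); }
};

var p = new Point(1, 2);
var total = p.sum();
var scaledTotal = p.scale(2).sum();

// Side effects of the replacement:
var earlyMethodType = typeof p.early;               // the earlier method is gone
var constructorIsPoint = p.constructor === Point;   // now inherited from Object
```

Besides dropping anything added earlier, the new prototype no longer carries a constructor property of its own. If some code relies on it, a `constructor : Point` entry in the hash restores it at the cost of a few bytes.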

Loop and condition optimization

Almost all loops and many conditions can be optimized:

Before	After
`a--; if ( a == 0 ) { /* ... */ }`	`if ( --a == 0 ) { /* ... */ }`
`if ( --a == 0 ) { /* ... */ }`	`if ( !--a ) { /* ... */ }`
`for ( var i = 0; i < a; i++ ) { b( c[ i ] ); }`	`for ( var i = 0; i < a; ) { b( c[ i++ ] ); }`

But here, too, you need to be careful not to violate the logic of the code.
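
To make that warning concrete, here is a sketch that wraps the rewrites in functions (all names are made up) so that the two versions can be compared, including one input on which they disagree:

```javascript
function countdownBefore(a) {
    a--;
    if (a == 0) { return "done"; }
    return "wait";
}

function countdownAfter(a) {
    if (!--a) { return "done"; }
    return "wait";
}

function collectBefore(c) {
    var out = [];
    for (var i = 0; i < c.length; i++) { out.push(c[i]); }
    return out;
}

function collectAfter(c) {
    var out = [];
    for (var i = 0; i < c.length;) { out.push(c[i++]); }
    return out;
}

// For ordinary numbers the two countdown versions agree.
var sameForNumbers = countdownBefore(1) === countdownAfter(1) &&
                     countdownBefore(2) === countdownAfter(2);

// With a missing argument the counter becomes NaN:
// NaN == 0 is false, but !NaN is true.
var beforeOnMissing = countdownBefore(undefined);
var afterOnMissing = countdownAfter(undefined);
```

The loop rewrite has a trap of its own: once the increment lives in the body, a continue placed before it would skip the increment and the loop would never end.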

Commonly used values

It happens that there are values that are used more than once. They can also be put into variables:

Before	After
`/* ... */ if ( typeof a == "undefined" ) ... /* ... */ if ( typeof b == "undefined" ) ... /* ... */`	`var z = "undefined"; /* ... */ if ( typeof a == z ) ... /* ... */ if ( typeof b == z ) ... /* ... */`
`if ( typeof a != "function" ) { a = function(){} } /* ... */ if ( typeof b != "function" ) { b = function(){} }`	`var f = "function", g = function(){}; /* ... */ if ( typeof a != f ) { a = g; } /* ... */ if ( typeof b != f ) { b = g; }`
`el = document.createElement( "script" ); el.type = "text/javascript";`	`var x = "script"; el = document.createElement( x ); el.type = "text/java" + x;`
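
The same idea as a runnable sketch without the DOM. The helper names `describe` and `withDefaultCallback` are invented for illustration:

```javascript
// Shared values: each literal is written once, then reused by name.
var U = "undefined", F = "function", noop = function () {};

function describe(value) {
    if (typeof value == U) { return "missing"; }
    if (typeof value != F) { return "data"; }
    return "callable";
}

function withDefaultCallback(cb) {
    if (typeof cb != F) { cb = noop; }   // one shared empty function
    return cb;
}

// A long string assembled from a shorter one that is needed anyway.
var scriptWord = "script";
var mimeType = "text/java" + scriptWord;

// Every default callback is now the very same function object.
var defaultsAreShared = withDefaultCallback(null) === withDefaultCallback(5);
```

Sharing one empty function is harmless as long as nobody attaches properties to it or compares callbacks by identity; if the code does either, the behavior changes.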

Throw away all unnecessary

It often happens that the code contains extra information “for clarity”, which can be eliminated. But here, as elsewhere, you need to carefully monitor what we delete:

Before	After
`if ( a.length > 0 ) { b = a.pop() }`	`if ( a.length ) { b = a.pop() }`
`var someEnum = { foo : 0, bar : 1, buz : 2 }; /* ... */ var a = []; for ( var i in someEnum ) { a[ someEnum[ i ] ] = 0; } /* ... */ a[ someEnum.bar ] = getSomething(); /* ... */ if ( c.state == someEnum.foo ) { /* ... */ }`	`var someEnum = { foo : 0, bar : 1, buz : 2 }; /* ... */ var a = [ 0, 0, 0 ]; /* ... */ a[ 1 ] = getSomething(); /* ... */ if ( !c.state ) { /* ... */ }`
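
A sketch of why the deletions need watching. The wrapper functions are made up; the enum is the one from the table:

```javascript
function popIfAny(a) {
    var b;
    if (a.length) { b = a.pop(); }   // for arrays, same as a.length > 0
    return b;
}

var someEnum = { foo: 0, bar: 1, buz: 2 };

function isFooLong(c)  { return c.state == someEnum.foo; }
function isFooShort(c) { return !c.state; }

// With a proper state both checks agree.
var agreeOnFoo = isFooLong({ state: 0 }) === isFooShort({ state: 0 });
var agreeOnBar = isFooLong({ state: 1 }) === isFooShort({ state: 1 });

// With a missing state they do not: undefined == 0 is false, !undefined is true.
var longOnMissing = isFooLong({});
var shortOnMissing = isFooShort({});
```

So the short form is only equivalent if state is guaranteed to hold one of the enum values by the time the check runs.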

Bonus: removing var's

This is an interesting trick that is useful when a function declares a single local variable (or declares variables without initializing them). We save one var , at the cost of writing the variable name twice:

Before	After
`doSomething = function( param1, param2 ) { var i = 0; /* ... */ }`	`doSomething = function( param1, param2, i ) { i = 0; /* ... */ }`
`doSomething = function( param1, param2 ) { var a, b, c; /* ... */ }`	`doSomething = function( param1, param2, a, b, c ) { /* ... */ }`

Here we use parameters instead of local variables; they behave the same way. The trick is not suitable when the function takes an unknown number of parameters. In most cases it lets you get rid of almost all var 's in the code.
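
"Behave the same way" holds as long as callers pass only the documented arguments. A sketch with invented function names shows where the two forms can diverge:

```javascript
function firstTruthyBefore(list) {
    var found, i;                    // both always start out undefined
    for (i = 0; i < list.length; i++) {
        if (!found && list[i]) { found = list[i]; }
    }
    return found;
}

// Locals moved into the parameter list: no var at all.
function firstTruthyAfter(list, found, i) {
    for (i = 0; i < list.length; i++) {
        if (!found && list[i]) { found = list[i]; }
    }
    return found;
}

var normalCallBefore = firstTruthyBefore([0, 3]);
var normalCallAfter = firstTruthyAfter([0, 3]);

// An extra argument now seeds the uninitialized "local".
var seededCall = firstTruthyAfter([0, 3], "oops");

// The declared arity changes as well.
var arityBefore = firstTruthyBefore.length;
var arityAfter = firstTruthyAfter.length;
```

Extra arguments turn up more often than one might expect, for example when a function is passed as a callback to something that supplies an index or the whole collection. Assigning an initial value inside the function, as in the first row of the table, closes this hole.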

What happened in the end

After processing the code in the ways described, I fed the script to the jscompress.com service. After a moment's thought it handed me back this 4009-byte mush. Bon appétit!

By the way, I will hand out karma upvotes to anyone who finds and describes in the comments what else can be trimmed from this mess :-)

Update

nano_freelancer suggested some good ideas:

Replace all true and false with 1 and 0 respectively.

In a loop of the form
```
 for (initial;condition;loop statement) {statements} 
```
you can put a comma after the loop statement and append all the statements from the body, separated by commas instead of semicolons. This saves 2 bytes (the braces), but it only works when the body contains no compound statements.

In addition, most nulls can also be replaced by 0 (but not all).
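
A sketch of these suggestions, with made-up function and variable names. Note that statements moved after the loop statement run after the increment, so the index has to be adjusted:

```javascript
function sumWithBraces(list) {
    var total = 0, seen = 0;
    for (var i = 0; i < list.length; i++) { total += list[i]; seen++; }
    return [total, seen];
}

// Body folded into the loop statement, separated by commas.
function sumWithCommas(list) {
    var total = 0, seen = 0;
    for (var i = 0; i < list.length; i++, total += list[i - 1], seen++);
    return [total, seen];
}

// 1 and 0 in place of true and false: fine in conditions ...
var enabled = 1;                               // was true
var worksInCondition = enabled ? "on" : "off";
// ... but not under a strict comparison.
var strictCheckStillTrue = enabled === true;

// 0 in place of null: any check against null stops matching.
var emptySlot = 0;                             // was null
var nullCheckStillTrue = emptySlot == null;
```

This is presumably why "but not all" applies to null, and the strict comparison shows that the same caution is needed for true and false.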

Code size reduced to 3937 bytes :-)

Offtopic: the source and the minimized code I worked with are available for download on the project's home page: http://home.gna.org/helios/kernel/

Source: https://habr.com/ru/post/127672/
