Why tree shaking does not work and how to live with it
In our previous article about voice bots for Rocketbank, Habr users were outraged that in 2017 the JavaScript examples for the Voximplant cloud were written in ES5. We run a heavily modified SpiderMonkey in the cloud, specially trained not to leak or crash. Thousands of simultaneous calls, each with its own concurrently executing JavaScript, hint that Node.js is not an option for us. However, nothing stops you from using transpilers: compile ES2017 / TypeScript / Elm / whatever into plain old JavaScript and upload the compilation results using Continuous Integration. In this situation it is tempting to use all the latest goodies from npmjs and collect all the code into a single ES5 bundle. And this is where an ambush awaits: even one method from lodash produces a half-megabyte bundle. And it doesn't look like the tree shaking that has been advertised for the last couple of years actually works.
Who is shaking the trees?
Huge JavaScript bundles are not a good thing. In the browser world they increase page load time: first the bundle has to be downloaded, then parsed, then executed. The world of backends and scripts in the cloud has its own nuances. To handle thousands of calls per second controlled via JavaScript, our SpiderMonkey limits the memory of a JavaScript session to 16 megabytes. That covers everything: source code, AST, data structures. The architecture of our platform assumes that the code running in the cloud is the part that must work in real time, while all the "heavy" things can be moved to your backend and reached via HTTP requests right during the call. The trouble is that a couple of methods from lodash look like a great idea for lightweight code in the cloud. And hop: plus half a megabyte in the resulting JavaScript.
The JavaScript developer community has been aware of this problem for a long time and, besides “let's strip all the whitespace and rename whatever is safe to rename to single-letter names” (uglify, up to dead code elimination), is actively developing tree shaking. Ideally, tree shaking should remove all unused code: imports, method calls, global variables. In our case the bundle should contain a few lodash functions, our own code, and nothing else. Instead, we get the whole of lodash. WTF?
Webpack, Rollup and Uglify
Tree shaking support is considered a strength of Rollup, was announced in recent webpack versions, and has long been present in UglifyJS as “dead code elimination”. Two years ago the author of Rollup explained the difference between the two very well: dead code elimination takes a compiled bundle as input and tries to throw unused code out of it, whereas tree shaking works with the AST during the build and tries to include only the code that is actually used. Webpack, by the way, is designed for a combined approach: first tree shaking while building the bundle, then dead code elimination with the UglifyJS plugin.
Except that in the real world, tree shaking does not work.
By the authors' own admission, determining which code is used in a weakly typed language is not a trivial task. And so as not to break anything, in unclear situations the code is always included. Unfortunately, the most popular general-purpose libraries are exactly such “unclear” situations: lodash, underscore, and the rest of them.
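The difference in direction between the two approaches can be sketched on a toy dependency graph. This is only an illustration under simplifying assumptions: declarations are plain names, references are known exactly, and every name used here (`main`, `formatCall`, `padLeft` and so on) is hypothetical. Real bundlers work on an AST and are far more conservative.

```javascript
// Toy model: each declaration lists the declarations it references.
// All names are hypothetical.
const declarations = {
  main: ['formatCall'],
  formatCall: ['padLeft'],
  padLeft: [],
  parseDate: ['padLeft'],
  deepClone: [],
};

// Tree shaking: start with nothing and add only what the entry point reaches.
function treeShake(decls, entry) {
  const included = new Set();
  const queue = [entry];
  while (queue.length > 0) {
    const name = queue.pop();
    if (included.has(name)) continue;
    included.add(name);
    queue.push(...decls[name]);
  }
  return [...included].sort();
}

// Dead code elimination: start with everything and repeatedly drop
// declarations that nothing remaining refers to.
function eliminateDeadCode(decls, entry) {
  const kept = new Set(Object.keys(decls));
  let changed = true;
  while (changed) {
    changed = false;
    for (const name of [...kept]) {
      if (name === entry) continue;
      const referenced = [...kept].some(
        (other) => other !== name && decls[other].includes(name)
      );
      if (!referenced) {
        kept.delete(name);
        changed = true;
      }
    }
  }
  return [...kept].sort();
}

console.log(treeShake(declarations, 'main'));
console.log(eliminateDeadCode(declarations, 'main'));
```

On this graph both functions keep the same three declarations. In this toy model they diverge once two unused declarations reference each other: the naive elimination pass keeps both, while the inclusion pass never reaches them.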
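A minimal sketch of one such situation follows. The mini-library here is hypothetical and is not lodash's actual source; it only imitates the common pattern of attaching methods to one object at run time and then looking them up by a computed name.

```javascript
// Hypothetical mini-library assembled at run time, in the style of many
// general-purpose utility libraries. This is not lodash's actual source.
function createUtils() {
  const implementations = {
    compact: (list) => list.filter(Boolean),
    last: (list) => list[list.length - 1],
    sum: (list) => list.reduce((total, value) => total + value, 0),
  };
  const utils = {};
  // Methods are attached in a loop, so no line of source says
  // "utils.sum is defined here" in a form a static analyzer can follow.
  for (const name of Object.keys(implementations)) {
    utils[name] = implementations[name];
  }
  return utils;
}

const utils = createUtils();

// The method name is only known at run time, for example from a config
// value. Any of the three methods may be the one that gets called.
function callByName(name, list) {
  return utils[name](list);
}

console.log(callByName('sum', [1, 2, 3])); // 6
console.log(callByName('last', [1, 2, 3])); // 3
```

Without running the program, a tool cannot prove that `compact` is never called, so the safe choice is to keep all three methods.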
What to do?
You can, of course, wait a couple more years. Type inference keeps getting better, and work is underway to support tree shaking for typed dialects like TypeScript. But I want to write ES2017 code with libraries now. Without multi-megabyte bundles.
The community knows about this problem, so a temporary workaround is in active use: big monsters like lodash are split into a huge pile of small modules that can be imported separately. And with those, tree shaking no longer fails.
Of course, this is a head-on demonstration with empty webpack / rollup configs. They can be tuned to give more impressive numbers, but the basic idea is that you shouldn't be upset by the thousands of dependencies that yarn installs. With a bit of hand-filing, the stack lets you throw out most of the unused code and get a perfectly readable bundle for loading into Voximplant or any other platform that is programmed in JavaScript.
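Why splitting helps can be sketched with a toy bundler that gives up on fine-grained analysis and works at module granularity: if anything from a module is imported, the whole module goes into the bundle. The module names, sources, and the `bundle` helper below are assumptions made for illustration, not any real tool's behavior. With lodash, the split layout corresponds to importing a single module such as `lodash/debounce`, or using the `lodash-es` build, rather than importing from the `lodash` entry point.

```javascript
// Toy bundler working at module granularity: every module reachable from
// the entry point is included whole. All names and sources are hypothetical.
function bundle(modules, entry) {
  const included = new Set();
  const visit = (name) => {
    if (included.has(name)) return;
    included.add(name);
    modules[name].imports.forEach(visit);
  };
  visit(entry);
  const names = [...included].sort();
  const size = names.reduce(
    (total, name) => total + modules[name].source.length,
    0
  );
  return { names, size };
}

const debounceSrc = 'export function debounce(fn, ms) { /* ... */ }';
const cloneSrc = 'export function cloneDeep(value) { /* ... */ }';
const chunkSrc = 'export function chunk(list, size) { /* ... */ }';
const appSrc = 'debounce(onCallEnd, 100);';

// Layout 1: one big module that holds every utility.
const monolith = {
  app: { imports: ['utils'], source: appSrc },
  utils: { imports: [], source: debounceSrc + cloneSrc + chunkSrc },
};

// Layout 2: one module per utility; the app imports only what it calls.
const split = {
  app: { imports: ['utils/debounce'], source: appSrc },
  'utils/debounce': { imports: [], source: debounceSrc },
  'utils/cloneDeep': { imports: [], source: cloneSrc },
  'utils/chunk': { imports: [], source: chunkSrc },
};

console.log(bundle(monolith, 'app').names); // [ 'app', 'utils' ]
console.log(bundle(split, 'app').names); // [ 'app', 'utils/debounce' ]
console.log(bundle(split, 'app').size < bundle(monolith, 'app').size); // true
```

Even this worst-case bundler produces a small result for the split layout, because unused utilities are never reachable from the entry point in the first place.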