There are many webpack packages that find duplicates in a bundle; the most popular is duplicate-package-checker-webpack-plugin, but it requires rebuilding the project, and since my task was to automate choosing the optimal versions of packages, I ended up with an alternative solution.
Or, put differently, this is my story of how I reduced a bundle by 15% in a few seconds.
As in many large companies with a huge code base and a lot of shared logic, we use common components published in our own npm registry. They are published via lerna, so before each installation or update of a common component the question arises of which version to install. lerna republishes every component that depends on the published one (if its version was previously the latest). Accordingly, there are always versions of several components that fit each other better, because their dependencies do not conflict.
Among open source projects, nivo is published this way; here is their lerna config.
So how do duplicate dependencies appear, and how do we eliminate them?
Suppose you have a simple project with the following package.json:

{
  "name": "demo-project",
  "version": "1.0.0",
  "dependencies": {
    "@nivo/bar": "0.54.0",
    "@nivo/core": "0.53.0",
    "@nivo/pie": "0.54.0",
    "@nivo/stream": "0.54.0"
  }
}
Let's see where @nivo/core is used:

npm list @nivo/core
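On a fresh install the output looks roughly like this (the exact rendering depends on your npm version):

demo-project@1.0.0
├─┬ @nivo/bar@0.54.0
│ └── @nivo/core@0.54.0
├── @nivo/core@0.53.0
├─┬ @nivo/pie@0.54.0
│ └── @nivo/core@0.54.0
└─┬ @nivo/stream@0.54.0
  └── @nivo/core@0.54.0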
We see 4 copies of @nivo/core (3 copies of 0.54.0 and 1 of 0.53.0). But if we bump the minor version of @nivo/core in package.json to 0.54.0, the duplicates are eliminated.
This example is simple, but in practice each package has more dependencies, and their subdependencies need to be considered recursively, which increases the complexity of the task.
Seeing the huge size of the bundle once again, I grew tired of removing duplicate packages manually.
In general, the right thing to do is to update all packages to the latest versions right away, but, as always, there is no time to deal with major version changes, and picking a suitable version blindly is long and tedious. You need to update the dependency version in package.json, install the new dependencies, and then check whether the duplicates have disappeared from the build; if not, repeat. It takes a long time, on average 3-4 minutes per iteration.
All this is monotonous and requires care, so I decided to automate it.
I wanted to detect duplicates without reinstalling dependencies or rebuilding the project: ideally, a CLI application that outputs optimization options and all existing duplicates in the project within 10 seconds.
Eliminating duplicates can be divided into several subtasks; let's consider them in order.
The first task: simulate the future dependency tree of the bundle from package.json alone, taking standard dedupe into account, and do it quickly, in no more than 100 ms.
I decided to use package-json to get information about packages and semver to compare versions.
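Together, the two are enough to resolve a semver range to a concrete version the way npm would, using only registry metadata. A minimal sketch (resolveVersion is my own illustrative helper, not part of either library):

import packageJson from 'package-json';
import semver from 'semver';

// Resolve a semver range to the highest published version that
// satisfies it, without installing anything.
async function resolveVersion(name: string, range: string): Promise<string> {
  const meta = await packageJson(name, { allVersions: true });
  const versions = Object.keys(meta.versions);
  const resolved = semver.maxSatisfying(versions, range);
  if (!resolved) {
    throw new Error(`No published version of ${name} satisfies ${range}`);
  }
  return resolved;
}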
The result was the npm package dependencies-tree-builder, which quickly models the bundle's dependency tree from package.json alone. I extracted it into a separate package, since someone may want to reuse it in other combinatorial tasks around package.json.
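To give a rough idea of what such a simulation involves, here is a hypothetical sketch (not dependencies-tree-builder's actual code; circular dependencies are ignored for brevity): resolve every range to a concrete version from cached registry metadata and expand dependencies recursively.

import packageJson from 'package-json';
import semver from 'semver';

interface TreeNode {
  name: string;
  version: string;
  children: TreeNode[];
}

// Registry metadata is cached so that repeated simulations of
// alternative trees do not hit the network again.
const metaCache = new Map<string, Promise<any>>();

function getMeta(name: string): Promise<any> {
  if (!metaCache.has(name)) {
    metaCache.set(name, packageJson(name, { allVersions: true }));
  }
  return metaCache.get(name)!;
}

async function buildTree(name: string, range: string): Promise<TreeNode> {
  const meta = await getMeta(name);
  const version = semver.maxSatisfying(Object.keys(meta.versions), range);
  if (!version) throw new Error(`No version of ${name} satisfies ${range}`);
  const deps: Record<string, string> = meta.versions[version].dependencies ?? {};
  const children = await Promise.all(
    Object.entries(deps).map(([depName, depRange]) => buildTree(depName, depRange))
  );
  return { name, version, children };
}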
The second task: a combinatorial problem, an efficient search through dependency changes, comparing several tree variants, and, of course, choosing the optimal one.
I needed a way to compare the quality of the resulting trees, so I borrowed the idea of entropy as a quantitative measure of disorder and took the total number of duplicate copies as the metric (in the example above it equals 3).
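As a sketch of that metric (my own approximation of the idea, not ostap's exact code): for every package, count the occurrences of every version in the simulated tree; the version hoisted to the root of node_modules collapses into one copy, while every occurrence of any other version stays as a nested duplicate. For the example above this yields 3.

interface TreeNode {
  name: string;
  version: string;
  children: TreeNode[];
}

// `root` is the project itself; its children are the resolved
// direct dependencies from package.json.
function countDuplicateCopies(root: TreeNode): number {
  const occurrences = new Map<string, Map<string, number>>();
  const hoisted = new Map<string, string>();
  for (const child of root.children) hoisted.set(child.name, child.version);

  const walk = (node: TreeNode): void => {
    const perVersion = occurrences.get(node.name) ?? new Map<string, number>();
    perVersion.set(node.version, (perVersion.get(node.version) ?? 0) + 1);
    occurrences.set(node.name, perVersion);
    node.children.forEach(walk);
  };
  root.children.forEach(walk);

  let duplicates = 0;
  for (const [name, perVersion] of occurrences) {
    // Direct dependencies win the root slot; otherwise hoist the
    // most frequent version as an approximation of npm's behaviour.
    const winner =
      hoisted.get(name) ??
      [...perVersion.entries()].sort((a, b) => b[1] - a[1])[0][0];
    for (const [version, count] of perVersion) {
      if (version !== winner) duplicates += count;
    }
  }
  return duplicates;
}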
It would be great to also take package weight (in KB) into account, but unfortunately I did not find a package that measures weight quickly; the existing ones, for example package-size, take about half a minute per package, because they work as follows: create a project with a single dependency, install the dependencies, then measure the total size of the folder. So I did not invent any criterion other than the number of duplicate copies.
To understand which package to change, the tool looks at the causes of duplicates, distinguishing each duplicate's source from its effect. The brute-force search eliminates as many duplicate effects as possible, and once the effects are gone, the duplicates themselves disappear as well.
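A hypothetical sketch of that search, built on the helpers above (suggestVersions is illustrative, not ostap's API, and it tries one change at a time for brevity, whereas the real search is combinatorial): for each direct dependency, try newer versions within the same major, re-simulate the tree, and keep the variant with the fewest duplicate copies.

async function scoreVariant(deps: Record<string, string>): Promise<number> {
  // buildTree and countDuplicateCopies come from the sketches above.
  const children = await Promise.all(
    Object.entries(deps).map(([name, range]) => buildTree(name, range))
  );
  return countDuplicateCopies({ name: 'root', version: '0.0.0', children });
}

async function suggestVersions(
  deps: Record<string, string>
): Promise<Record<string, string>> {
  let best = { deps, score: await scoreVariant(deps) };
  for (const [name, range] of Object.entries(deps)) {
    const meta = await getMeta(name);
    const current = semver.minVersion(range)!.version;
    // Stay within the current major: as noted above, there is
    // rarely time to absorb breaking changes.
    const candidates = Object.keys(meta.versions).filter(
      (v) => semver.major(v) === semver.major(current) && semver.gt(v, current)
    );
    for (const candidate of candidates) {
      const variant = { ...deps, [name]: candidate };
      const score = await scoreVariant(variant);
      if (score < best.score) best = { deps: variant, score };
    }
  }
  return best.deps;
}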
The result was a small CLI application, ostap, which recommends optimal versions to reduce the number of duplicate copies in the bundle.
It is run by simply pointing it at the package.json of your project:
ostap ./package.json
You can also use it to quickly view all future duplicates without rebuilding the project, having changed only the versions in package.json:
ostap ./package.json -s
As a result, in my project, the total weight of bundles decreased by 15%.
The repository has a quick start section.
If you use route splitting, it may seem that some bundles have grown in weight, but the distribution of components may simply have changed: instead of a copy of a dependency in each page's bundle, a single version went into the bundle shared by all pages. So you need to estimate the total weight of the bundles across all pages.
I hope the article was helpful and saves someone time. Thank you.
Once again, the link for convenience: the ostap repository on GitHub.
If you have interesting ideas, open an issue on GitHub and we will discuss them.
Source: https://habr.com/ru/post/445878/