
JavaScript nomenclature (in the context of Node.js and Web API)

I. Prehistory


I have used UltraEdit as my editor for many different tasks for years. One of the main reasons is that it works quickly with gigabyte-sized files without loading them into memory. It is also quite convenient for programming in JavaScript, with one major drawback: its auto-completion relies on a rather poor, hard-coded list of keywords and global variables that also lags behind the development of the language. At some point I wondered whether this list could be supplemented with a complete list of all the built-in properties and methods available in Node.js and the Web API (browser). Where could such a list be obtained? I came up with these options:


  1. A ready-made list that someone has compiled and keeps updated for general use, like the globals library, but more complete.


  2. Parsing documentation (ECMAScript specification, MDN and Node.js sites, etc.), manually or programmatically.


  3. Getting a list by metaprogramming.

The main answer to my questions was the suggestion to change the editor and stop suffering. But since there are few convenient editors for large files, my minimalism made it hard to use several editors for different needs, and programming is not my main occupation, I did not give up. In the end it became not only a practical task for me but also a matter of principle: how can such a simple need have no easy solution?


Since I found no ready-made lists, and parsing the documentation is a long and unreliable path, I decided to try the third option.


II. Code


Using several tips, I wrote a small script that displays the main part of the language nomenclature in different contexts in several ways.


Here's what I got.


The script can be run in Node.js or in the browser (via the console or pasting into the page). In the first case, the result will be output to files, in the second, to text fields added to the current document (you can open about:blank along with the console).


I will try to comment on the code.


1. First, we create the main container variables. In the first two we accumulate our nomenclature: nomenclatureTerms stores a simple list of all lexemes, and nomenclatureChains stores the same lexemes with their full chains, starting from the root objects. In globs we store the starting points from which the tree is unwound and built: the global (root) objects. To avoid infinite recursion caused by circular references, we add every processed object to processedObjects so it can be checked later.


2. In the second stage, we fill in globs.
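A minimal sketch of these containers might look as follows. The variable names come from the description above; the choice of Set and Map as container types is my assumption (only nomenclatureTerms is described as a Set later in the text).

```javascript
// Container names follow the article; the Set/Map types are an assumption.
const nomenclatureTerms = new Set();   // bare lexemes, e.g. "isArray"
const nomenclatureChains = new Set();  // full chains, e.g. 'Array["isArray"]'
const globs = new Map();               // root name -> root object
const processedObjects = new Set();    // objects already visited

// A Set compares objects by identity, so a circular reference
// leads back to an object that is already recorded.
const a = {};
a.self = a;
processedObjects.add(a);
console.log(processedObjects.has(a.self)); // true

// Repeated lexemes are stored only once.
nomenclatureTerms.add('length');
nomenclatureTerms.add('length');
console.log(nomenclatureTerms.size); // 1
```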


First, the script tries to determine in which context it is executed. If it is a browser, the window object is enough for us.


If it's Node.js, it's a bit more complicated. First, we add the two main global objects, as well as require, since otherwise we would never reach this function. Then we add the objects of all standard libraries: the main part comes from the undocumented list require('repl')._builtinLibs, suggested by one of the Node.js developers, followed by several modules missing from that list. Finally, since several module-level variables (__dirname and __filename) are not attached to any global object, we add them directly to our nomenclature containers.


3. The main work follows: with the help of the recursive function processKeys we traverse all global objects and all objects stored in their properties, down to the last possible depth. Then we output the results depending on the context and finish by printing the sizes of our lists to the console (the script runs for a considerable time, so this output can serve as a completion signal, although Chrome may need additional time to refresh the page even after it).
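The context detection and the filling of globs could be sketched roughly like this. Several things here are assumptions rather than the original script: the helper name collectGlobs is hypothetical, global and process are my guess at the "two main global objects", and the documented require('module').builtinModules list stands in for the undocumented require('repl')._builtinLibs.

```javascript
// Hypothetical helper: collect root objects for the current context.
function collectGlobs() {
  const globs = new Map();
  if (typeof window !== 'undefined') {
    globs.set('window', window); // browser: one root object is enough
    return globs;
  }
  // Node.js: assumed pair of main global objects, plus require itself.
  globs.set('global', global);
  globs.set('process', process);
  globs.set('require', require);
  // Documented list of built-in module names, used here instead of
  // the undocumented require('repl')._builtinLibs (an assumption).
  const { builtinModules } = require('module');
  for (const name of builtinModules) {
    // Skip internal modules, subpaths and prefix-only modules
    // to keep the sketch simple.
    if (name.startsWith('_') || name.includes('/') || name.startsWith('node:')) {
      continue;
    }
    try {
      globs.set(`require(${JSON.stringify(name)})`, require(name));
    } catch (err) {
      // A module may be unavailable in a particular build; ignore it.
    }
  }
  return globs;
}

const roots = collectGlobs();
console.log(roots.size, 'root objects collected');
```

Loading every built-in module may print deprecation or experimental warnings to the console; they do not interrupt the collection.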


4. The processKeys function is the main engine of the process.


First we check whether we are dealing with a root object. If so, we immediately enter its name in the nomenclature. If the object sits in a property of another object, this entry was already made at the previous recursion step, so we skip it.


Then we add the object to the list of processed objects so as not to fall into infinite recursion.


After that, we begin to iterate over all the properties of the object. To do this, we use the Reflect.ownKeys() method, since only it lists both the ordinary string keys of the object and its keys of type Symbol. We enter each property in nomenclatureTerms (the Set type automatically discards repetitions), then we build a chain from the name of the parent object and the current property and put it in nomenclatureChains; the same string becomes the name of the object for the next recursive call, so it keeps growing as we go deeper. I chose bracket notation for all cases to keep sorting uniform: using a dot for regular identifiers and brackets for complex strings breaks the order of the output list. JSON.stringify is used as a safeguard, to escape possible quotes inside property names. Keys of type Symbol are converted to strings before being entered into the lists. Unfortunately, this makes entries with such keys in their property chains unsuitable for direct evaluation, for example in the Node.js REPL or in browser consoles: you first have to turn such keys back into Symbols by removing the excess from the string representation.
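The difference between Object.keys() and Reflect.ownKeys(), and the way a chain grows, can be illustrated with a small sketch. The helper chainOf and the sample object are hypothetical, and the exact string form used for Symbol keys is my assumption.

```javascript
// Sample object with an ordinary key, a key that needs escaping,
// a non-enumerable key and a Symbol key.
const sample = { plain: 1, 'needs "quotes"': 2, [Symbol.iterator]: null };
Object.defineProperty(sample, 'hidden', { value: 3, enumerable: false });

console.log(Object.keys(sample).length); // 2: enumerable string keys only
const keys = Reflect.ownKeys(sample);
console.log(keys.length);                // 4: all string and Symbol keys

// Hypothetical helper: bracket notation for every key, with
// JSON.stringify escaping quotes; Symbols are converted to strings first.
function chainOf(parentName, key) {
  const text = typeof key === 'symbol' ? key.toString() : key;
  return `${parentName}[${JSON.stringify(text)}]`;
}

const chains = keys.map((key) => chainOf('sample', key));
console.log(chains.join('\n'));
// sample["plain"]
// sample["needs \"quotes\""]
// sample["hidden"]
// sample["Symbol(Symbol.iterator)"]
```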


In the next step, we check what is stored in the property: if it is an object that is not yet in the list of processed ones, we make a new recursive call. The object check is twofold, because instanceof Object returns false for Object.prototype and for objects created with Object.create(null).
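A possible form of such a double check is sketched below; the function name isTraversable is hypothetical, and the original script may combine the two conditions differently.

```javascript
// Hypothetical helper: true for anything that can carry own properties.
function isTraversable(value) {
  return value instanceof Object ||
    (typeof value === 'object' && value !== null);
}

// Cases where instanceof alone gives the wrong answer:
console.log(Object.prototype instanceof Object);    // false
console.log(Object.create(null) instanceof Object); // false

// The combined check handles them, and still rejects primitives:
console.log(isTraversable(Object.prototype));    // true
console.log(isTraversable(Object.create(null))); // true
console.log(isTraversable(() => {}));            // true (functions are objects)
console.log(isTraversable(null));                // false
console.log(isTraversable('text'));              // false
```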


Such an exhaustive pass through the properties often causes errors, so we have to add a handler to keep the process from being interrupted (the error messages are kept for curiosity's sake). In addition, regardless of our wishes, the console will show several warnings about attempts to read properties that have been marked as deprecated.
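Putting the steps of this section together, the traversal could be sketched as below. This is a simplified reconstruction under my own assumptions, not the original script: the short container names, the isRoot parameter and the errors array are hypothetical, and it runs on a small sample object instead of the real global objects.

```javascript
const terms = new Set();
const chains = new Set();
const processed = new Set();
const errors = [];

function isTraversable(value) {
  return value instanceof Object ||
    (typeof value === 'object' && value !== null);
}

function processKeys(obj, name, isRoot) {
  if (isRoot) {            // child names were recorded by the caller
    terms.add(name);
    chains.add(name);
  }
  processed.add(obj);      // guard against circular references
  for (const key of Reflect.ownKeys(obj)) {
    const text = typeof key === 'symbol' ? key.toString() : key;
    const chain = `${name}[${JSON.stringify(text)}]`;
    terms.add(text);
    chains.add(chain);
    try {
      const value = obj[key];          // a getter may throw here
      if (isTraversable(value) && !processed.has(value)) {
        processKeys(value, chain, false);
      }
    } catch (err) {
      errors.push(`${chain}: ${err.message}`); // record and keep going
    }
  }
}

// Sample root with a circular reference and a throwing getter.
const root = {
  child: { leaf: 1 },
  get broken() { throw new Error('no access'); },
};
root.child.parent = root;

processKeys(root, 'root', true);
console.log([...chains].join('\n'));
console.log(errors); // [ 'root["broken"]: no access' ]
```

The key is recorded before the value is read, so a property whose getter throws still appears in both lists.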


5. The output function is responsible for outputting the results depending on the execution context. First, it builds a list sorted in a more familiar dictionary order (although the caseFirst parameter does not work in Firefox). Then it checks the execution context: in the browser, the lists are shown in two text fields embedded in the current page (the name of the file under which the list can be saved from the editor is added at the top of the list); in Node.js, two files are created in the current directory.
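The sorting step might be sketched with Intl.Collator as follows. The text only mentions the caseFirst parameter, so the locale 'en', the value 'upper' and the helper name sortedList are my assumptions.

```javascript
// Dictionary-like ordering; caseFirst asks for uppercase variants
// to precede lowercase ones when the words are otherwise equal.
const collator = new Intl.Collator('en', { caseFirst: 'upper' });

// Hypothetical helper: turn a Set of lexemes into a sorted array.
function sortedList(set) {
  return [...set].sort(collator.compare);
}

const list = sortedList(new Set(['map', 'Map', 'apply', 'Array']));
console.log(list.join('\n'));
// "apply" and "Array" come before both spellings of "map";
// where caseFirst is honored, "Map" precedes "map".
```

A plain Array.prototype.sort() without a comparator orders strings by code units, which puts all capitalized names before all lowercase ones; the collator avoids that.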


It should be noted that the names of our script's own functions end up in the browser list, and various environment variables end up in the Node.js list; the lists also include various undocumented properties for internal use, array indices, and so on. On the other hand, many string elements of the nomenclature (for example, event names or standard string parameters of functions) are missing from our lists.


III. Results


After running the script on the latest beta version of Node.js and on the nightly builds of two browsers, I received the following lists (data updated as of 10/18/2016):


Node.js 7.0.0-test201610107f7d1d385d
Terms: 1,822
Chains: 7,394


Google Chrome Canary 56.0.2891.0
Terms: 3,352
Chains: 15,091


Mozilla Firefox Nightly 52.0a1 (2016-10-17)
Terms: 5,082
Chains: 16,125


The results of the script may have other uses as well. One is comparing the nomenclature of different browsers or of different versions of the same browser (during testing I noticed that nightly builds from neighboring days may produce results that differ in dozens of positions: something is introduced, something passes into history). If you automate the process, you can, for example, build a history of the Node.js API over many versions. You can also collect a variety of language statistics: the nesting depth of properties, the length of identifiers, the principles of their formation, and so on.


Surely the code can be optimized for speed, ease of use, completeness of results or their readability. I may also have made some silly mistakes out of ignorance of the subtleties of the language or of its usage contexts. I would be grateful for corrections and additions. Thank you for your attention.
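As an illustration of the comparison idea, two saved lists could be diffed with a few lines of code. The function name diffLists and the sample entries are invented for the example and are not taken from the actual results.

```javascript
// Hypothetical helper: report entries that appeared or disappeared
// between two nomenclature lists.
function diffLists(oldList, newList) {
  const oldSet = new Set(oldList);
  const newSet = new Set(newList);
  return {
    added: newList.filter((item) => !oldSet.has(item)),
    removed: oldList.filter((item) => !newSet.has(item)),
  };
}

// Invented sample data standing in for two builds of the same runtime.
const buildA = ['window["alert"]', 'window["oldFeature"]'];
const buildB = ['window["alert"]', 'window["newFeature"]'];

console.log(diffLists(buildA, buildB));
// { added: [ 'window["newFeature"]' ], removed: [ 'window["oldFeature"]' ] }
```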


PS A good example: http://electron.atom.io/blog/2016/09/27/api-docs-json-schema



Source: https://habr.com/ru/post/310662/

