Writing a custom AST transformer in TypeScript

The TestMace team is back. This time we are publishing a translation of an article about transforming TypeScript code with the compiler's own tools. Enjoy!

Introduction

This is my first post, and in it I want to show how I solved one problem using the TypeScript compiler API. To find this solution I spent a long time digging through blogs and digesting StackOverflow answers, so to save you from the same fate I will share everything I learned about this powerful but poorly documented set of tools.

Key Concepts

Basics of the TypeScript compiler API (parser terminology, the transformation API, the layered architecture), abstract syntax trees (AST), the Visitor design pattern, and code generation.

Small recommendation

If this is the first time you have heard of an AST, I strongly recommend reading this article by @Vaidehi Joshi. Her entire basecs series is great, and you will definitely like it.

Task Description

At Avero we use GraphQL and wanted to add type safety to our resolvers. At some point I came across graphqlgen, and with it I managed to solve many problems around the concept of models in GraphQL. I will not go into that here; I plan to write a separate article about it. In short, models describe the return values of resolvers, and in graphqlgen these models are mapped to interfaces through a configuration of sorts (a YAML or TypeScript file with type declarations).

Under the hood we run gRPC microservices, and GraphQL mostly serves as a facade. We already publish TypeScript interfaces that match our proto contracts, and I wanted to use these types as models, but I ran into problems with how type exports are supported and with the way our interfaces are declared (piles of namespaces and a large number of references).

Following good open-source practice, my first move was to build on what already existed in the graphqlgen repository and make a meaningful contribution that way. To implement introspection, graphqlgen uses @babel/parser to read a file and collect information about interface names and declarations (the interface fields).

Whenever I need to do something with an AST, I open astexplorer.net first and only then start working. This tool lets you inspect the AST produced by various parsers, including @babel/parser and the TypeScript compiler's parser. With astexplorer.net you can visualize the data structures you will be working with and get familiar with each parser's AST node types.

Take a look at an example source file and the AST that babel-parser produces from it:

example.ts

import { protos } from 'my_company_protos'

export type User = protos.user.User;

ast.json

 { "type": "Program", "start": 0, "end": 80, "loc": { "start": { "line": 1, "column": 0 }, "end": { "line": 3, "column": 36 } }, "comments": [], "range": [ 0, 80 ], "sourceType": "module", "body": [ { "type": "ImportDeclaration", "start": 0, "end": 42, "loc": { "start": { "line": 1, "column": 0 }, "end": { "line": 1, "column": 42 } }, "specifiers": [ { "type": "ImportSpecifier", "start": 9, "end": 15, "loc": { "start": { "line": 1, "column": 9 }, "end": { "line": 1, "column": 15 } }, "imported": { "type": "Identifier", "start": 9, "end": 15, "loc": { "start": { "line": 1, "column": 9 }, "end": { "line": 1, "column": 15 }, "identifierName": "protos" }, "name": "protos", "range": [ 9, 15 ], "_babelType": "Identifier" }, "importKind": null, "local": { "type": "Identifier", "start": 9, "end": 15, "loc": { "start": { "line": 1, "column": 9 }, "end": { "line": 1, "column": 15 }, "identifierName": "protos" }, "name": "protos", "range": [ 9, 15 ], "_babelType": "Identifier" }, "range": [ 9, 15 ], "_babelType": "ImportSpecifier" } ], "importKind": "value", "source": { "type": "Literal", "start": 23, "end": 42, "loc": { "start": { "line": 1, "column": 23 }, "end": { "line": 1, "column": 42 } }, "extra": { "rawValue": "my_company_protos", "raw": "'my_company_protos'" }, "value": "my_company_protos", "range": [ 23, 42 ], "_babelType": "StringLiteral", "raw": "'my_company_protos'" }, "range": [ 0, 42 ], "_babelType": "ImportDeclaration" }, { "type": "ExportNamedDeclaration", "start": 44, "end": 80, "loc": { "start": { "line": 3, "column": 0 }, "end": { "line": 3, "column": 36 } }, "specifiers": [], "source": null, "exportKind": "type", "declaration": { "type": "TypeAlias", "start": 51, "end": 80, "loc": { "start": { "line": 3, "column": 7 }, "end": { "line": 3, "column": 36 } }, "id": { "type": "Identifier", "start": 56, "end": 60, "loc": { "start": { "line": 3, "column": 12 }, "end": { "line": 3, "column": 16 }, "identifierName": "User" }, "name": "User", "range": [ 56, 60 ], 
"_babelType": "Identifier" }, "typeParameters": null, "right": { "type": "GenericTypeAnnotation", "start": 63, "end": 79, "loc": { "start": { "line": 3, "column": 19 }, "end": { "line": 3, "column": 35 } }, "typeParameters": null, "id": { "type": "QualifiedTypeIdentifier", "start": 63, "end": 79, "loc": { "start": { "line": 3, "column": 19 }, "end": { "line": 3, "column": 35 } }, "qualification": { "type": "QualifiedTypeIdentifier", "start": 63, "end": 74, "loc": { "start": { "line": 3, "column": 19 }, "end": { "line": 3, "column": 30 } }, "qualification": { "type": "Identifier", "start": 63, "end": 69, "loc": { "start": { "line": 3, "column": 19 }, "end": { "line": 3, "column": 25 }, "identifierName": "protos" }, "name": "protos", "range": [ 63, 69 ], "_babelType": "Identifier" }, "range": [ 63, 74 ], "_babelType": "QualifiedTypeIdentifier" }, "range": [ 63, 79 ], "_babelType": "QualifiedTypeIdentifier" }, "range": [ 63, 79 ], "_babelType": "GenericTypeAnnotation" }, "range": [ 51, 80 ], "_babelType": "TypeAlias" }, "range": [ 44, 80 ], "_babelType": "ExportNamedDeclaration" } ] }

The root of the tree (a node of type Program) contains two statements in its body: an ImportDeclaration and an ExportNamedDeclaration.

In the ImportDeclaration we are particularly interested in two properties, source and specifiers, which carry information about the source text. In our case, for example, the value of source is my_company_protos. From this value alone it is impossible to tell whether it is a relative path to a file or a reference to an external module. That is something the parser-based approach would have to work out.
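To make the relative-versus-external question concrete, here is a minimal sketch of the kind of check a parser-based solution would need. The helper name isRelativeSpecifier is hypothetical (it is not part of Babel or TypeScript), and it only applies the usual convention that a relative specifier starts with "./", "../" or "/". It classifies the string and does not resolve anything on disk.

```typescript
// Hypothetical helper (not a Babel or TypeScript API): decides whether a module
// specifier looks like a path rather than a package name.
// Assumption: "relative" means the specifier starts with "/", "./" or "../".
function isRelativeSpecifier(specifier: string): boolean {
  return /^\.{0,2}\//.test(specifier);
}

console.log(isRelativeSpecifier("my_company_protos"));   // false
console.log(isRelativeSpecifier("./my_company_protos")); // true
console.log(isRelativeSpecifier("../shared/protos"));    // true
```

Even with such a check, the file or package behind the specifier would still have to be located and parsed, which is where the real cost of this approach lies.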

Similarly, the ExportNamedDeclaration carries information about the source text. Namespaces complicate this structure by adding arbitrary nesting, so more and more QualifiedTypeIdentifier nodes appear. This is another problem we would have to solve with the parser-based approach.
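The nesting can be pictured with a small sketch that walks a chain of QualifiedTypeIdentifier nodes and rebuilds the dotted name. The node shapes below are simplified stand-ins of my own, modelled on the Babel output. In particular, the id field holding the right-hand name is an assumption based on Babel's Flow node definitions and is not visible in the listing above. The function name toDottedName is hypothetical.

```typescript
// Simplified node shapes for illustration only; real Babel nodes carry
// many more fields (start, end, loc, range, ...).
interface Identifier {
  type: "Identifier";
  name: string;
}

interface QualifiedTypeIdentifier {
  type: "QualifiedTypeIdentifier";
  qualification: Identifier | QualifiedTypeIdentifier; // everything left of the last dot
  id: Identifier;                                      // the name right of the last dot (assumed field)
}

type TypeName = Identifier | QualifiedTypeIdentifier;

// Recursively unwinds the nesting: each namespace level adds one more wrapper node.
function toDottedName(node: TypeName): string {
  if (node.type === "Identifier") {
    return node.name;
  }
  return `${toDottedName(node.qualification)}.${node.id.name}`;
}

const ident = (name: string): Identifier => ({ type: "Identifier", name });

const userRef: TypeName = {
  type: "QualifiedTypeIdentifier",
  qualification: {
    type: "QualifiedTypeIdentifier",
    qualification: ident("protos"),
    id: ident("user"),
  },
  id: ident("User"),
};

console.log(toDottedName(userRef)); // protos.user.User
```

Recovering the name is the easy part. Knowing what protos.user.User actually contains still requires following the import, which the parser alone cannot do.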

And I have not even reached resolving the types that come from imports! The parser and its AST only give a limited amount of information about the source text, so to add that information to the final tree we would have to parse every imported file. And each of those files can have imports of its own!

It looks like solving the problem with the parser alone would take far too much code... Let's take a step back and think again.

We do not care about the imports, and we do not care about the file structure either. We want to be able to resolve all the properties of the type protos.user.User and inline them instead of keeping references to imports. So where can we get the type information needed to build the new file?

Typechecker

Since we found out that the solution with the parser is not suitable for obtaining information about the types of imported interfaces, let's look at the TypeScript compilation process and try to find another way out.

This is what immediately comes to mind:

The TypeChecker is the core of the TypeScript type system, and it can be created from a Program instance. It is responsible for figuring out the relationships between symbols from different files, assigning types to symbols, and performing semantic checks (for example, reporting errors).
The first thing the TypeChecker does is consolidate all the symbols from the different source files into a single view and build a single symbol table by "merging" any common symbols (for example, namespaces that span several files).
After initializing its original state, the TypeChecker is ready to answer any question about the program. Such questions might be:
What is the symbol for this node?
What is the type of this symbol?
What symbols are visible in this part of the AST?
What signatures are available for a function declaration?
What errors should be reported for this file?

The TypeChecker is exactly what we need! With access to the symbol table and its API, we can answer the first two questions: what is the symbol for this node, and what is the type of this symbol? Because it merges all common symbols, the TypeChecker can even deal with the pile-up of namespaces mentioned earlier!

So how do you get to this API?

Here is one of the examples I managed to find online. It shows that the TypeChecker can be obtained through a method on the Program instance. The checker has two interesting methods, checker.getSymbolAtLocation and checker.getTypeOfSymbolAtLocation, which look very much like what we are after.

Let's start working on the code.

models.ts

import { protos } from './my_company_protos'

export type User = protos.user.User;

my_company_protos.ts

export namespace protos {
  export namespace user {
    export interface User {
      username: string;
      info: protos.Info.User;
    }
  }
  export namespace Info {
    export interface User {
      name: protos.Info.Name;
    }
    export interface Name {
      firstName: string;
      lastName: string;
    }
  }
}

ts-alias.ts

import ts from "typescript";

// hardcode our input file
const filePath = "./src/models.ts";

// create a program instance, which is a collection of source files
// in this case we only have one source file
const program = ts.createProgram([filePath], {});

// pull off the typechecker instance from our program
const checker = program.getTypeChecker();

// get our models.ts source file AST
const source = program.getSourceFile(filePath);

// create TS printer instance which gives us utilities to pretty print our final AST
const printer = ts.createPrinter();

// helper to give us Node string type given kind
const syntaxToKind = (kind: ts.Node["kind"]) => {
  return ts.SyntaxKind[kind];
};

// visit each node in the root AST and log its kind
ts.forEachChild(source, node => {
  console.log(syntaxToKind(node.kind));
});

$ ts-node ./src/ts-alias.ts
# prints:
ImportDeclaration
TypeAliasDeclaration
EndOfFileToken

We are only interested in the declaration of the type alias, so we will rewrite the code a bit:

kind-printer.ts

ts.forEachChild(source, node => {
  if (ts.isTypeAliasDeclaration(node)) {
    // node.kind is a numeric enum value, so map it back to its name before logging
    console.log(syntaxToKind(node.kind));
  }
});
// prints TypeAliasDeclaration

TypeScript provides a type guard for every kind of node, which lets you narrow a node down to its exact type.
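A short sketch of what that narrowing buys us. It only uses the parser (ts.createSourceFile), so no Program or TypeChecker is involved. The file name and the aliases array are illustrative choices of mine. The namespace import is used so that the snippet does not depend on the esModuleInterop setting.

```typescript
import * as ts from "typescript";

const text =
  "import { protos } from './my_company_protos';\n" +
  "export type User = protos.user.User;\n";

// Parse only; the last argument sets parent pointers on the nodes.
const sourceFile = ts.createSourceFile("models.ts", text, ts.ScriptTarget.Latest, true);

const aliases: string[] = [];
ts.forEachChild(sourceFile, node => {
  // After the first guard, node is a ts.TypeAliasDeclaration, so .name and .type exist.
  // After the second, node.type is a ts.TypeReferenceNode, so .typeName exists.
  if (ts.isTypeAliasDeclaration(node) && ts.isTypeReferenceNode(node.type)) {
    aliases.push(`${node.name.text} -> ${node.type.typeName.getText(sourceFile)}`);
  }
});

console.log(aliases); // [ 'User -> protos.user.User' ]
```

Without the guards, node is just a ts.Node and the compiler would reject the access to name and type.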

Now, back to the two questions posed earlier: what is the symbol for this node, and what is the type of this symbol?

By querying the TypeChecker's symbol table, we can get the names of the interface members that the type alias declaration brings in, as the listing below shows. We are still at the very beginning, but this is a good starting point for introspection.

checker-example.ts

ts.forEachChild(source, node => {
  if (ts.isTypeAliasDeclaration(node)) {
    const symbol = checker.getSymbolAtLocation(node.name);
    const type = checker.getDeclaredTypeOfSymbol(symbol);
    const properties = checker.getPropertiesOfType(type);
    properties.forEach(declaration => {
      console.log(declaration.name);
      // prints username, info
    });
  }
});
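The other method mentioned earlier, checker.getTypeOfSymbolAtLocation, can take this one step further and tell us the type of each property. The sketch below is self-contained under a few assumptions of mine: the namespace and the alias live in a single temporary file written to the OS temp directory, the names are shortened, and the described array exists only for the demonstration. It assumes the typescript package and Node typings are installed.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";
import * as ts from "typescript";

// Assumption: a throwaway file stands in for models.ts and my_company_protos.ts.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "checker-demo-"));
const file = path.join(dir, "models.ts");
fs.writeFileSync(
  file,
  [
    "namespace protos {",
    "  export interface Name { firstName: string; lastName: string; }",
    "  export interface User { username: string; name: Name; }",
    "}",
    "export type User = protos.User;",
  ].join("\n")
);

const program = ts.createProgram([file], {});
const checker = program.getTypeChecker();
const source = program.getSourceFile(file);

const described: string[] = [];
if (source) {
  ts.forEachChild(source, node => {
    if (!ts.isTypeAliasDeclaration(node)) return;
    const symbol = checker.getSymbolAtLocation(node.name);
    if (!symbol) return; // the checker may not find a symbol for every node
    const type = checker.getDeclaredTypeOfSymbol(symbol);
    for (const property of checker.getPropertiesOfType(type)) {
      // "What is the type of this symbol?", asked for each property
      const propertyType = checker.getTypeOfSymbolAtLocation(property, node);
      described.push(`${property.name}: ${checker.typeToString(propertyType)}`);
    }
  });
}

fs.rmSync(dir, { recursive: true, force: true });
console.log(described); // one "name: type" entry per property of User
```

The first entry is the primitive username property. The second refers to another interface, which is exactly the kind of reference the transformer later has to expand.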

Now let's think about code generation.

Transformation API

As stated earlier, our goal is to parse and introspect a TypeScript source file and generate a new file from it. AST-to-AST transformation is such a common task that the TypeScript team even provides an API for writing custom transformers!

Before moving on to the main task, let's try writing a simple transformer. Special thanks to James Garbutt for the original template.

Our transformer will turn numeric literals into string literals.

number-transformer.ts

import ts from "typescript";

const source = `
  const two = 2;
  const four = 4;
`;

function numberTransformer<T extends ts.Node>(): ts.TransformerFactory<T> {
  return context => {
    const visit: ts.Visitor = node => {
      if (ts.isNumericLiteral(node)) {
        // Written against TypeScript 3.x; since 4.0 this helper lives at
        // ts.factory.createStringLiteral
        return ts.createStringLiteral(node.text);
      }
      return ts.visitEachChild(node, child => visit(child), context);
    };

    return node => ts.visitNode(node, visit);
  };
}

let result = ts.transpileModule(source, {
  compilerOptions: { module: ts.ModuleKind.CommonJS },
  transformers: { before: [numberTransformer()] }
});

console.log(result.outputText);

/*
  var two = "2";
  var four = "4";
*/

The most important parts here are the Visitor and VisitResult types:

type Visitor = (node: Node) => VisitResult<Node>;
type VisitResult<T extends Node> = T | T[] | undefined;

The main task when writing a transformer is writing the Visitor. Logically, it has to walk every AST node recursively and return a VisitResult (one, several or zero AST nodes). You can write the visitor so that only selected nodes are changed and everything else passes through untouched.
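The three possible shapes of a VisitResult can be seen in one small sketch. It is not part of the original solution: the input string and the name demoTransformer are made up for illustration, and it uses the ts.factory API available since TypeScript 4.0 rather than the older ts.create* helpers. It needs only the parser, ts.transform and a printer.

```typescript
import * as ts from "typescript";

const input = "import { x } from './x';\nconst a = 1, b = 2;\n";
const sourceFile = ts.createSourceFile("input.ts", input, ts.ScriptTarget.Latest, true);

const demoTransformer: ts.TransformerFactory<ts.SourceFile> = context => {
  const visit: ts.Visitor = node => {
    // Zero nodes: returning undefined removes the node from its parent.
    if (ts.isImportDeclaration(node)) {
      return undefined;
    }
    // Several nodes: one statement with two declarations becomes two statements.
    if (ts.isVariableStatement(node) && node.declarationList.declarations.length > 1) {
      const { declarations, flags } = node.declarationList;
      return declarations.map(declaration =>
        ts.factory.createVariableStatement(
          undefined, // no modifiers in this example
          ts.factory.createVariableDeclarationList([declaration], flags)
        )
      );
    }
    // One node: keep the node itself, but still walk its children.
    return ts.visitEachChild(node, visit, context);
  };

  return file => ts.visitEachChild(file, visit, context);
};

const result = ts.transform(sourceFile, [demoTransformer]);
const output = ts.createPrinter().printFile(result.transformed[0]);
result.dispose();

console.log(output); // the import is gone and each declaration has its own statement
```

Returning an array only makes sense where the parent holds a list of nodes, such as the statements of a file or block.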

input-output.ts

// input
export namespace protos { // ModuleDeclaration
  export namespace user { // ModuleDeclaration
    // Module Block
    export interface User { // InterfaceDeclaration
      username: string; // username: string is PropertySignature
      info: protos.Info.User; // TypeReference
    }
  }
  export namespace Info {
    export interface User {
      name: protos.Info.Name; // TypeReference
    }
    export interface Name {
      firstName: string;
      lastName: string;
    }
  }
}

// this line is a TypeAliasDeclaration
export type User = protos.user.User; // protos.user.User is a TypeReference

// output
export interface User {
  username: string;
  info: { // info: { .. } is a TypeLiteral
    name: { // name: { .. } is a TypeLiteral
      firstName: string;
      lastName: string;
    }
  }
}

The comments above show which node kinds we will be working with.

The Visitor must perform two basic actions:

Replace TypeAliasDeclarations with InterfaceDeclarations
Convert TypeReferences to TypeLiterals
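The second action is the recursive one, so it may help to see it on a toy model first. The sketch below deliberately does not use the compiler API: TypeNode, Declarations and inline are hypothetical names for a minimal type model, with a lookup table standing in for the TypeChecker. It assumes the declarations contain no cycles; a self-referencing type would recurse forever.

```typescript
// Toy type model, not the compiler API: just enough to show the inlining step.
type TypeNode =
  | { kind: "primitive"; name: string }
  | { kind: "reference"; name: string }
  | { kind: "literal"; members: Record<string, TypeNode> };

// Stand-in for the TypeChecker: maps a fully qualified name to its declaration.
type Declarations = Record<string, TypeNode>;

// Replaces every reference with the literal it points to, recursively.
function inline(node: TypeNode, declarations: Declarations): TypeNode {
  switch (node.kind) {
    case "primitive":
      return node;
    case "reference": {
      const target = declarations[node.name];
      if (!target) throw new Error(`Unknown type: ${node.name}`);
      return inline(target, declarations);
    }
    case "literal": {
      const members: Record<string, TypeNode> = {};
      for (const key of Object.keys(node.members)) {
        members[key] = inline(node.members[key], declarations);
      }
      return { kind: "literal", members };
    }
  }
}

const str: TypeNode = { kind: "primitive", name: "string" };
const declarations: Declarations = {
  "protos.Info.Name": { kind: "literal", members: { firstName: str, lastName: str } },
  "protos.Info.User": {
    kind: "literal",
    members: { name: { kind: "reference", name: "protos.Info.Name" } },
  },
  "protos.user.User": {
    kind: "literal",
    members: { username: str, info: { kind: "reference", name: "protos.Info.User" } },
  },
};

const user = inline({ kind: "reference", name: "protos.user.User" }, declarations);
console.log(JSON.stringify(user, null, 2)); // a tree of literals and primitives only
```

In the real transformer, the lookup table is replaced by questions to the TypeChecker, and the nodes are TypeScript AST nodes rather than plain objects.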

Solution

This is what the Visitor code looks like:

aliasTransformer.ts

// Written against TypeScript 3.x; since 4.0 the ts.create* helpers live under ts.factory
import path from 'path';
import ts from 'typescript';
import _ from 'lodash';
import fs from 'fs';

const filePath = path.resolve(_.first(process.argv.slice(2)));

const program = ts.createProgram([filePath], {});
const checker = program.getTypeChecker();
const source = program.getSourceFile(filePath);
const printer = ts.createPrinter();

const typeAliasToInterfaceTransformer: ts.TransformerFactory<ts.SourceFile> = context => {
  const visit: ts.Visitor = node => {
    node = ts.visitEachChild(node, visit, context);

    /*
      Convert type references to type literals
        interface IUser {
          username: string
        }
        type User = IUser <--- IUser is a type reference
        interface Context {
          user: User <--- User is a type reference
        }
      In both cases we want to convert the type reference to its primitive literals. We want:
        interface IUser {
          username: string
        }
        type User = {
          username: string
        }
        interface Context {
          user: {
            username: string
          }
        }
    */
    if (ts.isTypeReferenceNode(node)) {
      const symbol = checker.getSymbolAtLocation(node.typeName);
      const type = checker.getDeclaredTypeOfSymbol(symbol);
      const declarations = _.flatMap(checker.getPropertiesOfType(type), property => {
        /*
          Type references declarations may themselves have type references, so we need
          to resolve those literals as well
        */
        return _.map(property.declarations, visit);
      });
      return ts.createTypeLiteralNode(declarations.filter(ts.isTypeElement));
    }

    /*
      Convert type alias to interface declaration
        interface IUser {
          username: string
        }
        type User = IUser

      We want to remove all type aliases
        interface IUser {
          username: string
        }
        interface User {
          username: string <-- Also need to resolve IUser
        }
    */
    if (ts.isTypeAliasDeclaration(node)) {
      const symbol = checker.getSymbolAtLocation(node.name);
      const type = checker.getDeclaredTypeOfSymbol(symbol);
      const declarations = _.flatMap(checker.getPropertiesOfType(type), property => {
        // Resolve type alias to its literals
        return _.map(property.declarations, visit);
      });

      // Create interface with fully resolved types
      return ts.createInterfaceDeclaration(
        [],
        [ts.createToken(ts.SyntaxKind.ExportKeyword)],
        node.name.getText(),
        [],
        [],
        declarations.filter(ts.isTypeElement)
      );
    }

    // Remove all import declarations
    if (ts.isImportDeclaration(node)) {
      return undefined;
    }

    return node;
  };

  return node => ts.visitNode(node, visit);
};

// Run source file through our transformer
const result = ts.transform(source, [typeAliasToInterfaceTransformer]);

// Create our output folder
const outputDir = path.resolve(__dirname, '../generated');
if (!fs.existsSync(outputDir)) {
  fs.mkdirSync(outputDir);
}

// Write pretty printed transformed typescript to output directory
fs.writeFileSync(
  path.resolve(__dirname, '../generated/models.ts'),
  printer.printFile(_.first(result.transformed))
);

I like the way my solution turned out. It embodies the power of good abstractions, an intelligent compiler, useful development tools (VSCode autocompletion, AST explorer and so on) and bits of experience from other skilled developers. The full source code, with updates, can be found here. I am not sure how useful it will be for more general cases that differ from my particular one. I simply wanted to show what the TypeScript compiler toolkit can do and to write down my thoughts on solving a non-standard problem that had been bothering me for a long time.

I hope my example helps someone make their life easier. If ASTs, compilers and transformations are still not entirely clear to you, follow the links to the third-party resources and templates I have provided; they should help. I had to spend a lot of time studying this material before I finally found a solution. My first attempts in private GitHub repositories, with their 45 // @ts-ignore comments, make me blush with shame.

Resources that helped me:

Microsoft / TypeScript

Creating a TypeScript Transformer

TypeScript compiler APIs revisited

AST explorer

Our team builds TestMace, a powerful IDE for working with APIs. Create scripts, test endpoints, and make full use of advanced autocompletion and syntax highlighting. Write to us! You can find us on Telegram, Slack, Facebook and Vk.

Source: https://habr.com/ru/post/457770/
