Hashing strings at compile time with an annotation

Recently I started developing an Android application and faced the task of protecting it from reverse engineering. A quick search on Google suggested that ProGuard, which ships with Android Studio, would do the job. The result suited me fine except for one small detail: strings.
The application exchanges information with its service using Intents, the key part of which is the action string. When interacting with the system or with other applications the string must follow a certain format, but for exchanges within the application it only has to be unique. For convenience, it is recommended to build this string from the package name and the action name. For example:

    public final class HandlerConst {
        public static final String ACTION_LOGIN = "com.example.app.ACTION_LOGIN";
    }

This is convenient for debugging, but it seriously weakens the obfuscation. In the release build I would like to see, for example, the MD5 hash of this string instead:

    public final class HandlerConst {
        public static final String ACTION_LOGIN = "7f315954193d1fd99b017081ef8acdc3";
    }
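For reference, such a hash is easy to compute with the standard java.security.MessageDigest class. The sketch below is only an illustration: the ActionHash class and its hexDigest method are made-up names that do not come from the article's project, and the output is lowercase hex.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the article's code.
final class ActionHash {
    /** Returns the lowercase hex digest of text for the given algorithm, e.g. "MD5". */
    static String hexDigest(String algorithm, String text) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so negative bytes are printed as two hex digits
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported digest: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hexDigest("MD5", "com.example.app.ACTION_LOGIN"));
    }
}
```

An MD5 digest is 16 bytes long, so the resulting string always has 32 hex characters.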

Below I describe how to achieve this with a homemade reinvention of the wheel.

A short digression

I was very surprised to learn that ProGuard does not touch strings. According to the documentation on the official website, only the advanced paid version can work with strings. Even then, it encrypts strings so that it can decrypt them back to the original at run time. I could not find a way to replace a string with its MD5 value.
My search for a solution led me to an article demonstrating the wonders of optimizing C++ compilers: Computing CRC32 of strings at compile time. In Java, however, the same trick did not take off. ProGuard collapsed the methods quite aggressively, but stumbled on getting a byte array from a string.
After that I decided not to waste effort on automation and simply solved the problem by hand:

    public final class HandlerConst {
        public static final String ACTION_LOGIN;

        static {
            if (BuildConfig.DEBUG)
                ACTION_LOGIN = "com.example.app.ACTION_LOGIN";
            else
                ACTION_LOGIN = "7f315954193d1fd99b017081ef8acdc3";
        }
    }

But when I came across the Habr article "Custom Annotation Preprocessor: creating an Android application and configuring it in IntelliJ IDEA", I realized that this was the solution to my problem.

Annotation implementation

As is tradition, studying annotations began with a lack of the necessary information in Russian. Most articles cover runtime annotations. However, a suitable article did turn up on Habr: "Measuring the execution time of a method with an annotation".
To create a compile-time annotation we need to:

Describe the annotation;
Implement a subclass of AbstractProcessor that will process our annotation;
Tell the compiler where to find our processor.

An annotation description might look like this:

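    package com.example.annotation;

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    @Target({ElementType.FIELD})
    @Retention(RetentionPolicy.SOURCE)
    public @interface Hashed {
        String method() default "MD5";
    }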

Target defines the elements the annotation can be applied to. In this case, the annotation can be applied to field declarations in a class. Unfortunately, that means fields of any type, but more on that later.
Retention is the annotation's lifetime. We indicate that it exists only in the source code.
In the annotation itself we declare an element that defines the hashing method. The default is MD5.
This is enough to use the annotation in code, but it is pointless until we write an annotation processor.
An annotation processor extends javax.annotation.processing.AbstractProcessor. A minimal processor class looks like this:

    package com.example.annotation;

    @SupportedAnnotationTypes(value = {"com.example.annotation.Hashed"})
    @SupportedSourceVersion(SourceVersion.RELEASE_7)
    public class HashedAnnotationProcessor extends AbstractProcessor {
        @Override
        public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
            return false;
        }
    }

SupportedAnnotationTypes lists the names of the annotation classes that our processor will handle.
SupportedSourceVersion is the latest source version the processor supports. The idea is that the processor should not break on language constructs that appeared in newer versions of the language.
Instead of these annotations, you can override the getSupportedAnnotationTypes and getSupportedSourceVersion methods.
The process method receives the set of supported annotations that have not been processed yet and an object for interacting with the compiler. If the method returns false, the compiler passes the annotation on to the next processor that supports this annotation type. If it returns true, the annotation is considered processed and goes no further. Keep this in mind so that you do not accidentally swallow someone else's annotations.
If source files were changed or added while any processor was running, the compiler starts another pass.
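As a minimal sketch of the method-based alternative mentioned above, the processor below declares the same information by overriding the two methods instead of using annotations. The class name MethodConfiguredProcessor is hypothetical, and returning SourceVersion.latestSupported() is a choice made for this example rather than something the article prescribes.

```java
import java.util.Collections;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;

// Hypothetical processor that is configured through methods, not annotations.
class MethodConfiguredProcessor extends AbstractProcessor {
    @Override
    public Set<String> getSupportedAnnotationTypes() {
        // Same value that @SupportedAnnotationTypes would carry
        return Collections.singleton("com.example.annotation.Hashed");
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        // Assumption: accept whatever the running compiler supports
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        // Do nothing and leave the annotation unclaimed
        return false;
    }
}
```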

RoundEnvironment is not enough to change the source code, so we override the init method and obtain a JavacProcessingEnvironment in it. This class gives access to the source code, to the machinery for reporting warnings and compilation errors, and more. There we also obtain a TreeMaker, a helper for modifying the source tree.

    private JavacProcessingEnvironment javacProcessingEnv;
    private TreeMaker maker;

    @Override
    public void init(ProcessingEnvironment procEnv) {
        super.init(procEnv);
        this.javacProcessingEnv = (JavacProcessingEnvironment) procEnv;
        this.maker = TreeMaker.instance(javacProcessingEnv.getContext());
    }

Now all that remains is to walk through our annotated fields and replace the values of the string constants. The code is abridged; a link to GitHub is at the end of the article.

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        if (annotations == null || annotations.isEmpty()) {
            return false;
        }
        for (TypeElement annotation : annotations) {
            // Select all elements marked with this annotation
            final Set<? extends Element> fields = roundEnv.getElementsAnnotatedWith(annotation);
            JavacElements utils = javacProcessingEnv.getElementUtils();
            for (final Element field : fields) {
                // Get the annotation instance so that we can read its parameters
                Hashed hashed = field.getAnnotation(Hashed.class);
                // Convert the element into a source tree node
                JCTree blockNode = utils.getTree(field);
                if (blockNode instanceof JCTree.JCVariableDecl) {
                    // It is a variable declaration, so cast it accordingly
                    JCTree.JCVariableDecl var = (JCTree.JCVariableDecl) blockNode;
                    // The initializer (whatever follows the = sign)
                    JCTree.JCExpression initializer = var.getInitializer();
                    // Only plain literals are handled; expressions like these are skipped:
                    //   "" + 1
                    //   new String("new string")
                    if ((initializer != null) && (initializer instanceof JCTree.JCLiteral)) {
                        JCTree.JCLiteral lit = (JCTree.JCLiteral) initializer;
                        // The current value of the literal
                        String value = lit.getValue().toString();
                        try {
                            MessageDigest md = MessageDigest.getInstance(hashed.method());
                            // Hash the UTF-8 bytes of the string
                            md.update(value.getBytes("UTF-8"));
                            byte[] hash = md.digest();
                            StringBuilder str = new StringBuilder(hash.length * 2);
                            for (byte val : hash) {
                                str.append(String.format("%02X", val & 0xFF));
                            }
                            value = str.toString();
                            // Replace the initializer with a literal holding the hash
                            lit = maker.Literal(value);
                            var.init = lit;
                        } catch (NoSuchAlgorithmException e) {
                            // TODO: report the unsupported digest algorithm
                        } catch (UnsupportedEncodingException e) {
                            // TODO: is this even possible??
                        }
                    } else {
                        // TODO: report that only string literals are supported
                    }
                }
            }
        }
        // The abridged listing omits the return; true marks the annotation as processed
        return true;
    }

In the method we iterate over the list of annotations (remember that, in general, a processor can handle more than one annotation) and select the list of elements for each of them. Then the magic begins. We use tools from com.sun.tools.javac to convert the elements into a source tree, which offers a huge number of possibilities and, traditionally, a complete lack of Russian-language documentation. So please do not be surprised that the code working with this tree is far from ideal.
Once we have the variable declaration as a tree node, JCTree.JCVariableDecl var, we can make sure that it is a string variable. In my case this check is a hack:

    if (!"String".equals(var.vartype.toString())) {
        // TODO: report that the annotation applies only to String fields
        continue;
    }

vartype is the type of the field. It can surely be compared with some constant or checked for membership in a particular class, but, as I said, there is no documentation, and a quick check showed that converting it to a string gives the name of the type.
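One possible alternative to comparing type names as strings is the standard javax.lang.model API, where Types.isSameType can compare the type of a field with the TypeMirror of java.lang.String. The sketch below is an assumption about how such a check could look, not the approach used in the article's code. StringFieldFinder and findStringFields are hypothetical names, and the in-memory compilation through javax.tools is there only to make the example runnable (it needs a JDK, not just a JRE).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.lang.model.element.TypeElement;
import javax.lang.model.type.TypeMirror;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical processor: collects the names of all fields whose type is java.lang.String.
@SupportedAnnotationTypes("*")
class StringFieldFinder extends AbstractProcessor {
    final List<String> stringFields = new ArrayList<>();

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        TypeMirror stringType = processingEnv.getElementUtils()
                .getTypeElement("java.lang.String").asType();
        for (Element root : roundEnv.getRootElements()) {
            for (Element member : root.getEnclosedElements()) {
                // Compare types through the model instead of comparing their names
                if (member.getKind() == ElementKind.FIELD
                        && processingEnv.getTypeUtils().isSameType(member.asType(), stringType)) {
                    stringFields.add(member.getSimpleName().toString());
                }
            }
        }
        return false;
    }

    /** Runs the processor over one in-memory source file and returns the String fields found. */
    static List<String> findStringFields(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        StringFieldFinder finder = new StringFieldFinder();
        // -proc:only runs annotation processing without generating class files
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, null, List.of("-proc:only"), null, List.of(file));
        task.setProcessors(List.of(finder));
        task.call();
        return finder.stringFields;
    }
}
```

Unlike the string comparison, this check does not depend on whether the source spells the type as String or java.lang.String.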

The second interesting point is that we can only process declarations like the example at the very beginning of the article. The thing is that at this stage we are working with the source text. So if the variable is initialized in a constructor, JCTree.JCExpression initializer = var.getInitializer(); returns null. Things are no better if we try to process constructs like these:

    public String demo1 = new String("habrahabr");
    public String demo2 = "habra" + "habr";
    public String demo3 = "" + 1;

This is why the second check, (initializer instanceof JCTree.JCLiteral), is there. It filters out all the examples above, since they are not pure literals and are represented in the tree as an expression made of several elements.
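To see how javac classifies different initializers, the following sketch uses the public Compiler Tree API (com.sun.source) rather than the internal JCTree classes from the article. InitializerKinds and inspect are hypothetical names, and the in-memory compilation harness is only there to make the example runnable on a JDK.

```java
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.Tree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.Trees;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.ElementKind;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical processor: records the tree kind of every field initializer.
@SupportedAnnotationTypes("*")
class InitializerKinds extends AbstractProcessor {
    final Map<String, String> kinds = new LinkedHashMap<>();

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        Trees trees = Trees.instance(processingEnv);
        for (Element root : roundEnv.getRootElements()) {
            for (Element member : root.getEnclosedElements()) {
                if (member.getKind() != ElementKind.FIELD) {
                    continue;
                }
                Tree tree = trees.getTree(member);
                if (tree instanceof VariableTree) {
                    ExpressionTree init = ((VariableTree) tree).getInitializer();
                    // A field without an initializer has a null initializer tree
                    kinds.put(member.getSimpleName().toString(),
                            init == null ? "none" : init.getKind().name());
                }
            }
        }
        return false;
    }

    /** Runs the processor over one in-memory source file and returns field name -> initializer kind. */
    static Map<String, String> inspect(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        InitializerKinds processor = new InitializerKinds();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, null, List.of("-proc:only"), null, List.of(file));
        task.setProcessors(List.of(processor));
        task.call();
        return processor.kinds;
    }
}
```

For a field declared as String a = "habr"; the reported kind is STRING_LITERAL, while new String("habrahabr") is reported as NEW_CLASS and "" + 1 as PLUS.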
The rest of the code is obvious. Take the string, hash it, replace it, rejoice? No.
The comments mark several places where errors are clearly possible, and in our case ignoring them is not the right behavior. To inform the user about an error we need a javax.annotation.processing.Messager object. It lets us emit a warning, a compilation error, or just an informational message. For example, we can report an invalid hash algorithm:

    catch (NoSuchAlgorithmException e) {
        javacProcessingEnv.getMessager().printMessage(
                Diagnostic.Kind.ERROR,
                String.format("Unsupported digest method %s", hashed.method()),
                field);
    }

Note that emitting an error message does not interrupt the method. The compiler waits at least until our method finishes before aborting the compilation. This lets us report all the errors in annotation usage to the user at once. The third argument of printMessage lets us specify the element we stumbled on. It is optional, but it makes life much easier.

Connecting the annotation processor

It remains to tell the compiler that we exist and are ready to tear annotations apart. Many articles give instructions on how to add your processor to <development environment name>. Apparently this goes back to the distant past, when such things were hand-crafted by enthusiasts. However, the annotation processing engine has been part of javac for quite some time, and our processor class is in fact a javac plugin. This means we can plug our annotation into any environment by standard means, without any shamanism in the settings.
We need to create a services subdirectory in the META-INF directory and, in it, a file named javax.annotation.processing.Processor. The file itself must list our processor classes, one per line. In our case that is com.example.annotation.HashedAnnotationProcessor. And that's all. Now we build our library containing the annotation and its processor, add the library to the project, and it works.
Neither the library itself nor any trace of the annotations ends up in the compiled code.
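To make the layout of the registration file concrete, here is a small sketch that writes it under a given resources root and reads it back. ProcessorRegistration, register and registered are hypothetical names used only for illustration; in a real project the file is normally created by hand or by the build.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that creates the processor registration file.
final class ProcessorRegistration {
    // Location of the file relative to the root of the library's resources
    static final String SERVICE_FILE =
            "META-INF/services/javax.annotation.processing.Processor";

    /** Writes one fully qualified processor class name per line and returns the file path. */
    static Path register(Path resourcesRoot, String... processorClassNames) {
        try {
            Path file = resourcesRoot.resolve(SERVICE_FILE);
            Files.createDirectories(file.getParent());
            return Files.write(file, Arrays.asList(processorClassNames), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the registered processor class names back from the file. */
    static List<String> registered(Path resourcesRoot) {
        try {
            return Files.readAllLines(resourcesRoot.resolve(SERVICE_FILE), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling register(root, "com.example.annotation.HashedAnnotationProcessor") produces a single-line file at root/META-INF/services/javax.annotation.processing.Processor.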

Usage

The annotation is ready. Strings are hashed. But the problem is still not solved.
If we add the annotation to the project as it is, strings will always be hashed. We need that only in release builds.
In Java, the notion of debug and release builds is quite loose and depends on the user's point of view. So we will make sure that the assembleDebug task of the Android project does not hash strings, while in all other cases the strings are replaced by their MD5 hashes.
To solve this, we will pass an additional parameter to our annotation processor.
First, let's refine the processor:

    @SupportedOptions({"Hashed"})
    public class HashedAnnotationProcessor extends AbstractProcessor {
        // Name of the option passed to the processor as -AHashed=...
        private static final String ENABLE_OPTIONS_NAME = "Hashed";
        private boolean enable = true;

        @Override
        public void init(ProcessingEnvironment procEnv) {
            // ... initialization shown earlier (super.init, javacProcessingEnv, maker) ...
            java.util.Map<java.lang.String, java.lang.String> opt = javacProcessingEnv.getOptions();
            // "disable".equals(...) also copes with an option passed without a value
            if (opt.containsKey(ENABLE_OPTIONS_NAME) && "disable".equals(opt.get(ENABLE_OPTIONS_NAME))) {
                enable = false;
            }
        }

        @Override
        public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
            if (!enable) {
                javacProcessingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE, "Annotation Hashed is disable");
                return false;
            }
            //...
        }
    }

We declared that we expect the "Hashed" option; if it is set to "disable", we do nothing and print a note for the user. Messages of kind Diagnostic.Kind.NOTE are informational, and with default settings many development environments do not show them at all.
At the same time we tell the compiler that we did not process the annotation. If there are other processors in the system that handle annotations of this type, or that do not care about the type at all, they can receive our annotation. Admittedly, I can say absolutely nothing about the order in which the compiler will offer the annotation to processors. While we have only our library and exactly one annotation this is irrelevant, but be prepared for pitfalls when using several annotation libraries.
It remains to pass this option to the compiler. Options for processors are passed to the compiler with the "-A" key. In our case, "-AHashed=disable".
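The path of an "-A" option from the command line to the processor can be sketched with the standard javax.tools API. OptionProbe and compileWith are hypothetical names; the probe only records the value it receives for "Hashed" and is not the article's processor. Running it requires a JDK.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedOptions;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical processor that only records the value of the "Hashed" option.
@SupportedAnnotationTypes("*")
@SupportedOptions("Hashed")
class OptionProbe extends AbstractProcessor {
    String hashedOption; // value of -AHashed=..., or null when the option is absent

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public synchronized void init(ProcessingEnvironment env) {
        super.init(env);
        // -Akey=value arrives here as the map entry key -> value
        hashedOption = env.getOptions().get("Hashed");
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        return false;
    }

    /** Compiles an empty in-memory class with the given javac arguments and returns the option seen. */
    static String compileWith(String... compilerArgs) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Empty.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return "class Empty {}";
            }
        };
        List<String> options = new ArrayList<>(List.of(compilerArgs));
        options.add("-proc:only"); // run processors only, generate no class files
        OptionProbe probe = new OptionProbe();
        JavaCompiler.CompilationTask task =
                compiler.getTask(null, null, null, options, null, List.of(file));
        task.setProcessors(List.of(probe));
        task.call();
        return probe.hashedOption;
    }
}
```

With compileWith("-AHashed=disable") the probe sees the value "disable"; without the argument the map has no "Hashed" entry and the field stays null.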
All that is left is to make Gradle pass this option at the right moment. And again, a hack:

    tasks.withType(JavaCompile) {
        if (name == "compileDebug") {
            options.compilerArgs << "-AHashed=disable"
        }
    }

This is for the current version of Android Studio. In earlier versions, use tasks.withType(Compile).
It is a hack because this block is invoked for every build type, regardless of the task. In theory there should be something similar to buildTypes from the android block, but I no longer had the energy to look for an elegant solution. Has everyone already guessed that, as usual, there is no documentation in Russian?
In code, the annotations might look like this:

    @Hashed
    public static final String demo1 = "habr";

    @Hashed(method="SHA-1")
    public static final String demo2 = "habrahabr";

    @Hashed(method="SHA-256")
    public static final String demo3 = "habracadabra";

The method can be any algorithm supported by MessageDigest.
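Which algorithm names are accepted depends on the security providers installed in the JVM that runs the compiler. A minimal sketch for checking a name or listing the available ones is shown below; DigestAlgorithms, isSupported and available are hypothetical names.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Security;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for inspecting the digest algorithms of the current JVM.
final class DigestAlgorithms {
    /** Returns true if MessageDigest.getInstance accepts the given algorithm name. */
    static boolean isSupported(String algorithm) {
        try {
            MessageDigest.getInstance(algorithm);
            return true;
        } catch (NoSuchAlgorithmException e) {
            return false;
        }
    }

    /** Returns the sorted names of all MessageDigest algorithms offered by installed providers. */
    static Set<String> available() {
        return new TreeSet<>(Security.getAlgorithms("MessageDigest"));
    }

    public static void main(String[] args) {
        System.out.println(available());
    }
}
```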

Summary

Problem solved. Granted, only for one very specific way of declaring constants, and not in the most efficient way, and for many readers the very statement of the problem will raise more questions than the article itself. I just hope that someone will spend less time and nerves when they run into a similar task.
Even more, I hope that someone will get interested in the topic and that Habr will see articles explaining why all this magic works.
And, of course, the promised code: GitHub :: DemoAnnotation

Source: https://habr.com/ru/post/200878/
