
Integrating Copy-Paste Detection into Xcode, and more

Tonight, during yet another code review in our projects, I came across a large specimen of the purest, crystallized copy-paste. I did not like it at all, and a question surfaced immediately: "Is there a lot of copy-paste in our projects?" Google is my friend, so a solution turned up quickly at jkennedy1980, who used CPD (Copy/Paste Detector), which is part of PMD (Pretty Much Done || Project Mess Detector || Programming Mistake Detector || ...). Out of the box, CPD finds copy-paste in a number of languages (cpp, cs, java, php, ruby, ecmascript) and is relatively easy to extend, but I also needed Objective-C. That is exactly the variant jkennedy1980 had, using CPD in an automated Jenkins build. This works very well for any project in any language when Jenkins is built into the development process, all permissions are set up, and everyone knows where, when and what to click. When the developers do not know about Jenkins, or know about it but it lives somewhere far away, this approach, to put it mildly, does not fit. Xcode is, after all, somewhat closer to iPhone / iOS developers, and although it is still impossible to write a plugin for it, CPD can still be hooked into it.

A short digression


I should say up front that some of the CPD setup steps duplicate what is written in the original by jkennedy1980. So if you are interested in Jenkins + CPD integration, you can go straight here. If you are more interested in Xcode + CPD integration, please stay with us.
For those who are too lazy to read about the setup and want to try everything at once: a link to the Xcode project, in which "everything has already been stolen before us".

Downloading CPD


Download PMD, which contains the CPD we actually need. In my case, I downloaded version 4.6.2.

Downloading the Objective-C Tokenizer


CPD does not support Objective-C out of the box, but thanks to Mike Hall, who wrote an Objective-C grammar for JavaCC, and once again to jkennedy1980, we can get a wonderful Objective-C tokenizer that extends CPD with one more programming language. Download it from GitHub.

Nothing extra


Put the downloaded files into one folder and try CPD on some victim project:
java \
  -Xmx512m \
  -classpath pmd-4.2.5.jar:ObjCLanguage-0.0.1-SNAPSHOT.jar \
  net.sourceforge.pmd.cpd.CPD \
  --minimum-tokens 100 \
  --files [Path to XCode project classes] \
  --language ObjectiveC \
  --encoding UTF-8 \
  --format net.sourceforge.pmd.cpd.XMLRenderer > cpd-output.xml


The parameters are mostly self-explanatory; the only one that raises questions at first glance is minimum-tokens. In short, it is the minimum number of consecutive repeated tokens at which a piece of code counts as copy-pasted. Set it to 1 and the whole code base becomes one big copy-paste; set it to 100500 and most likely nothing will be found. The value of 100 was chosen empirically, with coefficients taken, as the saying goes, from the ceiling.
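To get a feel for what the threshold does, here is a toy sketch in Python. It is not CPD's matching algorithm: the regex tokenizer and the helper names tokenize and repeated_windows are made up for illustration, and the sketch only reports token windows of exactly minimum_tokens length that occur more than once.

```python
import re
from collections import defaultdict


def tokenize(source):
    """Crude tokenizer (an assumption, not CPD's): words/numbers or single symbols."""
    return re.findall(r"\w+|[^\w\s]", source)


def repeated_windows(tokens, minimum_tokens):
    """Map each window of `minimum_tokens` tokens that occurs more than once
    to the list of positions where it starts."""
    positions = defaultdict(list)
    for start in range(len(tokens) - minimum_tokens + 1):
        window = tuple(tokens[start:start + minimum_tokens])
        positions[window].append(start)
    return {window: starts for window, starts in positions.items() if len(starts) > 1}


# Hypothetical snippet with one statement pasted twice.
snippet = 'NSLog(@"a"); NSLog(@"a"); return 0;'
tokens = tokenize(snippet)

for threshold in (1, 8, 100500):
    found = repeated_windows(tokens, threshold)
    print(threshold, "->", len(found), "repeated window(s)")
```

With a threshold of 1 nearly every token counts as a duplicate, while a threshold longer than the pasted statement finds nothing, which is the trade-off described above.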

After a successful run, we get a cpd-output.xml that looks like this:

 <duplication lines="23" tokens="110">
 <file line="20" path="/.../CPDObjective-C/CopyPastedFiles/AnotherSimpleClass.m"/>
 <file line="13" path="/../CPDObjective-C/CopyPastedFiles/SimpleClass.m"/>
 <codefragment>
 <![CDATA[
 - (void)someMethod {

    for (int i = 0; i < 10; i++) {
       for (int j = 0; j < 10; j++) {
          NSLog(@"This is incorrect");
          NSLog(@"This is incorrect");
          NSLog(@"This is incorrect");
          NSLog(@"This is incorrect");
          ...


We will take data from it to display it in Xcode.
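As a rough illustration of pulling that data out, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The function name parse_cpd_report, the pmd-cpd root element in the sample and the /tmp/demo paths are assumptions made for the example; only the duplication and file elements and their attributes come from the report fragment above.

```python
import xml.etree.ElementTree as ET


def parse_cpd_report(xml_text):
    """Return one dict per <duplication>: size in lines/tokens and every occurrence."""
    root = ET.fromstring(xml_text)
    duplications = []
    # iter() finds <duplication> at any depth, so the root element name does not matter.
    for node in root.iter("duplication"):
        duplications.append({
            "lines": int(node.get("lines")),
            "tokens": int(node.get("tokens")),
            "files": [(f.get("path"), int(f.get("line"))) for f in node.findall("file")],
        })
    return duplications


# Hypothetical, shortened report; the root element name and paths are assumed.
SAMPLE = """<pmd-cpd>
  <duplication lines="23" tokens="110">
    <file line="20" path="/tmp/demo/AnotherSimpleClass.m"/>
    <file line="13" path="/tmp/demo/SimpleClass.m"/>
    <codefragment><![CDATA[- (void)someMethod { }]]></codefragment>
  </duplication>
</pmd-cpd>"""

for duplication in parse_cpd_report(SAMPLE):
    print(duplication)
```

For a report saved on disk, ET.parse("cpd-output.xml").getroot() could replace the ET.fromstring call.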

Xcode Integration


To integrate Xcode and CPD, add a Run Script phase to the Build Phases of the project target. The script conceptually consists of three parts:
  1. The CPD call itself
  2. Parsing cpd-output.xml
  3. Output in the "correct format"
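The first of these parts could be sketched in Python roughly as follows. This is not the script from the linked project: build_cpd_command and run_cpd are hypothetical helpers that simply reassemble the command line shown earlier, and the sketch assumes java is on the PATH and that CPD writes its report to standard output.

```python
import subprocess


def build_cpd_command(source_dir, classpath_jars, minimum_tokens=100):
    """Assemble the same CPD command line as in the shell example."""
    return [
        "java", "-Xmx512m",
        "-classpath", ":".join(classpath_jars),
        "net.sourceforge.pmd.cpd.CPD",
        "--minimum-tokens", str(minimum_tokens),
        "--files", source_dir,
        "--language", "ObjectiveC",
        "--encoding", "UTF-8",
        "--format", "net.sourceforge.pmd.cpd.XMLRenderer",
    ]


def run_cpd(source_dir, classpath_jars, output_path="cpd-output.xml", minimum_tokens=100):
    """Run CPD and redirect its standard output into `output_path`.

    Returns the process exit code; how CPD uses exit codes is not assumed here,
    so the caller decides what to do with it.
    """
    command = build_cpd_command(source_dir, classpath_jars, minimum_tokens)
    with open(output_path, "wb") as report:
        completed = subprocess.run(command, stdout=report)
    return completed.returncode
```

Inside a Run Script phase the source directory would probably come from an Xcode build setting such as SRCROOT, for example run_cpd(os.environ["SRCROOT"], ["pmd-4.2.5.jar", "ObjCLanguage-0.0.1-SNAPSHOT.jar"]) after an import os.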

I will get to the XML parser a little later; first, a few words about the "correct format". Here it means the format in which a Run Script phase can print messages so that Xcode picks them up and processes them. In general, it looks like this:
[full-path-to-file]:[line-number]:[column-number]: warning: [Message]
[full-path-to-file]:[line-number]:[column-number]: error: [Message]

The script's job thus boils down to converting cpd-output.xml into the "correct format".
Writing a script or program that reads the source XML and prints the lines we need is easy; nevertheless, as an example, I am posting a project that, after building, runs its own compiled binary to check its own sources for copy-paste.
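For readers who prefer a script, the formatting step might look like the Python sketch below. It is an illustration, not the code from the posted project: the function name xcode_warnings, the message wording and the /tmp/demo paths are invented, and since the CPD report fragment shows no column number, column 1 is assumed.

```python
def xcode_warnings(files, lines, tokens):
    """Build one Xcode-style warning per occurrence of a duplicated fragment.

    `files` is a list of (path, line) pairs for the same duplication;
    each warning names the other places where the fragment occurs.
    """
    messages = []
    for path, line in files:
        others = [f"{p}:{l}" for p, l in files if (p, l) != (path, line)]
        messages.append(
            # Column is fixed to 1 (assumption: no column is available from CPD).
            f"{path}:{line}:1: warning: Copy-paste: {lines} lines "
            f"({tokens} tokens) duplicated in {', '.join(others)}"
        )
    return messages


# Hypothetical duplication record with made-up paths.
occurrences = [("/tmp/demo/AnotherSimpleClass.m", 20), ("/tmp/demo/SimpleClass.m", 13)]
for message in xcode_warnings(occurrences, lines=23, tokens=110):
    print(message)
```

Printing these lines from the Run Script phase is what makes them show up as warnings; swapping "warning" for "error" would follow the second form of the format.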

And here are a couple of screenshots of how it looks in practice.
In the Issue Navigator:


Inside the file:


P.S.


Most often, copy-paste is an evil that must be fought. Sometimes it is not worth fighting. Anything can happen. But if you do have to, you now have a way to integrate CPD into Xcode.

P.P.S.


The "correct format" for Xcode was found by chance on the Internet. If you know what else can be printed from a Run Script phase so that Xcode processes the result, please share it in the comments.

The last word


The code in the project is not a nominee for the "Best Code of the Year" award. The author writes for Mac OS X once in a blue moon and simply wanted to show a way of extending Xcode's functionality. The author also deliberately made it so that, with some probability, the project may fail to build on some systems. This is not a bug, it is a feature (c).

Source: https://habr.com/ru/post/137875/

