I want to share my experience of rewriting code from Java to C++ on the Android platform, and what came of it.
For a small home project I used the Viola-Jones face detection algorithm. The Java sources and the model were taken from code.google.com/p/jviolajones, with a slight modification: two classes were added, Point and Rectangle. To explain why I did not use OpenCV for Android: it requires installing a separate library application, which would have been very inconvenient in my case, and in my experiments it crashed without warning. I did not spend long on that or on searching for other libraries, and decided to take the simplest ready-made implementation.
The algorithm's speed was deplorable: a 400 by 300 photo took 54 seconds on my old, broken GT-I9300I, and even longer, 250 seconds, on an AVD (virtual device).
I have often come across discussions comparing the speed of Java and C++ code. Some showed Java lagging behind, some even showed the opposite, and they usually cited small code fragments with a single loop. Here the algorithm is somewhat more complex, with on the order of 6 nested loops, as you can see from the source code. So I decided to try the rewrite to C++ and see for myself. From all the articles I had read, I got the impression that the speed would increase by 20 percent at most, but that turned out to be wrong.
Two tasks naturally arose: how to pass input data in and get output data back, and how to rewrite the code. I decided to leave the loading of the XML model in the Detector constructor in Java. It is not fast, of course, but since working with XML in C++ sounds very scary to me, I left it as it is. My professional work is in Java; I dealt with C/C++ only at university and a little at work on old projects. So I had to study some documentation, read articles, and collect a few bruises along the way.
Rewriting the logic. There were no particular problems. The method was simple: copy the classes without looking, and wherever Eclipse highlighted something in red, fix it with a hatchet. All ArrayLists were converted to arrays; fortunately, they never changed size.
I will not describe setting up the environment for calling native code; there are plenty of articles on that topic.
How to pass data. With simple types (int, float, boolean) everything is simple and clear. With one-dimensional arrays it seems easy too:

    JNIEXPORT jint JNICALL
    Java_com_example_Computations_intFromJni(JNIEnv* env, jobject thiz, jintArray arr) {
        jsize d = env->GetArrayLength(arr);           // number of elements
        jboolean j;                                   // set to JNI_TRUE if p points to a copy
        jint* p = env->GetIntArrayElements(arr, &j);  // pointer to the array elements
        ...
    }
With a two-dimensional array it is a bit more complicated:

    JNIEXPORT jint JNICALL
    Java_com_example_Computations_findFaces(JNIEnv* env, jobject thiz, jobjectArray image) {
        int width = env->GetArrayLength(image);
        jboolean j2;
        // The first row is used only to find out the row length.
        jintArray dim = (jintArray)env->GetObjectArrayElement(image, 0);
        int height = env->GetArrayLength(dim);
        int** imageLocal = new int*[width];
        for (int i = 0; i < width; i++) {
            jintArray oneDim = (jintArray)env->GetObjectArrayElement(image, i);
            jint* element = env->GetIntArrayElements(oneDim, &j2);
            // Copy the row into native memory.
            imageLocal[i] = new int[height];
            for (int j = 0; j < height; ++j) {
                imageLocal[i][j] = element[j];
            }
        }
        ...
    }
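The row-by-row copy above allocates every row with new, and that memory then has to be freed by hand. As a hedged alternative (this is not what the original sources do), the pixels could be kept in one contiguous std::vector, which releases its memory automatically. The names Image2D and copyRow are hypothetical. In a JNI function, copyRow would presumably be called with the pointer returned by GetIntArrayElements, assuming jint is a plain int, as it is on Android.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: a width x height image stored in one contiguous buffer.
// Indexing follows the article's convention image[x][y], with x < width.
class Image2D {
public:
    Image2D(std::size_t width, std::size_t height)
        : width_(width), height_(height), data_(width * height, 0) {}

    std::size_t width() const { return width_; }
    std::size_t height() const { return height_; }

    // Copy one source row (height values) into position x.
    void copyRow(std::size_t x, const int* src) {
        assert(x < width_);
        for (std::size_t y = 0; y < height_; ++y) {
            data_[x * height_ + y] = src[y];
        }
    }

    // Bounds-checked (in debug builds) read access.
    int at(std::size_t x, std::size_t y) const {
        assert(x < width_ && y < height_);
        return data_[x * height_ + y];
    }

private:
    std::size_t width_;
    std::size_t height_;
    std::vector<int> data_;  // freed automatically when Image2D goes out of scope
};
```

With this layout there is a single allocation per image instead of one per row, and nothing to delete manually.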
Next, how to pass objects that have a bunch of fields, some of which are of type List. To get a field of an object, the following construction is used:

    jclass clsDetector = env->GetObjectClass(objDetector);
    // "size" is a field of type detection.Point
    jfieldID sizeFieldId = env->GetFieldID(clsDetector, "size", "Ldetection/Point;");
    jobject pointObj = env->GetObjectField(objDetector, sizeFieldId);
For lists, we need two methods, get and size:

    jfieldID stagesFieldId = env->GetFieldID(clsDetector, "stages", "Ljava/util/List;");
    jobject stagesList = env->GetObjectField(objDetector, stagesFieldId);
    jclass listClass = env->FindClass("java/util/List");
    jmethodID getMethodIDList = env->GetMethodID(listClass, "get", "(I)Ljava/lang/Object;");
    jmethodID sizeMethodIDList = env->GetMethodID(listClass, "size", "()I");
    int listStagesCount = (int)env->CallIntMethod(stagesList, sizeMethodIDList);
    for (int i = 0; i < listStagesCount; ++i) {
        // Each call returns a new local reference.
        jobject stage = env->CallObjectMethod(stagesList, getMethodIDList, i);
        ...
    }
Now we know how to get the data. We run it, and it crashes with the error "Local reference table overflow 512 entries". It turns out that all local references (jclass and jobject) have to be deleted once they are no longer needed, like this:

    env->DeleteLocalRef(jcls);

The same goes for array elements:

    // JNI_ABORT frees the buffer without copying changes back to the Java array
    env->ReleaseIntArrayElements(oneDim, element, JNI_ABORT);
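Remembering a DeleteLocalRef call for every reference is easy to get wrong inside nested loops. One possible way to make it automatic is a small RAII guard; the sketch below is an assumption of mine, not part of the original sources, and the name LocalRefGuard is hypothetical. It is written as a template so that it compiles without jni.h; with real JNI, Env would be JNIEnv and Ref would be jobject or jclass.

```cpp
#include <cassert>

// Hypothetical RAII guard: deletes a JNI local reference when it goes out of scope.
// Env must provide DeleteLocalRef(Ref), as JNIEnv does in C++.
template <typename Env, typename Ref>
class LocalRefGuard {
public:
    LocalRefGuard(Env* env, Ref ref) : env_(env), ref_(ref) {
        assert(env_ != nullptr);
    }

    ~LocalRefGuard() {
        if (ref_ != Ref()) {  // skip null references
            env_->DeleteLocalRef(ref_);
        }
    }

    // Non-copyable, so each reference is deleted exactly once.
    LocalRefGuard(const LocalRefGuard&) = delete;
    LocalRefGuard& operator=(const LocalRefGuard&) = delete;

    Ref get() const { return ref_; }

private:
    Env* env_;
    Ref ref_;
};

// Intended use inside the loop over stages (requires jni.h, shown as a comment only):
//   LocalRefGuard<JNIEnv, jobject> stage(env,
//       env->CallObjectMethod(stagesList, getMethodIDList, i));
//   ... work with stage.get() ...
//   // the reference is deleted at the end of each iteration
```

The guard only covers local references; array elements still need a matching ReleaseIntArrayElements call.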
Returning the result. To keep things simple, the result is returned as an array of Rectangle:

    jclass cls = env->FindClass("detection/Rectangle");
    jobjectArray jobAr = env->NewObjectArray(faces->currIndex, cls, NULL);
    // Constructor Rectangle(int, int, int, int)
    jmethodID constructor = env->GetMethodID(cls, "<init>", "(IIII)V");
    for (int i = 0; i < faces->currIndex; i++) {
        Rectangle* re = faces->rects[i];
        jobject object = env->NewObject(cls, constructor, re->x, re->y, re->width, re->height);
        env->SetObjectArrayElement(jobAr, i, object);
        env->DeleteLocalRef(object);  // the array now holds its own reference
    }
    return jobAr;
So, the solemn moment: the search on the same photo takes 14 seconds, i.e. 4 times faster, with similar results on other photos. On the virtual Android device it is 132 seconds versus 300 seconds. But as we know, the result of a single experiment cannot be relied on; it has to be repeated several times. Below are the processing times for one photo, in seconds.
Virtual device, Java (s) | Virtual device, C++ (s) | Galaxy phone, Java (s) | Galaxy phone, C++ (s) |
---|---|---|---|
238 | 132 | 84 | 14 |
318 | 137 | 54 | 14 |
472 | 135 | 54 | 14 |
264 | 150 | 54 | 14 |
266 | 138 | 54 | 14 |
262 | 129 | 53 | 14 |
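To turn the repeated runs into a single figure, the columns can be averaged and compared. The following is a minimal sketch of that arithmetic; the function names meanSeconds and speedup are my own, and the numbers are copied from the table.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of timings, in seconds.
double meanSeconds(const std::vector<double>& samples) {
    assert(!samples.empty());
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// Speedup factor: ratio of the average "before" time to the average "after" time.
double speedup(const std::vector<double>& before, const std::vector<double>& after) {
    return meanSeconds(before) / meanSeconds(after);
}

// Timings copied from the table, one vector per column.
const std::vector<double> kAvdJava   = {238, 318, 472, 264, 266, 262};
const std::vector<double> kAvdCpp    = {132, 137, 135, 150, 138, 129};
const std::vector<double> kPhoneJava = {84, 54, 54, 54, 54, 53};
const std::vector<double> kPhoneCpp  = {14, 14, 14, 14, 14, 14};
```

On these numbers, speedup(kPhoneJava, kPhoneCpp) comes out at about 4.2 and speedup(kAvdJava, kAvdCpp) at about 2.2.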
In conclusion, I will note that although the rewrite gave a big speedup, it is still far from perfect. Multithreading could be used, which I plan to study in the near future. And making changes to the algorithm itself is probably the hardest part.
Update: I have posted the source. Use it as follows:

    Detector detector = Detector.create(inputHaas);
    List<Rectangle> res = detector.getFaces(background_image, 1.2f, 1.1f, .05f, 2, true, useCpp);

inputHaas is the model stream, i.e. the haarcascade_frontalface_default.xml file from the original algorithm; useCpp selects whether to use C++ or not. In the C++ sources I do not free memory, since I wrote them in a hurry.