Optimize SQL insertion speed on Android devices

Good day.

During the development of my project for Android, I encountered the task of updating a large number of rows in an SQLite database directly on the device.

The initial solution in the forehead gave terribly slow results, because update had more than 40,000 lines. About how I improved the performance of these updates rows in the database, and the story goes.

More detailed description of the task:

The application for Android was distributed with a SQLite base inside (inside the APK, in assets). The database was information on the cities. But the information is basic, language-independent, and language-dependent fields were only in English.
To distribute the program with all languages would be unrealistic, since each language in the database would add the original APK file to the installer from 1-2 MB. And languages supported 11.
')
Therefore, language patches were invented to the database, which would be downloaded from the Internet (from the server), and rolled onto the base on the device.
The patch was a text file clamped by gzip, each line of which contained values separated by a tab (\ t).

There were no problems with downloading. It is fast. But the bottleneck of this scheme is the insertion into the base on the device.
I wrote the first version in Java using the well-known SQLite capabilities for working with SQLite.
It was necessary to update 3 fields in the line (add values from the patch). Here are just such lines were from 20,000 to 60,000, depending on the language.

The first version of Java code

The first version looked something like this (I don’t pretend to the accuracy of the code, there is nothing left from the first versions, since they were all rewritten and were not saved anywhere. I cite the code to display ideas and bottlenecks so that people do not repeat the mistakes I made ).

try { buffRead = new BufferedReader(fileIn, (1000*1024)); // 1000 KB String line; int lineNum = 0; while ((line = buffRead.readLine()) != null) { try { String[] values = line.split("\t"); if (values.length < 2) // cause 3rd value van be empty { // some error, try next line continue; } int idInt = Integer.valueOf(values[0]); String name = values[1]; getDb().execSql("update " + getTableNabe() + " set " + lang.getColumnName() + " = ? where " + COLUMN_ID + " = ? ", new String[] { name, String.valueOf(idInt) }); getDb().execSql("update " + getTableNabe() + " set " + lang.getColumnSort() + " = ? where " + COLUMN_ID + " = ? ", new String[] { String.valueOf(lineNum++), String.valueOf(idInt) }); if (values.lengh == 3 && values[2].length != 0) { String data = values[2]; getDb().execSql("update "+ getTableNabe() + " set " + lang.getColumnData() + " = ? where " + COLUMN_ID + " = ? ", new String[] { data, String.valueOf(idInt) }); } } catch (NumberFormatException e) { e.printStackTrace(); return false; } catch (SQLException e) { e.printStackTrace(); return false; } createIndexOnLang(lang); }// end of while } catch (IOException e) { e.printStackTrace(); return false; } finally { if (buffRead != null) { try { buffRead.close(); } catch (IOException e) { e.printStackTrace(); } } } return true;

Well, naturally, such a solution worked super slowly. More precisely, to say that slowly is to say nothing. The code added lines to the database in tens of minutes.
The first thing that asked for was to add a transaction.

Added before cycle

  getDb().beginTransaction();

After cycle added

  getDb().setTransactionSuccessful();

And in block finaly

  getDb().endTransaction();

This gain is not great. Then I remembered that in Eclipse, after installing Android Developer Tools ( ADT ) there, there is a great prospect of DDMS , in which there is an excellent function of function profiling.

How to use it, you can read here . On the habr, by the way, did not find the articles describing this ADT functional (I could not search well, but I found only about memory analysis .)

Drawdown performance

With the help of this profiling mechanism, it became immediately apparent that the drawdown in performance was in my following places.
1. Spit terribly slow method. It may not be a revelation to anyone, but I was surprised.
2. Work with strings, in terms of pasting strings for queries. Inside execSql, as you can see, a bunch of new StringBuilders were made in the loop each time, which were then discarded as unnecessary. As written here , never in cycles, do not glue the lines by means of a plus (+). Use one prearranged StringBuilder. And even better, in general, prepare the lines in advance, before the cycle. What, in my case, is the most obvious improvement.
3. The work of the SQL itself within the android libraries. A lot of locks and unlocks were made there. After reading the documentation, I found a method for the setLockingEnabled database, setting it to false, we get a good speed increase.
4. Prepare a SQLiteStatement for each execSql call. This is also an expensive operation. A little nagging, I found that it was possible to prepare in advance, as well as the lines, in advance, and in the loop just by loading parameters into them, and then right there.

Second version of java code

Having solved all these problems, getting rid of split, removing SQLiteStatement training from a cycle, removing all work with strings from a cycle, adding getDb (). SetLockingEnabled (false) before a cycle, I got this option

 try { buffRead = new BufferedReader(fileIn, (1000*1024)); // 1000 KB String line; int lineNum = 0; checkDbErrors(); getDb().beginTransaction(); getDb().setLockingEnabled(false); // Prepare SQL queries String updateStatment = "update " + getTableName() + " set "; String whereStatment = " where " + COLUMN_ID + " = ?"; String updateNameSQL = updateStatment + lang.getColumns().getColumnName() + " = ? " + whereStatment; String updatqDataSQL = updateStatment + lang.getColumns().getColumnData() + " = ? " + whereStatment; String updatqSortSQL = updateStatment + lang.getColumns().getColumnSort() + " = ? " + whereStatment; SQLiteStatement updateName = getDb().compileStatement(updateNameSQL); SQLiteStatement updateData = getDb().compileStatement(updatqDataSQL); SQLiteStatement updateSort = getDb().compileStatement(updatqSortSQL); while ((line = buffRead.readLine()) != null) { try { int idInt = parseIdFromString(line); String name = parseNameFromString(line, line.indexOf('\t') + 1); String data= parseDataFromString(line, name.length() + 1); updateName.bindString(1, name); updateName.bindLong(2, idInt); updateName.execute(); if (data.length() != 0) { updateWiki.bindString(1, data); updateWiki.bindLong(2, idInt); updateWiki.execute(); } updateSort.bindLong(1, lineNum++); updateSort.bindLong(2, idInt); updateSort.execute(); } catch (NumberFormatException e) { e.printStackTrace(); return false; } catch (SQLException e) { e.printStackTrace(); return false; } } getDb().setTransactionSuccessful(); } catch (IOException e) { e.printStackTrace(); return false; } finally { getDb().endTransaction(); if (buffRead != null) { try { buffRead.close(); } catch (IOException e) { e.printStackTrace(); } } } return true;

Methods
parseIdFromString (String line),
parseNameFromString (String line, int from) and
parseDataFromString (String line, int from) is extremely clumsy, but works faster than the split version

Here they are, who are interested:

Helper code

  private int parseIdFromString(String line) { int ind = line.indexOf('\t'); if (ind == -1) { return 0; } String idStr = line.substring(0, ind); int length = idStr.length(); if (length == 0) { return 0; } int result = 0; int zero = '0'; for (int i = 0; i < length; ++i) { result += (idStr.charAt(i) - zero) * Math.pow(10, length - i - 1); } return result; } private String parseNameFromString(String line, int from) { int ind = line.indexOf('\t', from); if (ind == -1) { return new String(); } return line.substring(from, ind); } private String parseDataFromString(String line, int from) { int ind = line.indexOf('\t', from); if (ind == -1) { return new String(); } return line.substring(from, ind); }

As you can see in parseNameFromString there is even a frontal translation of the String to an int, but this clumsy version works faster than Integer.valueOf (verified through the profiler)

As a result, this update option in the database worked ten times faster than the initial one. Those. 43,000 lines update this algorithm did in about 1.5-2 minutes on HTC Desire

But, this result could not satisfy me. It’s not very cool for a program user to wait 2 minutes while we insert the necessary language into the database.

With Java, one could still poshamanit, but the speed could not be increased by orders of magnitude, because the results of the profiler run on the latest version of the code clearly showed that now the longest is done inside the native_execute () method from the source file Android SDK \ sources \ android- 14 \ android \ database \ sqlite \ SQLiteStatement.java. This is a native c ++ method.
But along with it, we lost a lot of time in some kind of magical method logTimeStat, the need for which I did not understand, and how to disable it, too. In addition, bindings in the workflow also did not work very fast, and indeed, this is Java ... what kind of performance it can be (sarcasm, I don’t have anything bad to Java)

We write everything in C ++

In the end, I decided not to bother with Java anymore and write my insert in C ++ (as they say with blackjack and ...). To collect by means of NDK and to call it from Java through JNI .

There is one problem with this idea: where can I get sqlite for NDK? Well, actually - elementary. Take the source file sqlite c of. site and just add it entirely to your libina under the NDK.

I will not write how to collect code under NDK, because there is a lot of information on Habré, and not only on it .

A small remark about the inclusion of SQLite sources in your library. While googling this topic, I found the official Google group of android developers (unfortunately the links were not preserved), on which they discussed options for working with the database through their native libs, in which some of their own SQLite versions were included (the versions are different). So there the official people from Android were not very approving of this practice, they said that in theory this could spoil the database, because on the device itself there could be some other version of SQLite, and that working from its own, and then from Java already standard means, with the same base, you can break it. But in my practice, the breaking of the base happened only when the program was forcibly terminated, at the time when my lib was updating the contents of the base. This case is rare, because it does not make the program for long. But for me this is not a critical case, because the base always lies in my asset and I can restore it at any fault, and ask the user to download the languages again and roll them again.
So, in the case when you need to work with SQLite databases from native code, people from Android advise you to work only from it, without touching these Java bases with Android tools. In this case, everything is guaranteed to be well with the databases, since you will work with them only with the version of SQLite that you have.

Let's go back to the story.
I decided to write my update. SQLite dragged. The C ++ code itself repeats all the ideas that have already been improved in Java code.
Also, this article helped me a lot. It describes in great detail how to increase the speed of insertion.

Coda got a lot, who are interested, can see

C ++ code

 std::vector<std::string>& split(const std::string &s, char delim, std::vector<std::string> &elems) { elems.clear(); std::stringstream ss(s); std::string item; while (std::getline(ss, item, delim)) { elems.push_back(item); } return elems; } std::string prepareUpdateStatment(std::string columnName, std::string columnValue, std::string id) { std::ostringstream constructor; constructor << "update cities set " << columnName << " = \"" << columnValue << "\" where _id = " << id; return constructor.str(); } std::string prepareUpdateStatmentForBind(std::string columnName) { std::ostringstream constructor; constructor << "update cities set " << columnName << " = ? where _id = ? "; return constructor.str(); } std::string getColumnName(std::string column, std::string lang) { std::ostringstream constructor; constructor << lang << "_" << column; return constructor.str(); } std::string parseInt(int i) { std::ostringstream ss; ss << i; return ss.str(); } bool pushToDBWithPreparedStatments(std::string line, sqlite3* db, std::string lang, int lineNum, sqlite3_stmt* stmtnUpdateName, sqlite3_stmt* stmtnUpdateSort, sqlite3_stmt* stmtnUpdateData) { if (line.size() == 0) { return true; // end of file } int error = SQLITE_OK; std::vector<std::string> elems; elems = split(line, '\t', elems); if (elems.size() < 2) { log("line parse error"); return false; } std::string& idStr = elems[0]; int idInt = atoi(idStr.c_str()); std::string& nameStr = elems[1]; sqlite3_bind_text(stmtnUpdateName, 1 , nameStr.c_str(), -1, SQLITE_STATIC); sqlite3_bind_int(stmtnUpdateName, 2 , idInt); if ((error = sqlite3_step(stmtnUpdateName)) != SQLITE_DONE) { logError(error, sqlite3_errmsg(db)); return false; } sqlite3_clear_bindings(stmtnUpdateName); sqlite3_reset(stmtnUpdateName); sqlite3_bind_int(stmtnUpdateSort, 1 , lineNum); sqlite3_bind_int(stmtnUpdateSort, 2 , idInt); if ((error = sqlite3_step(stmtnUpdateSort)) != SQLITE_DONE) { logError(error, sqlite3_errmsg(db)); return false; } sqlite3_clear_bindings(stmtnUpdateSort); sqlite3_reset(stmtnUpdateSort); if (elems.size() == 3) { std::string& DataStr = elems[2]; sqlite3_bind_text(stmtnUpdateData, 1 , DataStr.c_str(), -1, SQLITE_STATIC); sqlite3_bind_int(stmtnUpdateData, 2 , idInt); if ((error = sqlite3_step(stmtnUpdateData)) != SQLITE_DONE) { logError(error, sqlite3_errmsg(db)); return false; } sqlite3_clear_bindings(stmtnUpdateData); sqlite3_reset(stmtnUpdateData); } return true; } void parseAndUpdateDB(std::string databasePath, std::string patchPath, std::string lang) { time_t beforeStartTime = time(NULL); sqlite3* db; if (sqlite3_open_v2(databasePath.c_str(), &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_NOMUTEX | SQLITE_OPEN_PRIVATECACHE, NULL) != SQLITE_OK) { logError("Error wile opening db", sqlite3_errmsg(db)); return; } std::string line; std::ifstream myfile(patchPath.c_str()); if (!myfile.is_open()) { log("Error wile opening patch file"); return; } int lineNum = 0; int error = SQLITE_OK; // Begin transaction if ( (error = sqlite3_exec(db, "begin", NULL, NULL, NULL)) != SQLITE_OK) { logError(error, sqlite3_errmsg(db)); return; } sqlite3_stmt* stmtnUpdateName; std::string updateName = prepareUpdateStatmentForBind(getColumnName("name", lang)); if ( (error = sqlite3_prepare(db, updateName.c_str(), updateName.length(), &stmtnUpdateName, NULL)) != SQLITE_OK) { logError(error, sqlite3_errmsg(db)); return; } sqlite3_stmt * stmtnUpdateSort; std::string updateSort = prepareUpdateStatmentForBind(getColumnName("sort", lang)); if ( (error = sqlite3_prepare(db, updateSort.c_str(), updateSort.length(), &stmtnUpdateSort, NULL)) != SQLITE_OK) { logError(error, sqlite3_errmsg(db)); return; } sqlite3_stmt * stmtnUpdateData; std::string updateData = prepareUpdateStatmentForBind(getColumnName("data", lang)); if ( (error = sqlite3_prepare(db, updateData.c_str(), updateData.length(), &stmtnUpdateData, NULL)) != SQLITE_OK) { logError(error, sqlite3_errmsg(db)); return; } // For fast work sqlite3_exec(db, "PRAGMA synchronous = OFF", NULL, NULL, NULL); sqlite3_exec(db, "PRAGMA journal_mode = MEMORY", NULL, NULL, NULL); while ( myfile.good() ) { std::getline(myfile, line); if (!pushToDBWithPreparedStatments(line, db, lang, lineNum++, stmtnUpdateName, stmtnUpdateSort, stmtnUpdateData)) { break; } } sqlite3_finalize(stmtnUpdateName); sqlite3_finalize(stmtnUpdateSort); sqlite3_finalize(stmtnUpdateData); // End transaction if ( (error = sqlite3_exec(db, "end", NULL, NULL, NULL)) != SQLITE_OK) { logError(error, sqlite3_errmsg(db)); return; } sqlite3_close(db); myfile.close(); time_t afterFinishTime = time(NULL); int result = afterFinishTime- beforeStartTime; log("result of run is %d secs" , result); }

By the way, I wrote and debugged this code under Windows in Visual Studio, and then, I collected it under NDK, and it all worked in a magical way under Android.
The version of the code is not final, so you should not find fault with some not the best solutions. The essence of the code is to show how to do the same in Java as in C ++, but only in C ++ it will work many times faster.

So about the speed.

The same insertion of 43,000 lines, on the same HTC Desire under debugging (with Eclipse connected), worked about 43 seconds. Those. where is one line in 1 ms. If you cut off Eclipse and debug, you get a truly quick result in the region of 20-25 seconds. On more powerful devices like the HTC One S, the insertion process generally took about 10-15 seconds. What, in comparison with the original minutes, shows that all efforts to improve performance were not made in vain.

Morality

Using the example of my task, I showed how to speed up work with SQLite when developing for Android (up to the transition to the native level). I do not argue that there are plenty of options to do the same thing, but I think this information will help someone to make their applications even more responsive and quick.

UPD:
Thanks to_climb , mentioned another obvious improvement. To stick together 3 requests for updates in 1N request.
Those. not 3, but one long look
update table set var1 =?, var2 =? where _id =?

Source: https://habr.com/ru/post/190876/

All Articles