📜 ⬆️ ⬇️

Convert string to number

He helped one of his friends one of these days to understand programming. Along the way, they wrote a training program that can convert a string (string) into a number (int). And somehow by itself I wanted to compare the speed of the work of my own nettle, with the speed of the work of standard tools (Convert.ToInt32 and Int32.Parse). The result of this comparison, at first glance, turned out to be somewhat unusual.

I think that any of you will be able to solve the problem without any special problems, therefore, I don’t see how and what makes sense to work.
class MyConvert { private static int CharToInt(char c) { return c - '0'; } public static int ToInt(string s) { if (s == null) throw new ArgumentException(); if (s.Length == 0) throw new ArgumentException(); bool isNegative = false; int start = 0; switch (s[0]) { case '-': if (s.Length == 1) throw new ArgumentException(); start = 1; isNegative = true; break; case '+': if (s.Length == 1) throw new ArgumentException(); start = 1; break; } int result = 0; for (int i = start; i < s.Length; i++) { if (c < '0' || c > '9') throw new ArgumentException(); result = checked(result * 10 + CharToInt(s[i])); } return (isNegative) ? -result : result; } } 


We will compare the speed of the following functions:
  static void MyConvertTest(int numbersCount) { for (int i = - numbersCount; i < numbersCount; i++) { if (i != MyConvert.ToInt(i.ToString())) { throw new ArgumentException(); } } } static void ConvertTest(int numbersCount) { for (int i = - numbersCount; i < numbersCount; i++) { if (i != Int32.Parse(i.ToString())) { throw new ArgumentException(); } } } 


On my machine (Windows 7, .NET 4.0, intel i5) with numbersCount = 16,000,000, the following results are on average:
ConvertTest: 08.5859994 seconds
MyConvertTest: 07.0505985 seconds
')
If instead of Int32.Parse to use Convert.ToInt32, then the result will not change. But this is understandable, given that the function Convert.ToInt32 itself calls Int32.Parse. Thus, we find that the speed of the own bicycle is faster than the speed of the standard function by ~ 18%.

If you look at the documentation , it becomes clear that the function Int32.Parse is quite complicated. In particular, it is able to convert the string representation of a number according to regional formats. Although in my practice I did not have to use this functionality.

Let's try to make our creation a little more faster. To do this, in the ToInt function, change the cycle
  for (int i = start; i < s.Length; i++) 

on
  int length = s.Length; for (int i = start; i < length; i++) 


In this case, we obtain:
MyConvertTest: 06.2629928 seconds
That is, now the speed of our function is faster than the standard by ~ 27%. This is quite unexpected. I thought that the compiler (or the CLR) will be able to understand that since we do not change the variable s inside the loop, the value of s.Length will be received only once.

Now let's try instead of calling the function CharToInt, embed its body into the function ToInt. In this case
MyConvertTest: 05.5496214 seconds
Thus, the speed of work relative to the standard function increased by ~ 35%. This, in turn, is quite unexpected, since the compiler (or CLR) in some cases can do this on its own.

Already almost all tried. It remains only to try to abandon the for loop. This can be done, for example, as follows:
  unsafe public static int ToInt(string s) { if (s == null) { throw new ArgumentException(); } int result = 0; bool isNegative = false; fixed(char* p = s) { char* chPtr = p; char ch = *(chPtr++); switch (ch) { case '-': isNegative = true; ch = *(chPtr++); break; case '+': ch = *(chPtr++); break; } do { if (ch < '0' || ch > '9') { throw new ArgumentException(); } result = result * 10 + (ch - '0'); ch = *(chPtr++); }while (ch != '\0'); } return (isNegative) ? -result : result; } 

Result:
MyConvertTest: 05.2410683 seconds
This is ~ 39% faster than the standard function (and only 3% faster than the option with for).

findings


Of course, attempts to increase the speed of the line-to-number function are of no particular value. I can not imagine a single situation in which such a performance increase would have at least some significant impact. Nevertheless, some quite obvious conclusions can be made:


Quite often in various literature there are quite fair assertions that all low-level optimization should be placed on the shoulders of the compiler and the runtime. They are supposedly quite intellectual and know better how they will be better. However, as such tests show, optimization of cycles and deployment of functions can result in a 20% increase in speed (if we compare the 1st option).

Of course, you need to engage in optimization wisely. And far from always in real projects, the complication of project maintenance can be justified by speeding up work for a few seconds.

UPD fixed a problem with a plus, thanks.
UPD 2 it4_kp correctly noted that in all the algorithms proposed by me does not work
 MyConvert.ToInt( int.MinValue.ToString() ) 

Indeed, int.MinValue is modulo one more than int.MaxValue. And since in the intermediate calculation the positive range from int is used as a buffer, the module from int.MinValue is not included in it.
One possible solution: use negative range as an intermediate buffer.
  unsafe public static int ToInt(string s) { .... result = result * 10 - (ch - '0'); ..... return (isNegative) ? result : checked(-result); } 


Already two errors in the seemingly primitive code. An additional reason to think before you abandon the standard libraries.

Source: https://habr.com/ru/post/150339/


All Articles