This series of articles, the first of which you are reading, grew out of the large body of analytical and practical material accumulated while working on the MSLibrary for iOS library. MSLibrary includes many classes, and even more functions and macros, designed to simplify developers' routine work and to significantly reduce development time and code size. But everything in its time: we will talk about the library itself a little later.
So: capturing and verifying phone numbers using regular expressions. What is there to talk about, it would seem? Those who know how write their own, and those who do not copy one of the many ready-made solutions scattered across the World Wide Web. The only question is what exactly gets written or copied, and how well that code matches the task at hand and the current international, industry and corporate standards. Any solution, even the simplest, is good only if the developer fully understands how it works and is absolutely sure of it.
In fact, the title itself already splits the topic in two. Capture is the ability to recognize that a given sequence of digits and symbols may be a telephone number. Verification is a check of whether that sequence meets certain predefined conditions. That these are different tasks is easy to see from the following example:
+ 1 (408) - 996 - 10 - 10 = 1234
+14089961010;ext=1234
In the first case someone wrote the number down the way he was used to; in the second, the number is written in accordance with the international standard RFC 3966. The problem is that if we try to use either of these entries to dial a phone number, for example in an iOS application, nothing good will come of it. In the first case the system will not understand anything at all, and in the second, instead of the extension "1234" it will dial entirely different digits (a very convincing experiment that you can try yourself; the code is given below).
A simple snippet for placing a phone call from an iOS application:
NSString *telString = @"tel:+14089961010;ext=1234";
NSURL *telURL = [NSURL URLWithString:telString];
// openURL: is deprecated since iOS 10; openURL:options:completionHandler: is its replacement
[[UIApplication sharedApplication] openURL:telURL];
In essence, both tasks (capture and verification) are solved by the same methods; the only difference is in the regular expressions applied.
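To make the "same method, different expression" point concrete, here is a minimal sketch in plain C with POSIX <regex.h> (plain C also compiles inside an Objective-C source file). The helper name and both patterns are illustrative assumptions, not taken from the article or from MSLibrary: the capture pattern is loose and unanchored, the verification pattern is strict and anchored.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: searches `text` for the first match of the POSIX
 * extended regular expression `pattern`. On success stores the match
 * boundaries in *start and *end (end is exclusive) and returns true. */
static bool find_first_match(const char *pattern, const char *text,
                             size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return false;                       /* pattern failed to compile */

    bool found = (regexec(&re, text, 1, &m, 0) == 0);
    regfree(&re);

    if (found) {
        *start = (size_t)m.rm_so;
        *end = (size_t)m.rm_eo;
    }
    return found;
}

/* Illustrative patterns only (assumptions, not the article's expressions). */
/* Capture: anything in free text that looks like it could be a number.    */
static const char CAPTURE_PATTERN[] = "\\+[0-9][0-9 ().-]*[0-9]";
/* Verification: the whole string must be "+digits" plus optional ;ext=.   */
static const char VERIFY_PATTERN[] = "^\\+[0-9]+(;ext=[0-9]+)?$";
```

Called with CAPTURE_PATTERN, the helper locates a candidate inside a sentence such as "Call +1 (408) 996-1010 today"; called with VERIFY_PATTERN, the same helper accepts "+14089961010;ext=1234" and rejects the candidate just captured.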
The first step is to look at what RFC 3966, which governs this matter, has to say. In a highly simplified form, it says the following:
Simplified structure of telephone-uri in accordance with RFC 3966
telephone-uri = "tel:" telephone-subscriber
telephone-subscriber = global-number
global-number = global-number-digits *par
par = extension | isdn-subaddress | parameter
isdn-subaddress = ";isub=" 1*uric
extension = ";ext=" 1*phonedigit
global-number-digits = "+" *phonedigit DIGIT *phonedigit
parameter = ";" pname ["=" pvalue]
pname = 1*(alphanum | "-")
pvalue = 1*paramchar
paramchar = param-unreserved | unreserved | pct-encoded
unreserved = alphanum | mark
mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
param-unreserved = "[" | "]" | "/" | ":" | "&" | "+" | "$"
phonedigit = DIGIT | [visual-separator]
visual-separator = "-" | "." | "(" | ")"
alphanum = ALPHA | DIGIT
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","
uric = reserved | unreserved
where ALPHA and DIGIT, as follows from another document, RFC 2396, are:
DIGIT = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
ALPHA = lowalpha | upalpha
lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
upalpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
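The two rules that matter most for dialing, global-number-digits and extension, translate fairly directly into a regular expression. The sketch below is plain C with POSIX <regex.h> (plain C also compiles inside an Objective-C source file). The function name is hypothetical, and phonedigit is read as exactly one digit or one visual separator, which is an assumption about the simplified grammar above.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `s` has the form
 *   "+" *phonedigit DIGIT *phonedigit [";ext=" 1*phonedigit]
 * where phonedigit is a digit or one of the separators - . ( )      */
static bool matches_global_number(const char *s)
{
    /* [0-9().-] is one phonedigit; the '-' is last so it stays literal. */
    static const char pattern[] =
        "^\\+[0-9().-]*[0-9][0-9().-]*(;ext=[0-9().-]+)?$";
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                       /* pattern failed to compile */

    bool ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

Under these assumptions "+14089961010;ext=1234" and "+1(408)996-1010" are accepted, while a string without the leading "+", with spaces, or with an empty extension is rejected.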
From RFC 3966 it follows that telephone-uri can be represented schematically as follows:
telephone-uri = global-number-digits [extension] [isdn-subaddress] [parameter]
Fig. 1
where:
extension - extension phone number
isdn-subaddress - ISDN subaddress
parameter - some other optional parameters
Since in real life, when working with mobile applications (iOS applications in particular) and with most websites, only the extension number is actually encountered, the telephone-uri scheme reduces to the following:
telephone-uri = global-number-digits [extension]
Fig. 2
or, substituting the corresponding values for global-number-digits and extension:
telephone-uri = "+" *phonedigit DIGIT *phonedigit [";ext=" 1*phonedigit]
Fig. 3
where phonedigit consists of the digits "[0-9]" and a limited set of visual separators:
"-" | "." | "(" | ")"
At this point, it would seem, we could move on to the regular expressions that implement this scheme. But... Many authors rightly point out that not every set of digits is a telephone number: there is a generally accepted international practice that defines the structure of telephone numbers. There is indeed, and we will now look at it briefly.
International structure of telephone numbers
In accordance with existing practice, the following groups can be distinguished, which together make up a telephone number:
global-number-digits
"+" - a sign indicating that the international country code follows
country_code - a one- to three-digit country code, for example: 7, 44 or 374
area_code - a three-digit area code, for example: 800
exchange - a three-digit exchange number, for example: 555
subscriber_number - a four-digit subscriber number, for example: 1234
extension
In this case, the structure of the phone number looks like this:
telephoneNumber = "+" country_code [visual-separator] area_code [visual-separator] exchange [visual-separator] subscriber_number [visual-separator] [";ext=" extension]
Fig. 4
Thus, if we implement a regular expression that takes into account both RFC 3966 and the international structure of telephone numbers, in other words one that corresponds to the schemes in Fig. 3 and Fig. 4, the following telephone number entries are perfectly valid:
+14089961010;ext=1234
+1(408)996-1010;ext=1234
+1-408-996-10-10;ext=1234
+1.408.996.1010;ext=1234
while the following are valid only under RFC 3966, since their visual separators fall outside the structure defined by the international practice for telephone numbers:
+1(4089)96-1010;ext=1234
+1-408996-10-10;ext=1234
+1.408996.1010;ext=1234
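A literal reading of the scheme in Fig. 4 can also be sketched as a POSIX extended regular expression. The code is plain C (which also compiles inside an Objective-C source file). The function name is hypothetical, and the group lengths (1 to 3, 3, 3 and 4 digits), the single optional separator between groups and the digits-only extension are assumptions taken straight from the scheme rather than a tested production pattern.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if `s` follows the Fig. 4 structure
 *   "+" country_code [sep] area_code [sep] exchange [sep]
 *       subscriber_number [sep] [";ext=" extension]
 * with sep being at most one of - . ( )                              */
static bool matches_fig4_structure(const char *s)
{
    static const char pattern[] =
        "^\\+[0-9]{1,3}[().-]?"     /* country code, 1 to 3 digits    */
        "[0-9]{3}[().-]?"           /* area code, 3 digits            */
        "[0-9]{3}[().-]?"           /* exchange, 3 digits             */
        "[0-9]{4}[().-]?"           /* subscriber number, 4 digits    */
        "(;ext=[0-9]+)?$";          /* optional extension             */
    regex_t re;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;               /* pattern failed to compile */

    bool ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

Read this literally, every separator is optional, so an entry that merely omits one (such as +1.408996.1010) still passes, while a separator inside a group (the 10-10 spelling, or (4089)) is rejected. Reproducing the two lists above exactly would need a differently tuned pattern.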
In real life, a user may enter a phone number either following the pattern suggested by the application or website, or however he sees fit (that is, any way he likes). The application, in turn, will process the resulting string according to the standards built into it. And these are not always the same thing.
Corporate web standards for dialing telephone numbers
Why web? Because in iOS, in practically all other systems and, of course, on websites, the easiest way to place a phone call is to use the well-known HTML scheme:
<a href="tel:1-408-996-1010">1-408-996-1010</a>
Fig. 5
A sample of Objective-C code implementing this scheme was already given at the beginning of the article, but for the sake of a cleaner presentation it is worth repeating:
NSString *telString = @"tel:+14089961010";
NSURL *telURL = [NSURL URLWithString:telString];
[[UIApplication sharedApplication] openURL:telURL];
Fig. 6
Let us turn to corporate standards.
What do the Google experts say about this?
Always supply the phone number using the international dialing format: the plus sign (+), country code, area code, and number. While not absolutely necessary, it's a good idea to separate each segment of the number with a hyphen (-) for easier reading and better auto-detection.
That is, the "+" sign must precede the country code, and it is "a good idea" to separate the segments (groups) of the telephone number with a visual separator, the hyphen "-". In general this agrees fully with RFC 3966 and the international structure of telephone numbers, in other words with the schemes in Fig. 3 and Fig. 4. There is, however, one significant BUT: the visual separators are limited to a single character, the hyphen "-". This does not, of course, mean that a browser will mishandle parentheses or dots used as visual separators, but that needs careful checking; the Google document vouches only for the hyphen. In addition, this section of the guide says nothing about the format of the extension. Since the article is mainly about iOS applications, we will not delve into the specifics of Google's software; you are welcome to experiment and share the results.
Apple is even more laconic. In the Phone Links section, all we find about the permissible format of a phone number is this phrase:
For more information about the tel URL scheme, see RFC 2806 and RFC 2396.
Formally, everything is correct: why repeat what is already in the international standards? But the fact is that Apple's corporate standards, like Google's, correspond only to parts of these international standards, which, by the way, are often advisory in nature.
What can and cannot be used in a phone number in an iOS application
Experiments have shown that the system responds properly to all four visual separators defined in RFC 3966, namely "-", ".", "(" and ")".
Moreover, the separators can appear in arbitrary positions and need not even be paired. For example, the code shown in Fig. 6 handles all of the following phone number entries equally well:
+14089961010
+1(408)996-1010
+1-408-996-10-10
+1.408.996.1010
+1(4089)96-1010
+1-408996-10-10
+1.408996.1010
+1-4(0899(6-10-10
+1.40)8996.10(10
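One way to see why all of these entries are equivalent is to strip the four visual separators and compare what is left. The sketch below is plain C (which also compiles inside an Objective-C source file). The helper name is hypothetical, and it only illustrates the equivalence; it makes no claim about how iOS itself processes the string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `src` into `dst` without the four RFC 3966
 * visual separators - . ( ). Returns the number of characters written
 * (not counting the terminating NUL); the output is truncated if
 * `dst_size` is too small. */
static size_t strip_visual_separators(const char *src, char *dst,
                                      size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0)
        return 0;

    for (; *src != '\0' && n + 1 < dst_size; ++src) {
        if (strchr("-.()", *src) == NULL)   /* keep all but - . ( ) */
            dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

Every entry in the list above, including the ones with unpaired parentheses, reduces to the same string, +14089961010.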
The situation with dialing an extension is different. As mentioned at the beginning of the article, the system does not respond correctly to the extension prefix ";ext=" proposed in RFC 3966. Two characters are native to the system: ";" and ",".
In the first case, when the system encounters the separator ";" in a telephone number, dialing stops and the extension digits are shown on the screen. Tapping them resumes dialing.
In the second case, when the system encounters the separator "," in a phone number, dialing stops at that point and resumes automatically after a short pause of approximately 2 seconds. The "," separator may be repeated, for example ",,,,", in which case the pause grows in proportion to the number of characters.
In the case of the ";ext=" separator defined by RFC 3966, the following happens: the system treats the ";" as the stop before dialing the extension, and interprets the characters "ext" as digits with which the extension begins.
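Given this behavior, one possible workaround is to rewrite the RFC 3966 prefix into one of the two native separators before building the tel: URL. The sketch below is plain C (which also compiles inside an Objective-C source file). The helper name and its error convention are assumptions made for illustration, not part of MSLibrary or of any Apple API.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrites "<number>;ext=<digits>" as
 * "<number><sep><digits>", where `sep` is ';' (stop and wait for a tap)
 * or ',' (pause, then continue). A string without ";ext=" is copied
 * unchanged. Returns 0 on success, -1 if `dst` is too small. */
static int ext_to_native_separator(const char *src, char sep,
                                   char *dst, size_t dst_size)
{
    static const char marker[] = ";ext=";
    const char *ext = strstr(src, marker);
    int written;

    if (ext == NULL) {
        written = snprintf(dst, dst_size, "%s", src);
    } else {
        /* number part, then the separator, then what follows ";ext=" */
        written = snprintf(dst, dst_size, "%.*s%c%s",
                           (int)(ext - src), src, sep,
                           ext + (sizeof marker - 1));
    }
    return (written >= 0 && (size_t)written < dst_size) ? 0 : -1;
}
```

With ',' as the separator, "+14089961010;ext=1234" becomes "+14089961010,1234"; with ';' it becomes "+14089961010;1234".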
So, we have covered the theoretical and practical background and can move on to the second part of the article, that is, to the regular expressions themselves for capturing and verifying telephone numbers.
We hope that the material was useful to you.
The MSLibrary for iOS team
Other articles:
Capturing and verifying phone numbers using regular expressions, for iOS and not only ... Part 2
Implementing multiple selection of conditions using bitmasks, for iOS and not only ...
SIMPLE: remove unnecessary characters from the string, for iOS and not only ...
Creating and compiling cross-platform (universal) libraries in Xcode