Looking through the address book on my Android phone today, I realized how inconvenient it is to browse a contact list that is sorted simply by the order of characters in Unicode.
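Here is the list I see (the last three names are stored in Cyrillic, so they end up after all the Latin ones):
- John Smith
- Marcus Wolf
- Semen Slepakov
- William Shakespeare
- Zorro
- Александр Пушкин (Alexander Pushkin)
- Иван Барков (Ivan Barkov)
- Яков Перельман (Jacob Perelman)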
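For reference, this ordering is what a plain string comparison produces. Below is a minimal sketch, assuming the last three contacts are stored in Cyrillic; the class and method names are made up for illustration. Java's natural `String` order compares UTF-16 code units, so every Latin name sorts before every Cyrillic one.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointSortDemo {
    // Sorts with String's natural order, i.e. by UTF-16 code unit values.
    // Latin letters (U+0041..) come before Cyrillic letters (U+0410..).
    static List<String> sortByCodePoint(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(null); // null comparator = natural ordering
        return sorted;
    }

    public static void main(String[] args) {
        List<String> contacts = List.of(
            "Яков Перельман", "Zorro", "Иван Барков", "John Smith",
            "Александр Пушкин", "William Shakespeare",
            "Semen Slepakov", "Marcus Wolf");
        for (String name : sortByCodePoint(contacts)) {
            System.out.println(name);
        }
    }
}
```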
This is the list I would like to see when the phone locale is Russian:
- Alexander Pushkin (А)
- William Shakespeare (В)
- John Smith (Д)
- Zorro (З)
- Ivan Barkov (И)
- Marcus Wolf (М)
- Semen Slepakov (С)
- Jacob Perelman (Я)
And this is the list I would like to see when the locale is English:
- Alexander Pushkin (A)
- Ivan Barkov (sound [i:] *)
- John Smith (J)
- Marcus Wolf (M)
- Semen Slepakov (S)
- William Shakespeare (W)
- Jacob Perelman (YA **)
- Zorro (Z)
The idea is clear, yes? Sorting follows phonetic rules, and next to each entry I have written the letter or sound that determines its position in the list.
*) It's not entirely clear where to put Ivan in the English sorting. Should the sound [i:] sort as the English letter E or, after all, as I?
**) Same story with Jacob. Which transliteration should be used: YA or JA?
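The "letter per entry" idea can be sketched as a sort by an explicit key. This is only an illustration under assumptions: the keys below are copied by hand from the Russian-locale list, the class and method names are hypothetical, and how to compute such keys automatically is exactly the open question. The standard `java.text.Collator` is used to compare the keys in a locale-aware way.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class PhoneticKeySortDemo {
    // Orders names by a separately supplied sort key, compared with
    // the collation rules of the given locale.
    static List<String> sortByKey(Map<String, String> keyByName, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> names = new ArrayList<>(keyByName.keySet());
        names.sort((a, b) -> collator.compare(keyByName.get(a), keyByName.get(b)));
        return names;
    }

    public static void main(String[] args) {
        // Hand-assigned keys (an assumption for this sketch).
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("John Smith", "Д");
        keys.put("Marcus Wolf", "М");
        keys.put("Semen Slepakov", "С");
        keys.put("William Shakespeare", "В");
        keys.put("Zorro", "З");
        keys.put("Alexander Pushkin", "А");
        keys.put("Ivan Barkov", "И");
        keys.put("Jacob Perelman", "Я");

        for (String name : sortByKey(keys, Locale.forLanguageTag("ru"))) {
            System.out.println(name + " (" + keys.get(name) + ")");
        }
    }
}
```

Keeping the key separate from the displayed name means the list can show the name exactly as stored while still being ordered by pronunciation.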
Something like this phonetic matching already seems to exist. For example, in Facebook's friend search you can start typing in either Russian or English, and it finds everything.
I suspect the simplest implementation would rely on transliteration rules. That is, all entries are converted to Cyrillic or Latin, depending on the locale, and then sorted. But there are problems like the ones mentioned above: very often a letter-by-letter (or even sound-based) transliteration does not reflect how the word is actually pronounced. An approach based on something like the International Phonetic Alphabet (en.wikipedia.org/wiki/International_Phonetic_Alphabet) seems promising, but it is not certain that the IPA defines any ordering of its symbols.
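The transliterate-then-sort approach for an English locale might look roughly like the sketch below. The transliteration table is a simplified, hand-written assumption and not any official standard, and the class and method names are hypothetical. Note that it picks YA for Я, which is one of the arbitrary choices discussed above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TranslitSortDemo {
    // Hypothetical, simplified Cyrillic-to-Latin table ("letter:replacement").
    private static final Map<Character, String> TABLE = new HashMap<>();
    static {
        String[] pairs = {
            "а:a", "б:b", "в:v", "г:g", "д:d", "е:e", "ё:yo", "ж:zh", "з:z",
            "и:i", "й:y", "к:k", "л:l", "м:m", "н:n", "о:o", "п:p", "р:r",
            "с:s", "т:t", "у:u", "ф:f", "х:kh", "ц:ts", "ч:ch", "ш:sh",
            "щ:shch", "ъ:", "ы:y", "ь:", "э:e", "ю:yu", "я:ya"
        };
        for (String p : pairs) {
            TABLE.put(p.charAt(0), p.substring(2));
        }
    }

    // Converts Cyrillic letters to Latin; all other characters are kept as is.
    static String toLatin(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            String latin = TABLE.get(Character.toLowerCase(c));
            if (latin == null) {
                out.append(c);
            } else if (Character.isUpperCase(c) && !latin.isEmpty()) {
                out.append(Character.toUpperCase(latin.charAt(0)))
                   .append(latin.substring(1));
            } else {
                out.append(latin);
            }
        }
        return out.toString();
    }

    // Sorts the original names by their Latin transliteration, ignoring case.
    static List<String> sortByLatinKey(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparing(TranslitSortDemo::toLatin,
                                         String.CASE_INSENSITIVE_ORDER));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> contacts = List.of(
            "John Smith", "Marcus Wolf", "Semen Slepakov", "William Shakespeare",
            "Zorro", "Александр Пушкин", "Иван Барков", "Яков Перельман");
        for (String name : sortByLatinKey(contacts)) {
            System.out.println(name + " -> " + toLatin(name));
        }
    }
}
```

A table like this works letter by letter, so it has the weakness described above: it knows nothing about pronunciation, and the same trick in the opposite direction (Latin to Cyrillic for a Russian locale) would be harder, because English spelling is much less regular.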
So, now the questions:
1. Do such algorithms already exist as libraries? If so, please share links (for example, for Java).
2. Let's try to put together a list of the problems you would face when implementing this yourself and, preferably, solutions to them.