Do-it-yourself T9

Hi, Habr!

After years of typing SMS messages and other text on a mobile phone, I had long wanted to implement the T9 algorithm myself.

The algorithm is clear enough, but I never got around to it.
Today I finally sat down and did it.

So, first I need to spell out what I wanted to get.

Task: build an analogue of a mobile phone keypad that turns the digits pressed into a list of matching words.
Restriction: none in particular, since this is just for fun, but I wanted a lookup to take around 1 second.
Feature: support for multiple languages.

The word base is a file of 26,000 words.

Now the algorithm itself:

My algorithm is divided, roughly, into 2 parts.
Part 1 selects all words that begin with one of the letters assigned to the first digit.
That is, for the digit 3, all words starting with d, e or f are selected.

Part 2 goes through the remaining digits one by one and sifts out the words that do not match.
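
The two parts can be sketched as a standalone function. This is only a minimal illustration, not the class shown later: the function name `t9_filter`, the `$keys` map and the sample words are assumptions made for the example, and it handles single-byte (ASCII) words only.

```php
<?php
// Hypothetical helper illustrating the two-part filter; ASCII words only.
function t9_filter(array $words, array $keys, string $digits): array
{
    if ($digits === '') {
        return [];
    }

    // Part 1: keep the words whose first letter sits on the first key pressed.
    $firstLetters = $keys[$digits[0]] ?? '';
    $candidates = [];
    foreach ($words as $word) {
        if ($word !== '' && strpos($firstLetters, $word[0]) !== false) {
            $candidates[] = $word;
        }
    }

    // Part 2: for each further digit, drop the words whose letter at that
    // position is missing or is not on the corresponding key.
    for ($i = 1, $n = strlen($digits); $i < $n; $i++) {
        $letters = $keys[$digits[$i]] ?? '';
        $next = [];
        foreach ($candidates as $word) {
            if (isset($word[$i]) && strpos($letters, $word[$i]) !== false) {
                $next[] = $word;
            }
        }
        $candidates = $next;
    }

    return $candidates;
}

// Sample data, made up for the example.
$keys = [2 => 'abc', 3 => 'def', 4 => 'ghi', 5 => 'jkl',
         6 => 'mno', 7 => 'pqrs', 8 => 'tuv', 9 => 'wxyz'];
$words = ['dog', 'fog', 'cat', 'do', 'ending', 'food'];

print_r(t9_filter($words, $keys, '364')); // keeps "dog" and "fog"
```

Note that words longer than the digit sequence survive as long as their prefix matches, so the result doubles as a completion list.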

Language support:
Language support comes down to swapping the letters assigned to each digit.
The implementation is shown in the class below.
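
To make the idea concrete, here is a small sketch in which the language is nothing more than a digit-to-letters map. The `$layouts` array, the helper names `letter_index` and `word_to_digits`, and the Russian layout itself are assumptions for this example (the layout is a commonly seen one, not taken from the original source). The file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative key maps; the Russian layout is an assumption for this example.
$layouts = [
    'en' => [2 => 'abc', 3 => 'def', 4 => 'ghi', 5 => 'jkl',
             6 => 'mno', 7 => 'pqrs', 8 => 'tuv', 9 => 'wxyz'],
    'ru' => [2 => 'абвг', 3 => 'дежз', 4 => 'ийкл', 5 => 'мноп',
             6 => 'рсту', 7 => 'фхцч', 8 => 'шщъы', 9 => 'ьэюя'],
];

// Build a letter => digit lookup from a digit => letters map.
function letter_index(array $keys): array
{
    $index = [];
    foreach ($keys as $digit => $letters) {
        // The "u" flag makes PCRE split on UTF-8 characters, not bytes.
        foreach (preg_split('//u', $letters, -1, PREG_SPLIT_NO_EMPTY) as $letter) {
            $index[$letter] = (string) $digit;
        }
    }
    return $index;
}

// Encode a lowercase word as the digits that would be pressed to type it.
function word_to_digits(string $word, array $index): string
{
    $digits = '';
    foreach (preg_split('//u', $word, -1, PREG_SPLIT_NO_EMPTY) as $letter) {
        $digits .= $index[$letter] ?? '?'; // '?' marks a letter with no key
    }
    return $digits;
}

echo word_to_digits('dog', letter_index($layouts['en'])), "\n"; // 364
echo word_to_digits('дом', letter_index($layouts['ru'])), "\n"; // 355
```

The same functions serve both languages; only the map changes.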

For the demonstration, jQuery and an emulated phone keypad are used.

You can see the demo here

Now the class itself, in PHP:

class T9_Exception extends Exception { }

abstract class T9
{
    protected $wordlist = array();  // words read from the dictionary file
    protected $enum = array();      // digit => letters map, set by the language subclasses
    protected $mb_support = true;   // whether to use the mb_* string functions

    public function mb_support($status = true)
    {
        $this->mb_support = $status;
    }

    public function __construct($filename = null)
    {
        if ($filename) $this->load($filename); // load the dictionary if a file is given
    }

    public function load($filename)
    {
        if (file_exists($filename) && is_readable($filename)) {
            // Note: file() keeps the trailing newline of every word.
            $this->wordlist = file($filename);
            //array_walk( $this->wordlist, create_function('&$w', '$w=trim($w);'));
            return;
        }
        throw new T9_Exception("Can not read $filename ", 100);
    }

    /**
     * Fetch data from wordlist
     * @param string $input
     * @return array
     */
    public function fetch($input)
    {
        $compare = array();
        $input = (string) $input;
        if ($input === '') return $compare; // nothing typed yet

        // Part 1: keep the words that start with a letter of the first digit.
        $first = $this->enum[$input[0]];
        $len = strlen($first);
        $total = count($this->wordlist);
        for ($i = 0; $i < $total; $i++) {
            for ($j = 0; $j < $len; $j++) {
                if ($this->mb_support == true) {
                    if (mb_strpos($this->wordlist[$i], $first[$j]) === 0)
                        $compare[] = $this->wordlist[$i];
                } else {
                    if (strpos($this->wordlist[$i], $first[$j]) === 0)
                        $compare[] = $this->wordlist[$i];
                }
            }
        }

        // Part 2: for each further digit, drop the words whose letter
        // at that position is not on the corresponding key.
        $inputLen = strlen($input);
        for ($i = 1; $i < $inputLen; $i++) {
            $letters = $this->enum[$input[$i]];
            $lettersLen = strlen($letters);
            $newcompare = array();
            for ($k = 0; $k < count($compare); $k++) {
                if (!isset($compare[$k][$i])) continue; // word is shorter than the input
                for ($j = 0; $j < $lettersLen; $j++) {
                    $letter = $letters[$j];
                    if ($this->mb_support == true) {
                        if (mb_strtolower($compare[$k][$i]) == $letter)
                            $newcompare[] = $compare[$k];
                    } else {
                        if ($compare[$k][$i] == $letter)
                            $newcompare[] = $compare[$k];
                    }
                }
            }
            $compare = $newcompare;
        }
        return $compare;
    }
}

class T9_English extends T9
{
    protected $enum = array(
        0 => '', 1 => '', 2 => 'abc', 3 => 'def', 4 => 'ghi',
        5 => 'jkl', 6 => 'mno', 7 => 'pqrs', 8 => 'tuv', 9 => 'wxyz',
    );
}

class T9_Russian extends T9
{
    // The Cyrillic letters were lost in this copy of the post (see the UPD below).
    protected $enum = array(
        0 => '', 1 => '', 2 => '', 3 => '', 4 => '',
        5 => '', 6 => '', 7 => '', 8 => '', 9 => '',
    );
}

UPD: some bytes were lost in the output, so the source is also posted at this link:
http://anton.in.ua/demo/t9/t9.txt

I did not test the Russian dictionary, but judging by the algorithm it should work the same way.
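
One thing worth checking before relying on that: if the dictionary and the key map are UTF-8, indexing a string with `$word[$i]` returns a single byte, not a whole Cyrillic letter, so position-based comparisons may need a character-level split. The sketch below shows the difference. The helper names `utf8_chars` and `char_on_key` and the sample letters are assumptions for the example; with a single-byte encoding such as Windows-1251 the issue does not arise.

```php
<?php
// Assumes this file and the word are UTF-8 encoded.
$word = 'дом';

// Byte view: each Cyrillic letter takes two bytes in UTF-8.
$byteLength = strlen($word);      // 6
$firstByte  = bin2hex($word[0]);  // "d0", only half of the letter "д"

// Character view: split on UTF-8 characters with PCRE's "u" flag.
function utf8_chars(string $s): array
{
    return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
}

$chars = utf8_chars($word);       // three elements: д, о, м

// Hypothetical helper: is the character at position $i one of $letters?
function char_on_key(string $word, int $i, string $letters): bool
{
    $chars = utf8_chars($word);
    return isset($chars[$i]) && in_array($chars[$i], utf8_chars($letters), true);
}

var_dump($byteLength, $firstByte, count($chars));
var_dump(char_on_key($word, 0, 'дежз')); // true: "д" is among the letters
```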

So that was a small warm-up for the mind.

Thank you all for your attention.

Source: https://habr.com/ru/post/86877/
