
Do-it-yourself T9

Hi, Habr!

I have been typing SMS messages and other text on a mobile phone for a long time, and I always wanted to implement the T9 algorithm myself.

The algorithm is clear enough, but I never got around to it.
Today I finally sat down and did it.

First, I need to state what I wanted to get.

Task: Build an analogue of a mobile phone keypad where entering digits yields a list of words.
Restriction: No strict restrictions, since this is just for fun, but I wanted a lookup to take around 1 second.
Feature: Support for multiple languages.
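
To check a target like "around 1 second", a lookup can be wrapped in a small timer. The sketch below is only an illustration: the helper name `timeCall` and the generated stand-in word list are assumptions, not part of the original code.

```php
<?php
// Hypothetical helper: run a callable and report how long it took.
function timeCall(callable $fn): array
{
    $start = hrtime(true);                      // monotonic clock, nanoseconds
    $result = $fn();
    $seconds = (hrtime(true) - $start) / 1e9;   // convert to seconds
    return [$result, $seconds];
}

// Stand-in workload (assumption): 26,000 generated strings instead of a real dictionary.
$words = [];
for ($i = 0; $i < 26000; $i++) {
    $words[] = 'word' . $i;
}

// Time a simple prefix scan over the whole list.
[$matches, $seconds] = timeCall(function () use ($words) {
    return array_filter($words, fn($w) => strpos($w, 'word1') === 0);
});

printf("%d matches in %.4f s\n", count($matches), $seconds);
```

The same wrapper could be put around a real dictionary lookup to see how far it is from the one-second mark.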

The word base is a file with 26,000 words.

Now the algorithm itself:

My algorithm is conventionally divided into 2 parts.
Part 1 - select all words that begin with one of the letters corresponding to the first digit.
For example, for the digit 3, all words starting with d, e or f are selected.

Part 2 - go through the remaining digits one by one and sift out the words that do not match.
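
The two parts can be sketched as a standalone function. This is a minimal illustration rather than the class from this post: the function name `t9Filter`, the `$keypad` parameter and the tiny word list are assumptions made for the example.

```php
<?php
// Hypothetical helper illustrating the two-part filter; not the author's class.
function t9Filter(array $words, array $keypad, string $digits): array
{
    if ($digits === '') {
        return [];
    }

    // Part 1: keep words whose first letter is under the first digit.
    $first = $keypad[$digits[0]] ?? '';
    $candidates = [];
    foreach ($words as $word) {
        $letter = mb_strtolower(mb_substr($word, 0, 1));
        if ($letter !== '' && mb_strpos($first, $letter) !== false) {
            $candidates[] = $word;
        }
    }

    // Part 2: for each following digit, drop words whose letter at that
    // position is not in the digit's group (or that are too short).
    $count = strlen($digits);
    for ($i = 1; $i < $count; $i++) {
        $group = $keypad[$digits[$i]] ?? '';
        $kept = [];
        foreach ($candidates as $word) {
            $letter = mb_strtolower(mb_substr($word, $i, 1));
            if ($letter !== '' && mb_strpos($group, $letter) !== false) {
                $kept[] = $word;
            }
        }
        $candidates = $kept;
    }

    return $candidates;
}

$keypad = [2 => 'abc', 3 => 'def', 4 => 'ghi', 5 => 'jkl',
           6 => 'mno', 7 => 'pqrs', 8 => 'tuv', 9 => 'wxyz'];
$words = ['dog', 'fog', 'cat', 'do', 'email'];

print_r(t9Filter($words, $keypad, '364'));  // keeps 'dog' and 'fog'
```

Note that words longer than the typed sequence are kept, so the result behaves like a prefix match on the digit sequence.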

Language support:
As you might guess, language support comes down to replacing the letters assigned to the digits.
The implementation is shown below.
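
As a small illustration of how little changes between languages, the sketch below converts a word to its digit sequence using whichever letter table is passed in. The function name `wordToDigits` is hypothetical, and the Russian table is an assumption (the layout commonly printed on Cyrillic keypads), not something taken from the original source.

```php
<?php
// Hypothetical helper: map each letter of a word to its keypad digit.
function wordToDigits(string $word, array $keypad): string
{
    // Invert the table once: letter => digit.
    $letterToDigit = [];
    foreach ($keypad as $digit => $letters) {
        foreach (mb_str_split($letters, 1, 'UTF-8') as $letter) {
            $letterToDigit[$letter] = (string) $digit;
        }
    }

    $digits = '';
    foreach (mb_str_split(mb_strtolower($word, 'UTF-8'), 1, 'UTF-8') as $letter) {
        if (!isset($letterToDigit[$letter])) {
            return '';  // the word contains a letter that is not on this keypad
        }
        $digits .= $letterToDigit[$letter];
    }
    return $digits;
}

$english = [2 => 'abc', 3 => 'def', 4 => 'ghi', 5 => 'jkl',
            6 => 'mno', 7 => 'pqrs', 8 => 'tuv', 9 => 'wxyz'];

// Assumed layout; the original post's Russian table did not survive.
$russian = [2 => 'абвг', 3 => 'дежз', 4 => 'ийкл', 5 => 'мноп',
            6 => 'рсту', 7 => 'фхцч', 8 => 'шщъы', 9 => 'ьэюя'];

echo wordToDigits('dog', $english), "\n";     // 364
echo wordToDigits('Привет', $russian), "\n";  // 564236
```

Only the table differs between the two calls; the conversion code is the same.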

The demo uses jQuery and emulates the phone buttons.

You can see here

Now the class itself, in PHP:
class T9_Exception extends Exception {
}

abstract class T9 {
    protected $wordlist = array();   // words loaded from the dictionary file
    protected $enum = array();       // digit => letters map, defined by subclasses
    protected $mb_support = true;    // use mb_* functions (needed for multibyte text)

    public function mb_support($status = true) {
        $this->mb_support = $status;
    }

    public function __construct($filename = null) {
        if ($filename)
            $this->load($filename);  // load the dictionary right away
    }

    public function load($filename) {
        if (file_exists($filename) && is_readable($filename)) {
            // One word per line; strip line endings and surrounding whitespace.
            $this->wordlist = array_map('trim', file($filename));
            return;
        }
        throw new T9_Exception("Can not read $filename", 100);
    }

    // Lowercased letter of $word at position $pos; '' if the word is too short.
    protected function charAt($word, $pos) {
        if ($this->mb_support)
            return mb_strtolower(mb_substr($word, $pos, 1));
        return strtolower((string) substr($word, $pos, 1));
    }

    // Does $letter belong to the letter group of $digit?
    protected function matches($digit, $letter) {
        if ($letter === '' || !isset($this->enum[$digit]))
            return false;
        if ($this->mb_support)
            return mb_strpos($this->enum[$digit], $letter) !== false;
        return strpos($this->enum[$digit], $letter) !== false;
    }

    /**
     * Fetch data from wordlist
     * @param string $input digits typed on the keypad, e.g. "364"
     * @return array words matching the digit sequence
     */
    public function fetch($input) {
        $input = (string) $input;
        $compare = array();
        if ($input === '')
            return $compare;

        // Part 1: select the words whose first letter is under the first digit.
        foreach ($this->wordlist as $word) {
            if ($this->matches($input[0], $this->charAt($word, 0)))
                $compare[] = $word;
        }

        // Part 2: go through the remaining digits and sift out the words
        // whose letter at that position is not under the digit.
        $length = strlen($input);
        for ($i = 1; $i < $length; $i++) {
            $newcompare = array();
            foreach ($compare as $word) {
                if ($this->matches($input[$i], $this->charAt($word, $i)))
                    $newcompare[] = $word;
            }
            $compare = $newcompare;
        }
        return $compare;
    }
}


class T9_English extends T9 {

    protected $enum = array(
        0 => '',
        1 => '',
        2 => 'abc',
        3 => 'def',
        4 => 'ghi',
        5 => 'jkl',
        6 => 'mno',
        7 => 'pqrs',
        8 => 'tuv',
        9 => 'wxyz',
    );
}

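class T9_Russian extends T9 {
    // The Cyrillic letter groups did not survive the output of this post;
    // see the linked source file for the actual values.
    protected $enum = array(
        0 => '',
        1 => '',
        2 => '',
        3 => '',
        4 => '',
        5 => '',
        6 => '',
        7 => '',
        8 => '',
        9 => '',
    );
}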


UPD: some bytes were lost in the output above, so the source is also posted at this link:
http://anton.in.ua/demo/t9/t9.txt

I did not check the Russian dictionary, but judging by the algorithm it should work the same way.

That's it: a small warm-up for the mind.

Thank you all for your attention.

Source: https://habr.com/ru/post/86877/

