
Counting words in texts

Hi, Habr! I recently learned that in works of Russian literature the letter “o” occurs more often than any other, and it instantly reminded me of a long-standing idea: to write a simple script that lists the most frequently used words in a given text.

Sometimes I need to read texts in English, but my vocabulary is not rich enough to understand everything on the fly, so I often have to break off and reach for a dictionary. Many words come up again and again, yet a word does not always stick in my head after I have looked up its translation once. This is where the wonder-top comes to the rescue. The idea is insanely simple: the input is the source text, and the output is a list of the N most used words, which we then feed to a translator to get a glossary for our text.

Since PHP is the only language that has left an imprint in my head, I decided to write the script in it; the only requirement for the script is that it produces the result.

    #!/usr/bin/php
    <?php
    // wtop: print the most frequently used words in a text file.

    if (!isset($argv[1])) {
        die('Usage: ./wtop filename [lines] [filter]' . PHP_EOL
            . 'Use -1 for lines to show all words' . PHP_EOL);
    }

    if (!file_exists($argv[1]) || !is_readable($argv[1])) {
        die('Data file not found or can not be read' . PHP_EOL);
    }

    // Optional filter file: words listed in it are excluded from the result.
    $filter = array();
    if (isset($argv[3])) {
        if (file_exists($argv[3]) && is_readable($argv[3])) {
            // Lowercase the filter words so they match the lowercased input words.
            $filter = str_word_count(strtolower(file_get_contents($argv[3])), 1);
        } else {
            die('Filter file not found or can not be read' . PHP_EOL);
        }
    }

    // Number of lines to print; -1 (the default) prints every word.
    $lines = isset($argv[2]) ? (int) $argv[2] : -1;

    $data = file_get_contents($argv[1]);
    $words = str_word_count($data, 1);

    // Count occurrences of each word, ignoring case.
    $result = array();
    foreach ($words as $word) {
        $word = strtolower($word);
        if (in_array($word, $filter)) {
            continue;
        }
        if (isset($result[$word])) {
            $result[$word] += 1;
        } else {
            $result[$word] = 1;
        }
    }

    // Sort by count, highest first; arsort() already takes the array by reference.
    arsort($result);

    foreach ($result as $word => $count) {
        if ($lines-- == 0) {
            break;
        }
        echo $count . ' ' . $word . PHP_EOL;
    }

Example run:
stream@sapphire:~/development$ cat text.txt
With PHP breaking new ground in the enterprise arena, the establishment of a rati-
fied certification was, some might say, inevitable. However, for me, it couldn't come
soon enough—and I was ecstatic when Zend launched their PHP 4 Certification.
With more than 1,500 certified engineers to date, there is no doubt that their en-
deavour has been a success.
Now, with the introduction of the long-awaited PHP 5 certification, Zend has once
again raised the bar for PHP developers everywhere. This examination is much
broader, and requires much more than just theoretical knowledge—in order to pass
the test, candidates need real-world knowledge in addition to a solid theoretical
background.
The effect of the PHP 5 certification, for me, is even more profound than that of
the original certification, and I believe that it will become the gold standard for those
looking to hire PHP-centric Web Developers. I think that it is apt to consider Zend's
work a job well done, and to applaud those who invest the time and effort needed to
become Zend Certified Engineers.
stream@sapphire:~/development$ cat filter.txt
a the
am are
i you we
stream@sapphire:~/development$ ./wtop text.txt 10 filter.txt
7 to
5 and
5 certification
5 php
4 for
4 is
4 that
4 of
4 zend
3 more
stream@sapphire:~/development$
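
The run above comes down to four steps: split the text into words, lowercase them, skip anything in the filter, then count and sort. Below is a minimal sketch of just that core, assuming PHP 7 or newer. The `topWords` function and its parameter names are hypothetical and are not part of the original script. One thing worth noting: `arsort()` already takes its argument by reference, so it is called as `arsort($counts)`; writing `&` at the call site is a fatal error since PHP 5.4.

```php
<?php
// Hypothetical helper (not the author's script): returns the $limit most
// frequent words of $text as [word => count], skipping $stopWords.
// A negative $limit returns every word.
function topWords(string $text, int $limit = -1, array $stopWords = []): array
{
    // Flip the stop-word list so each membership check is a key lookup
    $skip = array_flip(array_map('strtolower', $stopWords));

    $counts = [];
    foreach (str_word_count(strtolower($text), 1) as $word) {
        if (isset($skip[$word])) {
            continue;
        }
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }

    // Sort by count, descending, keeping the word => count association
    arsort($counts);

    return $limit < 0 ? $counts : array_slice($counts, 0, $limit, true);
}

foreach (topWords('a rose is a rose is a rose', 2, ['a']) as $word => $count) {
    echo $count . ' ' . $word . PHP_EOL;
}
```

Words with equal counts may come out in a different order depending on the PHP version, since sorting is only guaranteed to be stable from PHP 8.0 onward.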


I hope someone finds this useful. Good luck!

Update: rewrote a small piece of the code (thanks, DevMan) and added filter support.

Source: https://habr.com/ru/post/92770/

