
8 useful regexps with a visual breakdown

Much has been written about the power and flexibility of regular expressions, and they have long been the standard tool for all kinds of text operations. Perhaps their most common use is validating input data: there is practically no alternative to them, other than cumbersome loop-based parsing with a pile of non-obvious checks. Let's start with the simplest:

1. Slug (the human-readable part of a URL)


Essentially, a word with hyphens.

Pattern: /^[a-z0-9-]+$/
[Image: short_url]
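
A minimal sketch of how this pattern might be applied, written in Python. The helper name `is_slug` and the sample strings are assumptions made for illustration and do not come from the article.

```python
import re

# Same character class as the slug pattern: lowercase letters, digits, hyphens.
SLUG_RE = re.compile(r"[a-z0-9-]+")

def is_slug(text: str) -> bool:
    """Hypothetical helper: True if the whole string is a valid slug."""
    # fullmatch anchors at both ends, so ^ and $ are not needed here.
    return SLUG_RE.fullmatch(text) is not None

print(is_slug("8-useful-regexps"))  # True
print(is_slug("8 useful regexps"))  # False: spaces are not allowed
```

`fullmatch` is used instead of `^...$` because in Python `$` also matches just before a trailing newline.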


2. UserName


Letters, numbers, hyphens and underscores, from 3 to 16 characters.

Pattern: /^[a-z0-9_-]{3,16}$/
[Image: username]
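
To illustrate the length bounds, here is a small Python sketch. The name `is_username` and the test values are hypothetical.

```python
import re

# Lowercase letters, digits, underscores and hyphens, 3 to 16 characters.
USERNAME_RE = re.compile(r"[a-z0-9_-]{3,16}")

def is_username(text: str) -> bool:
    """Hypothetical helper: True if the whole string fits the username rule."""
    return USERNAME_RE.fullmatch(text) is not None

print(is_username("john_doe-42"))  # True
print(is_username("jd"))           # False: shorter than 3 characters
print(is_username("a" * 17))       # False: longer than 16 characters
```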

3. Password


Same as the username, only 6 to 18 characters long.

Pattern: /^[a-z0-9_-]{6,18}$/
[Image: password]
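My note: it can be written more briefly as /^[\w_]{6,18}$/, and similarly for the username. Keep in mind that \w already includes the underscore and also matches uppercase letters, while the hyphen is no longer allowed.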
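
The explicit character class and the shorter `\w` variant do not accept exactly the same strings. The Python sketch below compares them; the function name `compare` and the sample passwords are assumptions, and `re.ASCII` is added so that `\w` means `[a-zA-Z0-9_]` rather than any Unicode word character.

```python
import re

EXPLICIT_RE = re.compile(r"[a-z0-9_-]{6,18}")
SHORT_RE = re.compile(r"[\w_]{6,18}", re.ASCII)

def compare(candidate: str) -> tuple[bool, bool]:
    """Hypothetical helper: (explicit class accepts, \\w variant accepts)."""
    return (EXPLICIT_RE.fullmatch(candidate) is not None,
            SHORT_RE.fullmatch(candidate) is not None)

print(compare("secret_42"))  # (True, True)
print(compare("Secret_42"))  # (False, True): \w also matches uppercase
print(compare("secret-42"))  # (True, False): \w does not match the hyphen
```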

4. Hex color


An optional # symbol, then a word made of letters from a to f and digits, 3 or 6 characters long.

Pattern: /^#?([a-f0-9]{6}|[a-f0-9]{3})$/
[Image: hex]

5. XML tag


After the opening bracket < there must be a word of letters, the element name; then there may be attributes, which are any characters except the closing bracket >. Then comes either a closing bracket, any text (the content) and a closing tag, i.e. </name>, or at least one whitespace character, a slash and a closing bracket (a self-closing tag).

Pattern: /^<([a-z]+)([^>]+)*(?:>(.*)<\/\1>|\s+\/>)$/
[Image: xml_tag]
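
The backreference `\1` is what forces the closing tag to repeat the element name. A minimal Python sketch, assuming the pattern reads as described in the text (attributes are any characters except `>`); the helper name `parse_tag` and the sample tags are hypothetical.

```python
import re

# Group 1: element name, group 2: attributes, group 3: content.
# \1 requires the closing tag to carry the same name as the opening one.
TAG_RE = re.compile(r"<([a-z]+)([^>]+)*(?:>(.*)<\/\1>|\s+\/>)")

def parse_tag(text: str):
    """Hypothetical helper: return (name, attributes, content) or None."""
    match = TAG_RE.fullmatch(text)
    return match.groups() if match else None

print(parse_tag('<a href="x">link</a>'))  # ('a', ' href="x"', 'link')
print(parse_tag("<a>text</b>"))           # None: closing name differs
print(parse_tag("<br />")[0])             # br
```

This checks a single tag only; it is not a substitute for a real XML parser on nested or multi-line markup.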

6. Email


General form: login@subdomain.domain. The login, like the subdomain, is a word made of letters, digits, underscores, hyphens and dots. The domain (meaning the top level) is 2 to 6 letters and dots.

Pattern: /^([a-z0-9_\.-]+)@([a-z0-9_\.-]+)\.([a-z\.]{2,6})$/
[Image: email]
My note: it can be shorter: /^([\w\.-]+)@([\w\.-]+)\.([a-z]{2,6}\.?)$/. It is also slightly more correct: a dot in the top-level domain can occur only once and only at the end.
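
A sketch of how the three capture groups of the email pattern (login, subdomain, top-level domain) might be read out in Python. The pattern is written out in full in the code; the helper name `split_email` and the sample addresses are assumptions.

```python
import re

# Group 1: login, group 2: subdomain part, group 3: top-level domain.
EMAIL_RE = re.compile(r"([a-z0-9_\.-]+)@([a-z0-9_\.-]+)\.([a-z\.]{2,6})")

def split_email(address: str):
    """Hypothetical helper: return (login, subdomain, domain) or None."""
    match = EMAIL_RE.fullmatch(address)
    return match.groups() if match else None

print(split_email("john.doe@mail.example.com"))  # ('john.doe', 'mail.example', 'com')
print(split_email("john.doe.example.com"))       # None: no @ sign
```

A pattern like this is a plausibility check only; it neither covers every valid address nor proves that the mailbox exists.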

7. URL


First, an optional protocol (http:// or https://), then a sequence of letters, digits, hyphens, underscores and dots (the domains below the top level), then the top-level domain (2 to 6 letters and dots) and, finally, the file structure: a set of words made of letters, digits, hyphens, underscores and dots, separated by slashes. The whole thing may end with one more slash.

Pattern: /^(https?:\/\/)?([\da-z_\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
[Image: url]
My note: it's better this way: /^(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?$/
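
A minimal Python sketch using the variant from the note above, reading out the protocol, host and top-level domain. The helper name `split_url` and the sample inputs are hypothetical; `re.ASCII` is an added assumption that restricts `\w` to `[a-zA-Z0-9_]`.

```python
import re

# Group 1: optional protocol, group 2: host, group 3: top-level domain.
URL_RE = re.compile(
    r"(https?:\/\/)?([\w\.]+)\.([a-z]{2,6}\.?)(\/[\w\.]*)*\/?",
    re.ASCII,
)

def split_url(url: str):
    """Hypothetical helper: return (protocol, host, tld) or None."""
    match = URL_RE.fullmatch(url)
    return match.group(1, 2, 3) if match else None

print(split_url("https://habr.com/ru/post/66931/"))  # ('https://', 'habr', 'com')
print(split_url("example.com"))                      # (None, 'example', 'com')
print(split_url("ftp://example.com"))                # None: unsupported protocol
print(split_url("my-site.com"))                      # None: \w excludes the hyphen
```

The last case shows a trade-off of this variant: hosts containing hyphens are rejected, so the hyphen would have to be added to the character class if such hosts must pass.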

8. IP address


Four groups of digits (1 to 3 digits each) separated by dots. In a three-digit group the first digit is 0, 1 or 2: after 0 or 1 the remaining two digits can each be from 0 to 9; after 2 the second digit is from 0 to 5, and then the third is from 0 to 9 if the second is from 0 to 4, or from 0 to 5 if the second is 5. One- and two-digit groups can consist of any digits from 0 to 9.

Pattern: /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
[Image: ip]
My note: in my opinion, this is more correct: /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/.
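
Because the same group alternative appears four times, the pattern can be assembled from one building block. A Python sketch of the `\d` variant; the names `OCTET` and `is_ipv4` are assumptions, and `re.ASCII` is added so that `\d` matches only 0 to 9.

```python
import re

# One group of the address: 250-255, 200-249, or any 1 to 3 digit
# number up to 199 (leading zeros are accepted by this pattern).
OCTET = r"(?:25[0-5]|2[0-4]\d|[01]?\d\d?)"

# Three groups followed by a dot, then a final group without one.
IPV4_RE = re.compile(rf"(?:{OCTET}\.){{3}}{OCTET}", re.ASCII)

def is_ipv4(text: str) -> bool:
    """Hypothetical helper: True if the whole string is a dotted IPv4 address."""
    return IPV4_RE.fullmatch(text) is not None

print(is_ipv4("192.168.0.1"))   # True
print(is_ipv4("256.1.1.1"))     # False: 256 is out of range
print(is_ipv4("1.2.3"))         # False: only three groups
print(is_ipv4("01.02.03.004"))  # True: leading zeros pass
```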

Taken from here

Source: https://habr.com/ru/post/66931/

