
Eloquent JavaScript: Regular Expressions





Some people, when confronted with a problem, think: "I know, I'll use regular expressions." Now they have two problems.
Jamie Zawinski

Yuan-Ma said, "When you cut against the grain of the wood, much strength is needed. When you program against the grain of the problem, much code is needed."
Master Yuan-Ma, The Book of Programming

Programming tools and techniques survive and spread in a chaotic, evolutionary way. It is not always the pretty or brilliant ones that win, but rather the ones that work well enough in their niche, for example because they are integrated into another successful piece of technology.

In this chapter, we will discuss one such tool: regular expressions. They are a way to describe patterns in string data. They form a small, separate language that is part of JavaScript and of many other languages and tools.

Regular expressions are both very strange and extremely useful. Their syntax is cryptic, and the programming interface JavaScript provides for them is clumsy. But they are a powerful tool for inspecting and processing strings. Once you understand them, you will be a more effective programmer.

Creating a regular expression


A regular expression is a type of object. It can be created by calling the RegExp constructor or by writing the pattern as a literal enclosed in forward slashes.

    var re1 = new RegExp("abc");
    var re2 = /abc/;


Both of these regular expression objects represent the same pattern: an "a" character followed by a "b" followed by a "c".

When using the RegExp constructor, the pattern is written as a normal string, so the usual rules for backslashes apply.

The second notation, where the pattern appears between slash characters, treats backslashes somewhat differently. First, since a forward slash ends the pattern, we need to put a backslash before any forward slash that we want to be part of the pattern. In addition, backslashes that are not part of special character codes (like \n) are preserved, rather than ignored as they are in strings, and change the meaning of the pattern. Some characters, such as question marks and plus signs, have special meanings in regular expressions and must be preceded by a backslash if they are meant to represent the character itself.

    var eighteenPlus = /eighteen\+/;


Knowing precisely which characters to escape with a backslash requires learning the full list of characters that have a special meaning. For the time being, that may not be realistic, so when in doubt, just put a backslash before any character that is not a letter, number, or whitespace.
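As a small illustration of how the two notations treat backslashes, the sketch below writes one pattern three ways. The variable names are made up for this example.

```javascript
// The same pattern, "digits, a slash, digits", in both notations.
var literalForm = /\d+\/\d+/;                   // slash escaped, \d kept as-is
var constructorForm = new RegExp("\\d+/\\d+");  // backslashes doubled in the string

console.log(literalForm.test("3/4"));
// → true
console.log(constructorForm.test("3/4"));
// → true

// With single backslashes, the string parser drops them before RegExp ever
// sees the pattern, so "\d+" silently becomes "d+".
var broken = new RegExp("\d+/\d+");
console.log(broken.test("3/4"));
// → false
console.log(broken.test("dd/dd"));
// → true
```

The last case is the one that tends to bite: no error is raised, the pattern simply means something else.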

Testing for matches


Regular expression objects have a number of methods. The simplest one is test. If you pass it a string, it returns a Boolean telling you whether the string contains a match of the pattern in the expression.

    console.log(/abc/.test("abcde"));
    // → true
    console.log(/abc/.test("abxde"));
    // → false


A regular expression consisting of only nonspecial characters simply represents that sequence of characters. If abc occurs anywhere in the string we are testing against (not just at the start), test returns true.

Matching a set of characters


Finding out whether a string contains abc could just as well be done with a call to indexOf. Regular expressions allow us to go beyond that and express more complicated patterns.

Say we want to match any digit. In a regular expression, putting a set of characters between square brackets makes that part of the expression match any of the characters between the brackets.

Both of the following expressions match all strings that contain a digit.

    console.log(/[0123456789]/.test("in 1992"));
    // → true
    console.log(/[0-9]/.test("in 1992"));
    // → true


Within square brackets, a dash between two characters indicates a range of characters, where the ordering is determined by the characters' Unicode numbers. The characters 0 to 9 sit right next to each other in this ordering (codes 48 to 57), so [0-9] covers all of them and matches any digit.

A number of common character groups have their own built-in shortcuts.

\d    Any digit character
\w    An alphanumeric character ("word character")
\s    Any whitespace character (space, tab, newline, and similar)
\D    A character that is not a digit
\W    A nonalphanumeric character
\S    A nonwhitespace character
.     Any character except for newline

So you could match a date and time format like 30-01-2003 15:20 with the following expression:

    var dateTime = /\d\d-\d\d-\d\d\d\d \d\d:\d\d/;
    console.log(dateTime.test("30-01-2003 15:20"));
    // → true
    console.log(dateTime.test("30-jan-2003 15:20"));
    // → false


That looks awful, doesn't it? It has far too many backslashes, which make it hard to spot the actual pattern. We will see a slightly improved version later.

These backslash codes can also be used inside square brackets. For example, [\d.] means any digit or a period character. Note that the period itself, when used inside square brackets, loses its special meaning and is just a period. The same goes for other special characters, such as +.

To invert a set of characters, that is, to express that you want to match any character except the ones in the set, write a caret (^) right after the opening bracket.

    var notBinary = /[^01]/;
    console.log(notBinary.test("1100100010100110"));
    // → false
    console.log(notBinary.test("1100100010200110"));
    // → true


Repeating parts of a pattern


We now know how to match a single digit. What if we want to match a whole number, a sequence of one or more digits?

When you put a plus sign (+) after something in a regular expression, it indicates that the element may be repeated more than once. Thus, /\d+/ matches one or more digit characters.

    console.log(/'\d+'/.test("'123'"));
    // → true
    console.log(/'\d+'/.test("''"));
    // → false
    console.log(/'\d*'/.test("'123'"));
    // → true
    console.log(/'\d*'/.test("''"));
    // → true


The star (*) has a similar meaning but also allows the pattern to match zero times. Something with a star after it never prevents a pattern from matching: it simply matches zero instances if it cannot find any suitable text.

A question mark makes a part of a pattern optional, meaning it may occur zero times or once. In the following example, the u character is allowed to occur, but the pattern also matches when it is missing.

    var neighbor = /neighbou?r/;
    console.log(neighbor.test("neighbour"));
    // → true
    console.log(neighbor.test("neighbor"));
    // → true


To indicate that a pattern should occur a precise number of times, use curly braces. Putting {4} after an element requires it to occur exactly four times in a row. You can also specify a range this way: {2,4} means the element must occur at least twice and at most four times.

Here is another version of the date and time pattern that allows one- or two-digit days, months, and hours. It is also slightly more readable.

    var dateTime = /\d{1,2}-\d{1,2}-\d{4} \d{1,2}:\d{2}/;
    console.log(dateTime.test("30-1-2003 8:45"));
    // → true


You can also specify open-ended ranges by omitting the number after the comma: {5,} means five or more times. The lower bound cannot be omitted in JavaScript. Write {0,5} for zero to five times, because {,5} is treated as literal text rather than as a repetition count.
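A short check of the brace forms may help here. The variable names are illustrative only; the last two lines show that, in JavaScript, braces without a lower bound are matched as literal characters instead of acting as a repetition count.

```javascript
var upToThree = /^a{0,3}$/;   // zero to three "a" characters
var atLeastTwo = /^a{2,}$/;   // two or more "a" characters

console.log(upToThree.test(""));
// → true
console.log(upToThree.test("aaaa"));
// → false
console.log(atLeastTwo.test("a"));
// → false
console.log(atLeastTwo.test("aaaaa"));
// → true

// No lower bound: the braces are ordinary characters here.
var noLowerBound = /^a{,3}$/;
console.log(noLowerBound.test("aa"));
// → false
console.log(noLowerBound.test("a{,3}"));
// → true
```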

Grouping subexpressions


To use an operator like * or + on more than one element at a time, you can use parentheses. A part of a regular expression that is enclosed in parentheses counts as a single element as far as the operators following it are concerned.

    var cartoonCrying = /boo+(hoo+)+/i;
    console.log(cartoonCrying.test("Boohoooohoohooo"));
    // → true


The first and second + characters apply only to the second o in boo and hoo, respectively. The third + applies to the whole group (hoo+), matching one or more sequences like that.

The i at the end of the expression makes this regular expression case insensitive, so that the uppercase B in the input matches the lowercase b in the pattern.

Matches and groups


The test method is the simplest way to match a regular expression. It tells you only whether a match was found and nothing else. Regular expressions also have an exec (execute) method that returns null if no match was found and otherwise returns an object with information about the match.

    var match = /\d+/.exec("one two 100");
    console.log(match);
    // → ["100"]
    console.log(match.index);
    // → 8


An object returned from exec has an index property that tells us where in the string the successful match begins. Other than that, the object looks like (and in fact is) an array of strings, whose first element is the string that was matched. In the previous example, this is the sequence of digits we were looking for.
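Because exec returns null when nothing is found, code that reads the result usually checks for that first. Below is a minimal sketch of that habit; firstNumber is a hypothetical helper written for this illustration, not part of any library.

```javascript
// Returns the first run of digits in the text together with its position,
// or null when the text contains no digits at all.
function firstNumber(text) {
  var match = /\d+/.exec(text);
  if (match === null) return null;   // exec found nothing
  return {value: Number(match[0]), index: match.index};
}

console.log(firstNumber("one two 100"));
// → {value: 100, index: 8}
console.log(firstNumber("no digits"));
// → null
```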

String values have a match method that behaves similarly.

    console.log("one two 100".match(/\d+/));
    // → ["100"]


When the regular expression contains subexpressions grouped with parentheses, the text that matched those groups also shows up in the array. The whole match is always the first element. The next element is the part matched by the first group (the one whose opening parenthesis comes first in the expression), then the second group, and so on.

    var quotedText = /'([^']*)'/;
    console.log(quotedText.exec("she said 'hello'"));
    // → ["'hello'", "hello"]


When a group does not end up being matched at all (for example, when it is followed by a question mark), its position in the output array holds undefined. Similarly, when a group is matched multiple times, only the last match ends up in the array.

    console.log(/bad(ly)?/.exec("bad"));
    // → ["bad", undefined]
    console.log(/(\d)+/.exec("123"));
    // → ["123", "3"]


Groups are useful for extracting parts of a string. If we don't just want to verify whether a string contains a date but also want to extract it and construct an object that represents it, we can wrap parentheses around the digit patterns and pick the date out of the result of exec.

But first, a brief detour, in which we look at the preferred way to store date and time values in JavaScript.

The date type


JavaScript has a standard object type for representing dates, or rather, points in time. It is called Date. If you simply create a date object using new, you get the current date and time.

    console.log(new Date());
    // → Sun Nov 09 2014 00:07:57 GMT+0100 (CET)


You can also create an object for a specific time.

    console.log(new Date(2015, 9, 21));
    // → Wed Oct 21 2015 00:00:00 GMT+0200 (CEST)
    console.log(new Date(2009, 11, 9, 12, 59, 59, 999));
    // → Wed Dec 09 2009 12:59:59 GMT+0100 (CET)


JavaScript uses a convention where month numbers start at zero (so December is 11), yet day numbers start at one. This is confusing and silly. Be careful.

The last four arguments (hours, minutes, seconds, and milliseconds) are optional and are taken to be zero when not given.

Timestamps are stored as the number of milliseconds since the start of 1970. Negative numbers are used for times before 1970 (following the convention set by "Unix time", which was invented around then). The getTime method on a date object returns this number. It is big, as you can imagine.

    console.log(new Date(2013, 11, 19).getTime());
    // → 1387407600000
    console.log(new Date(1387407600000));
    // → Thu Dec 19 2013 00:00:00 GMT+0100 (CET)


If you give the Date constructor a single argument, that argument is treated as such a millisecond count. You can get the current millisecond count by creating a new Date object and calling getTime on it, or by calling the Date.now function.

Date objects provide the methods getFullYear, getMonth, getDate, getHours, getMinutes, and getSeconds to extract their components. There is also a getYear method, which returns the year minus 1900 (such as 93 or 114) and is best avoided.
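To make the zero-based month convention concrete, here is a small sketch that builds a date and reads its components back. The variable name moment is arbitrary. Because the constructor and the getters both use local time, the values do not depend on the time zone.

```javascript
var moment = new Date(2009, 11, 9, 12, 59, 59, 999);

console.log(moment.getFullYear());
// → 2009
console.log(moment.getMonth());   // months count from zero, so 11 is December
// → 11
console.log(moment.getDate());    // days count from one
// → 9
console.log(moment.getHours());
// → 12

// A round trip through the millisecond count yields an equal moment.
var copy = new Date(moment.getTime());
console.log(copy.getTime() === moment.getTime());
// → true
```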

By putting parentheses around the parts of the expression that we are interested in, we can now create a date object directly from a string.

    function findDate(string) {
      var dateTime = /(\d{1,2})-(\d{1,2})-(\d{4})/;
      var match = dateTime.exec(string);
      // Groups are day, month, year; months are zero-based in Date
      return new Date(Number(match[3]),
                      Number(match[2]) - 1,
                      Number(match[1]));
    }
    console.log(findDate("30-1-2003"));
    // → Thu Jan 30 2003 00:00:00 GMT+0100 (CET)


Word and string boundaries


Unfortunately, findDate will also happily extract the nonsensical date 00-1-3000 from the string "100-1-30000". A match may happen anywhere in the string, so in this case it simply starts at the second character and ends at the second-to-last character.

If we want to enforce that the match must span the whole string, we can add the markers ^ and $. The caret matches the start of the input string, and the dollar sign matches the end. So /^\d+$/ matches a string consisting entirely of one or more digits, /^!/ matches any string that starts with an exclamation mark, and /x^/ does not match any string (there cannot be an x before the start of the string).

If, on the other hand, we just want to make sure the date starts and ends on a word boundary, we can use the marker \b. A word boundary can be the start or end of the string, or any point in the string that has a word character (as in \w) on one side and a nonword character on the other.

    console.log(/cat/.test("concatenate"));
    // → true
    console.log(/\bcat\b/.test("concatenate"));
    // → false


Note that a boundary marker does not represent an actual character. It only enforces that the regular expression matches when a certain condition holds at the place where it appears in the pattern.
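One way to apply this to the date example is sketched below. The name findDateStrict is hypothetical; it adds \b markers to the pattern used by findDate and, as an extra assumption not in the original function, returns null when no date is found instead of failing on a null match.

```javascript
function findDateStrict(string) {
  // \b on both sides keeps the match from starting or ending inside a
  // longer run of digits.
  var dateTime = /\b(\d{1,2})-(\d{1,2})-(\d{4})\b/;
  var match = dateTime.exec(string);
  if (match === null) return null;
  return new Date(Number(match[3]),
                  Number(match[2]) - 1,
                  Number(match[1]));
}

console.log(findDateStrict("100-1-30000"));
// → null
console.log(findDateStrict("seen on 30-1-2003").getFullYear());
// → 2003
```

Using ^ and $ instead of \b would be the stricter choice when the whole input is supposed to be a date and nothing else.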

Choice patterns


Say we want to know whether a piece of text contains not only a number but a number followed by one of the words pig, cow, or chicken, or any of their plural forms.

We could write three regular expressions and test them in turn, but there is a nicer way. The pipe character (|) denotes a choice between the pattern to its left and the pattern to its right. So we can say this:

    var animalCount = /\b\d+ (pig|cow|chicken)s?\b/;
    console.log(animalCount.test("15 pigs"));
    // → true
    console.log(animalCount.test("15 pigchickens"));
    // → false


Parentheses can be used to limit the part of the pattern that the pipe operator applies to, and you can put several such operators next to each other to express a choice between more than two patterns.

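The mechanics of matching


Regular expressions can be thought of as flow diagrams. The diagram for the livestock expression in the previous example consists of a word boundary box, a digit box with a path looping back in front of it, a box holding a single space, a three-way branch with boxes for "pig", "cow", and "chicken", an "s" box that can be bypassed, and a final word boundary box.

Our expression matches a string if we can find a path from the left side of the diagram to the right side. We keep a current position in the string, and every time we move through a box, we verify that the part of the string right after our current position matches that box.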

So if we try to match "the 3 pigs" with our regular expression, our progress through the flow chart looks like this:

- At position 4, there is a word boundary, so we can move past the first box.
- Still at position 4, we find a digit, so we can also move past the second box.
- At position 5, one path loops back to before the second (digit) box, and the other moves forward through the box that holds a single space. There is a space here, not a digit, so we must take the second path.
- We are now at position 6 (the start of "pigs") and at the three-way branch. We don't see "cow" or "chicken" here, but we do see "pig", so we take that branch.
- At position 9, after the three-way branch, one path skips the "s" box and goes straight to the final word boundary, and the other path matches an "s". There is an "s" here, so we go through the "s" box.
- At position 10 we are at the end of the string and can match only a word boundary. The end of a string counts as a word boundary, so we go through the last box and have successfully matched the pattern.

Conceptually, a regular expression engine looks for a match as follows: it starts at the beginning of the string and tries to match there. In this case there is a word boundary, so it gets past the first box, but there is no digit, so it fails at the second box. Then it moves on to the second character in the string and tries to begin a new match there. And so on, until it finds a match or reaches the end of the string and concludes that there is no match.

Backtracking


The regular expression /\b([01]+b|\d+|[\da-f]+h)\b/ matches either a binary number followed by a b, a decimal number with no suffix, or a hexadecimal number (digits from 0 to 9 or letters from a to f) followed by an h. In its diagram, the three alternatives form three parallel branches between two word boundary boxes.

When matching this expression, it will often happen that the top (binary) branch is entered even though the input does not actually contain a binary number. When matching the string "103", for example, it becomes clear only at the 3 that we are in the wrong branch. The string does match the expression, just not the branch we are currently in.

So the matcher backtracks. When entering a branch, it remembers its current position (in this case, the start of the string, just past the first boundary box) so that it can go back and try another branch if the current one does not work out. For the string "103", after encountering the 3, it goes back and tries the branch for decimal numbers. This one matches, so a match is reported after all.

The matcher stops as soon as it finds a full match. This means that if several branches could potentially match a string, only the first one (in the order in which the branches appear in the regular expression) is used.

Backtracking also happens for repetition operators like + and *. If you match /^.*x/ against "abcxe", the .* part first tries to consume the whole string. The engine then realizes that it also needs an x. Since there is no x past the end of the string, the star operator tries to match one character less. But the matcher does not find an x after abcx either, so it backtracks again, matching the star operator to just abc. Now it finds an x where it needs it and reports a successful match from position 0 to 4.

It is possible to write regular expressions that do a lot of backtracking. This problem occurs when a pattern can match a piece of input in many different ways. For example, if we get confused while writing a regular expression for binary numbers, we might accidentally write something like /([01]+)+b/.



If that tries to match a long series of zeros and ones with no trailing b, the matcher first goes through the inner loop until it runs out of digits. Then it notices there is no b, so it backtracks one position, goes through the outer loop once, gives up again, and tries to backtrack out of the inner loop once more. It continues to try every possible route through these two loops. This means the amount of work doubles with each additional character. For even just a few dozen characters, the match will take practically forever.
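The effect can be observed on a small scale. The sketch below compares the accidental pattern with a version that drops the outer loop; both accept the same strings. The helper timeTest is hypothetical, the input is deliberately kept to 20 digits so that the slow case still finishes quickly, and the actual timings depend on the engine, so no expected numbers are given.

```javascript
// Two patterns that accept the same strings. The first nests one loop inside
// another, so a run of digits can be split up in exponentially many ways.
var nested = /([01]+)+b/;
var flat = /[01]+b/;

// Milliseconds taken by a single test call.
function timeTest(regexp, input) {
  var start = Date.now();
  regexp.test(input);
  return Date.now() - start;
}

console.log(nested.test("0101b"), flat.test("0101b"));
// → true true

var input = "01010101010101010101"; // 20 digits and no trailing "b"
console.log(flat.test(input));
// → false
console.log("nested:", timeTest(nested, input), "ms");
console.log("flat:", timeTest(flat, input), "ms");
```

Adding a few more digits to input should make the gap between the two timings grow quickly on a backtracking engine.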

The replace method


String values have a replace method, which can be used to replace part of the string with another string.

    console.log("papa".replace("p", "m"));
    // → mapa


The first argument can also be a regular expression, in which case the first match of the regular expression is replaced. When a g option (for global) is added to the regular expression, all matches in the string are replaced, not just the first.

    console.log("Borobudur".replace(/[ou]/, "a"));
    // → Barobudur
    console.log("Borobudur".replace(/[ou]/g, "a"));
    // → Barabadar


It would have been sensible if the choice between replacing one match or all matches were made through an additional argument to replace or through a separate method such as replaceAll. Unfortunately, the choice relies on an option of the regular expression instead.

The real power of using regular expressions with replace comes from the fact that we can refer back to matched groups in the replacement string. For example, say we have a big string containing the names of people, one name per line, in the format "Lastname, Firstname". If we want to swap these names and remove the comma to get a simple "Firstname Lastname" format, we can use the following code:

    console.log(
      "Hopper, Grace\nMcCarthy, John\nRitchie, Dennis"
        .replace(/([\w ]+), ([\w ]+)/g, "$2 $1"));
    // → Grace Hopper
    //   John McCarthy
    //   Dennis Ritchie


The $1 and $2 in the replacement string refer to the parenthesized groups in the pattern. $1 is replaced by the text that matched the first group, $2 by the second, and so on, up to $9. The whole match can be referred to with $&.
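As a quick illustration of the whole-match reference, the following sketch wraps every number in angle brackets and swaps two groups. The strings and variable names are invented for the example.

```javascript
// $& stands for the complete matched text.
var marked = "in 1992 and 2003".replace(/\d+/g, "<$&>");
console.log(marked);
// → in <1992> and <2003>

// $1 and $2 stand for the first and second group.
var swapped = "a-b".replace(/(\w)-(\w)/, "$2-$1");
console.log(swapped);
// → b-a
```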

It is also possible to pass a function, rather than a string, as the second argument to replace. For each replacement, the function is called with the whole match and the matched groups as arguments, and its return value is inserted into the new string.

Here is a simple example:

    var s = "the cia and fbi";
    console.log(s.replace(/\b(fbi|cia)\b/g, function(str) {
      return str.toUpperCase();
    }));
    // → the CIA and FBI


And here is a more interesting one:

    var stock = "1 lemon, 2 cabbages, and 101 eggs";
    function minusOne(match, amount, unit) {
      amount = Number(amount) - 1;
      if (amount == 1) // only one left, remove the 's'
        unit = unit.slice(0, unit.length - 1);
      else if (amount == 0)
        amount = "no";
      return amount + " " + unit;
    }
    console.log(stock.replace(/(\d+) (\w+)/g, minusOne));
    // → no lemon, 1 cabbage, and 100 eggs


This takes a string, finds all occurrences of a number followed by an alphanumeric word, and returns a string in which every such number is decremented by one.

The (\d+) group ends up as the amount argument to the function, and the (\w+) group is bound to unit. The function converts amount to a number, which always works because it matched \d+, and then adjusts the wording in case only one or zero items are left.

Greed


It is not hard to use replace to write a function that removes all comments from a piece of JavaScript code. Here is a first attempt:

    function stripComments(code) {
      return code.replace(/\/\/.*|\/\*[^]*\*\//g, "");
    }
    console.log(stripComments("1 + /* 2 */3"));
    // → 1 + 3
    console.log(stripComments("x = 10;// ten!"));
    // → x = 10;
    console.log(stripComments("1 /* a */+/* b */ 1"));
    // → 1  1


The part before the or operator matches two slash characters followed by any number of non-newline characters. The part for multiline comments is more involved. We use [^] (any character that is not in the empty set of characters) as a way to match any character. We cannot just use a dot here because block comments can continue on a new line, and dots do not match the newline character.

But the output of the last example is wrong. Why?

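The [^]* part of the expression, as described in the section on backtracking, first matches as much as it can. If that causes the next part of the pattern to fail, the matcher moves back one character and tries again from there. In the example, the matcher first tries to match the whole rest of the string and then moves back from there. It finds an occurrence of */ after going back four characters and matches that. This is not what we wanted: the intention was to match a single comment, not to go all the way to the end of the code and find the end of the last block comment.

Because of this behavior, we say the repetition operators (+, *, ?, and {}) are greedy, meaning they match as much as they can and backtrack from there. If you put a question mark after them (+?, *?, ??, {}?), they become nongreedy and start by matching as little as possible, matching more only when the remaining pattern does not fit the smaller match.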

And that is exactly what we want in this case. By having the star match the smallest stretch of characters that brings us to a */, we consume one block comment and nothing more.

    function stripComments(code) {
      return code.replace(/\/\/.*|\/\*[^]*?\*\//g, "");
    }
    console.log(stripComments("1 /* a */+/* b */ 1"));
    // → 1 + 1


A lot of bugs in regular expression programs can be traced to unintentionally using a greedy operator where a nongreedy one would work better. When using a repetition operator, consider the nongreedy variant first.

Dynamically creating RegExp objects


There are cases where you do not know the exact pattern you need to match against when you are writing your code. Say you want to look for the user's name in a piece of text and enclose it in underscore characters. Since you will know the name only once the program is running, you cannot use the slash-based notation.

But you can build up a string and use the RegExp constructor on that. Here is an example:

    var name = "harry";
    var text = "Harry is a suspicious character.";
    var regexp = new RegExp("\\b(" + name + ")\\b", "gi");
    console.log(text.replace(regexp, "_$1_"));
    // → _Harry_ is a suspicious character.


When creating the \b boundary markers, we have to use two backslashes because we are writing them in a normal string, not in a slash-enclosed regular expression. The second argument to the RegExp constructor contains the options for the regular expression, in this case "gi" for global and case insensitive.

But what if the name is "dea+hl[]rd" because our user fancies themselves a hacker? That would result in a nonsensical regular expression that does not actually match the user's name.

To work around this, we can add backslashes before any character that we do not trust. Adding backslashes before letters is a bad idea because things like \b and \n have a special meaning. But escaping everything that is not alphanumeric or whitespace is safe.

    var name = "dea+hl[]rd";
    var text = "This dea+hl[]rd guy is super annoying.";
    var escaped = name.replace(/[^\w\s]/g, "\\$&");
    var regexp = new RegExp("\\b(" + escaped + ")\\b", "gi");
    console.log(text.replace(regexp, "_$1_"));
    // → This _dea+hl[]rd_ guy is super annoying.


The search method


The indexOf method on strings cannot be called with a regular expression. But there is another method, search, which does expect a regular expression. Like indexOf, it returns the first index at which the expression was found, or -1 when it was not found.

    console.log("  word".search(/\S/));
    // → 2
    console.log("    ".search(/\S/));
    // → -1


Unfortunately, there is no way to indicate that the match should start at a given offset (as we can with the second argument to indexOf), which would often be useful.
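One possible workaround, sketched here under the assumption that the pattern does not rely on ^ or \b at the cut point, is to search a slice of the string and add the offset back. The function searchFrom is a hypothetical helper, not a built-in method.

```javascript
// Like search, but only considers matches that start at or after `start`.
function searchFrom(string, regexp, start) {
  var found = string.slice(start).search(regexp);
  return found == -1 ? -1 : found + start;
}

console.log(searchFrom("a1b2c3", /\d/, 0));
// → 1
console.log(searchFrom("a1b2c3", /\d/, 2));
// → 3
console.log(searchFrom("a1b2c3", /\d/, 6));
// → -1
```

Slicing changes what counts as the start of the string, so a pattern beginning with ^ or \b may match at the cut where it would not match in the full string.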

The lastIndex property


The exec method similarly does not provide a convenient way to start searching from a given position in the string. But it does provide an inconvenient way.

Regular expression objects have properties. One such property is source, which contains the string the expression was created from. Another is lastIndex, which controls, in some limited circumstances, where the next match will start.

Those circumstances are that the regular expression must have the global (g) option enabled and that the match must happen through the exec method. A more sensible solution would have been to allow an extra argument to be passed to exec, but sensibleness is not a defining characteristic of JavaScript's regular expression interface.

    var pattern = /y/g;
    pattern.lastIndex = 3;
    var match = pattern.exec("xyzzy");
    console.log(match.index);
    // → 4
    console.log(pattern.lastIndex);
    // → 5


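If the match was successful, the call to exec automatically updates the lastIndex property to point after the match. If no match was found, lastIndex is set back to zero, which is also the value it has in a newly constructed regular expression object.

When using a global regular expression value for multiple exec calls, these automatic updates to lastIndex can cause problems. Your regular expression might accidentally start at an index that was left over from a previous call.

    var digit = /\d/g;
    console.log(digit.exec("here it is: 1"));
    // → ["1"]
    console.log(digit.exec("and now: 1"));
    // → null


Another interesting effect of the g option is that it changes the way the match method on strings works. When called with a global expression, instead of returning an array similar to the one returned by exec, match finds all matches of the pattern in the string and returns an array containing the matched strings.

    console.log("Banana".match(/an/g));
    // → ["an", "an"]


So be cautious with global regular expressions. The cases where they are necessary (calls to replace and places where you want to use lastIndex explicitly) are typically the only places where you want to use them.



A common task is to scan through all occurrences of a pattern in a string, in a way that gives us access to the match object in the loop body, by using lastIndex and exec.

    var input = "A string with 3 numbers in it... 42 and 88.";
    var number = /\b(\d+)\b/g;
    var match;
    while (match = number.exec(input))
      console.log("Found", match[1], "at", match.index);
    // → Found 3 at 14
    //   Found 42 at 33
    //   Found 88 at 40


This makes use of the fact that the value of an assignment expression is the assigned value. By using match = number.exec(input) as the condition in the while statement, we perform the match at the start of each iteration, save its result in a variable, and stop looping when no more matches are found.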
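The loop can be wrapped in a function so that the leftover lastIndex never leaks into other code. The sketch below is one way to do that; allMatches is a hypothetical helper, and it makes the simplifying assumption that only the pattern text matters, so options such as i on the original expression are not carried over.

```javascript
// Collects every match of the pattern in the input as {text, index} objects.
function allMatches(regexp, input) {
  // Work on a private global copy so the caller's lastIndex is untouched.
  var scanner = new RegExp(regexp.source, "g");
  var results = [], match;
  while (match = scanner.exec(input)) {
    results.push({text: match[0], index: match.index});
    // An empty match does not advance lastIndex; step forward manually
    // to avoid looping forever.
    if (match[0] === "") scanner.lastIndex++;
  }
  return results;
}

console.log(allMatches(/\d+/, "3 and 42"));
// → [{text: "3", index: 0}, {text: "42", index: 6}]
```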

Parsing an INI file



To conclude the chapter, we will look at a problem that calls for regular expressions. Imagine we are writing a program to automatically harvest information about our enemies from the Internet. (We will not actually write that whole program here, just the part that reads the configuration file. Sorry.) The configuration file looks like this:

    searchengine=http://www.google.com/search?q=$1
    spitefulness=9.7

    ; comments are preceded by a semicolon...
    ; each section concerns an individual enemy
    [larry]
    fullname=Larry Doe
    type=kindergarten bully
    website=http://www.geocities.com/CapeCanaveral/11451

    [gargamel]
    fullname=Gargamel
    type=evil sorcerer
    outputdir=/home/marijn/enemies/gargamel


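The exact rules for this format (which is widely used and usually called an INI file) are as follows:

- Blank lines and lines starting with a semicolon are ignored.
- Lines wrapped in [ and ] start a new section.
- Lines containing an alphanumeric identifier followed by an = character add a setting to the current section.

Anything else is invalid.

Our task is to convert a string like this into an array of objects, each with a name property and an array of settings. We need one such object for each section and one for the global settings at the top.

Since the format has to be processed line by line, splitting the file into separate lines is a good start. In Chapter 6 we used string.split("\n") to do this. Some operating systems, however, separate lines not with just a newline character (\n) but with a carriage return followed by a newline (\r\n). Given that the split method also accepts a regular expression as its argument, we can split on /\r?\n/ to allow both \n and \r\n between lines.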

    function parseINI(string) {
      // Start with an object to hold the top-level fields
      var currentSection = {name: null, fields: []};
      var categories = [currentSection];

      string.split(/\r?\n/).forEach(function(line) {
        var match;
        if (/^\s*(;.*)?$/.test(line)) {
          return;
        } else if (match = line.match(/^\[(.*)\]$/)) {
          currentSection = {name: match[1], fields: []};
          categories.push(currentSection);
        } else if (match = line.match(/^(\w+)=(.*)$/)) {
          currentSection.fields.push({name: match[1],
                                      value: match[2]});
        } else {
          throw new Error("Line '" + line + "' is invalid.");
        }
      });

      return categories;
    }


This code goes over every line in the file, updating the "current section" object as it goes along. First, it checks whether the line can be ignored, using the expression /^\s*(;.*)?$/. Do you see how it works? The part between the parentheses matches comments, and the ? after it makes sure the expression also matches lines containing only whitespace.

If the line is not a comment, the code checks whether it starts a new section. If so, it creates a new current section object, to which subsequent settings are added.

The last meaningful possibility is that the line is a normal setting, in which case it is added to the current section object.

If a line matches none of these forms, the function throws an error.

Note the recurring use of ^ and $ to make sure the expression matches the whole line, not just part of it. Leaving these out results in code that mostly works but behaves strangely for some input, which can be a difficult bug to track down.
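To see what kind of strange result the anchors prevent, consider a setting whose value happens to contain brackets. In this sketch the line and its example.com URL are made up, and unanchored is the section pattern with ^ and $ removed.

```javascript
var anchored = /^\[(.*)\]$/;   // the section pattern as used in parseINI
var unanchored = /\[(.*)\]/;   // the same pattern without ^ and $

var line = "website=http://example.com/[draft]";

console.log(anchored.test(line));
// → false
console.log(unanchored.test(line));
// → true
console.log(unanchored.exec(line)[1]);
// → draft
```

Without the anchors, this setting would be mistaken for the start of a section named "draft".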

The pattern if (match = string.match(...)) is similar to the trick of using an assignment as the condition in a while loop. You often cannot be sure that a call to match will succeed, so you can access the resulting object only inside an if statement that tests for this. To avoid breaking the pleasant chain of if forms, we assign the result of the match to a variable and immediately use that assignment as the test.

International characters


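Because of JavaScript's initial simplistic implementation, and the fact that this simplistic approach was later set in stone as standard behavior, JavaScript's regular expressions are rather dumb about characters that do not appear in the English language. For example, as far as JavaScript's regular expressions are concerned, a "word character" is only one of the 26 letters of the Latin alphabet (uppercase or lowercase), a decimal digit, or, for some reason, the underscore character. Things like é or β, which most definitely are word characters, will not match \w (and will match \W, the nonword category).

By a strange historical accident, \s (whitespace) does not have this problem and matches all characters that the Unicode standard considers whitespace, such as the nonbreaking space.

Some regular expression implementations in other programming languages have syntax to match specific Unicode character categories, such as "all uppercase letters", "all punctuation", or "control characters". There were plans to add support for such categories to JavaScript, but at the time of writing they did not look likely to be realized soon.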

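Summary



Regular expressions are objects that represent patterns in strings. They use their own syntax to express these patterns.

/abc/       A sequence of characters
/[abc]/     Any character from a set of characters
/[^abc]/    Any character not in a set of characters
/[0-9]/     Any character in a range of characters
/x+/        One or more occurrences of the pattern x
/x+?/       One or more occurrences, nongreedy
/x*/        Zero or more occurrences
/x?/        Zero or one occurrence
/x{2,4}/    Between two and four occurrences
/(abc)/     A group
/a|b|c/     Any one of several patterns
/\d/        Any digit character
/\w/        An alphanumeric character ("word character")
/\s/        Any whitespace character
/./         Any character except newlines
/\b/        A word boundary
/^/         Start of input
/$/         End of input

A regular expression has a method test to check whether a given string matches it. It also has an exec method that, when a match is found, returns an array containing all matched groups. Such an array has an index property that indicates where the match started.

Strings have a match method to match them against a regular expression and a search method to search for one, returning only the starting position of the match. Their replace method can replace matches of a pattern with a replacement string. Alternatively, you can pass a function to replace, which is used to build up a replacement string based on the match text and matched groups.

Regular expressions can have options, which are written after the closing slash. The i option makes the match case insensitive, and the g option makes the expression global, which, among other things, causes the replace method to replace all instances instead of just the first.

The RegExp constructor can be used to create a regular expression value from a string.

Regular expressions are a sharp tool with an awkward handle. They simplify some tasks tremendously but can quickly become unmanageable when applied to complex problems. Part of knowing how to use them is resisting the urge to shoehorn into them things that they cannot sanely express.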

Exercises


It is almost unavoidable that, while working on these exercises, you will get confused and frustrated by some regular expression's inexplicable behavior. Sometimes it helps to enter the expression into an online tool like debuggex.com to see whether its visualization corresponds to what you intended and to experiment with how it responds to various input strings.

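Regexp golf

"Code golf" is a term used for the game of trying to express a particular program in as few characters as possible. Similarly, regexp golf is the practice of writing as tiny a regular expression as possible to match a given pattern, and only that pattern.

For each of the following items, write a regular expression to test whether any of the given substrings occur in a string. The regular expression should match only strings containing one of the substrings described. Do not worry about word boundaries unless explicitly mentioned. When your expression works, see whether you can make it any smaller.

- car and cat
- pop and prop
- ferret, ferry, and ferrari
- Any word ending in ious
- A whitespace character followed by a dot, comma, colon, or semicolon
- A word longer than six letters
- A word without the letter e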

    // Fill in the regular expressions

    verify(/.../, ["my car", "bad cats"], ["camper", "high art"]);
    verify(/.../, ["pop culture", "mad props"], ["plop"]);
    verify(/.../, ["ferret", "ferry", "ferrari"], ["ferrum", "transfer A"]);
    verify(/.../, ["how delicious", "spacious room"], ["ruinous", "consciousness"]);
    verify(/.../, ["bad punctuation ."], ["escape the dot"]);
    verify(/.../, ["hottentottententen"], ["no", "hotten totten tenten"]);
    verify(/.../, ["red platypus", "wobbling nest"], ["earth bed", "learning ape"]);

    function verify(regexp, yes, no) {
      // Ignore unfinished exercises
      if (regexp.source == "...") return;
      yes.forEach(function(s) {
        if (!regexp.test(s))
          console.log("Failure to match '" + s + "'");
      });
      no.forEach(function(s) {
        if (regexp.test(s))
          console.log("Unexpected match for '" + s + "'");
      });
    }



Quoting style

Imagine you have written a story and used single quotation marks throughout to mark pieces of dialogue. Now you want to replace all the dialogue quotes with double quotes, while keeping the single quotes used in contractions like aren't.

Think of a pattern that distinguishes these two kinds of quote usage and write a call to the replace method that does the proper replacement.

Numbers again

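A series of digits can be matched by the simple regular expression /\d+/.

Write an expression that matches only JavaScript-style numbers. It must support an optional minus or plus sign in front of the number, the decimal dot, and exponent notation (5e-3 or 1E10), again with an optional sign in front of the exponent. Also note that it is not necessary for there to be digits in front of or after the dot, but the number cannot be a dot alone. That is, .5 and 5. are valid JavaScript numbers, but a lone dot is not.

    // Fill in this regular expression.
    var number = /^...$/;

    // Tests:
    ["1", "-1", "+15", "1.55", ".5", "5.", "1.3e2", "1E-4",
     "1e+12"].forEach(function(s) {
      if (!number.test(s))
        console.log("Failed to match '" + s + "'");
    });
    ["1a", "+-1", "1.2.3", "1+1", "1e4.5", ".5.", "1f5",
     "."].forEach(function(s) {
      if (number.test(s))
        console.log("Incorrectly accepted '" + s + "'");
    });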

Source: https://habr.com/ru/post/242695/

