Yesterday, May 23, Perl 5.30.0 was released. The release was announced in the Usenet group perl.perl5.porters by Sawyer X, one of the key Perl developers.
Compared with the previous stable release, 5.28.0, which came out about 11 months earlier, about 620,000 lines of code changed across 1,300 files, with 58 authors taking part in development. Changes to the source code itself (only .pm, .t, .c, and .h files) are estimated at about 510,000 lines in 750 files.
Development of the next branch, 5.31, is now open. The next stable release is scheduled for May 2020.
Key changes:
The Perl API C functions sv_utf8_downgrade and sv_utf8_decode are no longer considered experimental.
Experimental support for variable-length lookbehind assertions has been implemented, for example "(?<=foo?)" and "(?<!ba{1,9}r)" (previously these caused an error).
The maximum value of the upper limit ("n") in regular expression quantifiers of the form "{m,n}" has been increased from 32767 to 65534.
Unicode 12.1 is supported.
Limited support for wildcards in Unicode property value specifications has been added. For example, the expression "qr!\p{nv=/(?x) \A [0-5] \z/}!" matches all Unicode characters whose numeric value is between 0 and 5, including Thai or Bengali digits.
Support for qr'\N{name}' has been implemented (named characters inside regular expressions delimited by single quotes; previously such a regex resulted in an error).
Perl can now be compiled to use only thread-safe locale operations (-Accflags='-DUSE_THREAD_SAFE_LOCALE').
Combining the flags "-Dv" (verbose debugging output) and "-Dr" (regular expression debugging) now enables all possible regular expression debugging modes.
pack() is now protected against returning malformed Unicode (UTF-8) sequences.
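Two of the regular expression changes above can be tried with a short script. This is only a sketch that assumes Perl 5.30 or later; the sample strings and variable names are invented for illustration, and the experimental::vlb warning category is switched off because variable-length lookbehind is still experimental in this release.

```perl
use v5.30;
use warnings;
no warnings 'experimental::vlb';   # variable-length lookbehind is experimental in 5.30

# Positive lookbehind "(?<=foo?)": a digit preceded by "fo" or "foo".
my @after_foo = "fo1 foo2 bar3" =~ /(?<=foo?)(\d)/g;

# Negative lookbehind "(?<!ba{1,9}r)": an "X" that is NOT preceded by
# "b", one to nine "a" characters, and "r".
my $sample = "barX baaarX carX";
my @x_offsets;
while ($sample =~ /(?<!ba{1,9}r)X/g) {
    push @x_offsets, $-[0];        # offset at which this match starts
}

# Named character inside a single-quoted pattern (an error before 5.30).
my $e_acute  = qr'\N{LATIN SMALL LETTER E WITH ACUTE}';
my $named_ok = "caf\x{e9}" =~ $e_acute ? 1 : 0;

say "digits after fo/foo: @after_foo";
say "offsets of matching X: @x_offsets";
say "named character matched: $named_ok";
```

On older Perl versions the script stops at "use v5.30", which is intentional: both patterns were compile-time errors before this release.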
Removed features and incompatible changes:
Assigning a non-zero value to the special variable $[ (the index of the first element of an array) is now a fatal error.
Only graphemes are now allowed as delimiters of strings and patterns (characters that are not standalone graphemes, such as combining Unicode characters, are prohibited).
Some previously deprecated uses of an unescaped left brace "{" in regular expression patterns are now prohibited.
Calling sysread(), syswrite(), send(), or recv() on a handle that has the :utf8 layer is now a fatal error.
Using "my" in a statement with an always-false condition (for example, "my $x if 0") is now forbidden.
Removed support for the special variable $* (multi-line matching). The correct alternatives are the "/s" and "/m" modifiers.
Removed support for the special variable $# (output format for numbers).
The function name dump() must now be explicitly qualified (CORE::dump()).
Removed the File::Glob::glob function (use File::Glob::bsd_glob instead).
Support for certain macros that operate on UTF-8 in XS code (C blocks) was also scheduled for removal, but this was postponed to version 5.32.
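For code that relied on some of the removed features, the replacements might look roughly like this. It is a sketch rather than an official migration recipe: the sample text, the next_id helper, and the file names are hypothetical, and "state" is shown as one common substitute for the "my $x if 0" idiom, which the list above does not itself prescribe.

```perl
use strict;
use warnings;
use feature 'state';
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Temp qw(tempdir);

# Instead of $*: ask for multi-line behaviour per pattern.
my $text = "first line\nsecond line\n";
my @line_starts   = $text =~ /^(\w+)/mg;                  # /m: ^ matches after each newline
my $spans_newline = $text =~ /first.*second/s ? 1 : 0;    # /s: . also matches "\n"

# Instead of "my $x if 0" for a variable that keeps its value between calls.
sub next_id {
    state $counter = 0;     # initialised only once
    return ++$counter;
}
my @ids = (next_id(), next_id(), next_id());

# Instead of File::Glob::glob: call bsd_glob.
# A scratch directory with made-up file names keeps the example self-contained.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(A.pm B.pm notes.txt)) {
    open my $fh, '>', "$dir/$name" or die "cannot create $name: $!";
    close $fh;
}
my @modules = map { basename($_) } bsd_glob("$dir/*.pm");

print "line starts: @line_starts\n";
print "ids: @ids\n";
print "modules: @modules\n";
```

None of these replacements require 5.30, so they can be applied before upgrading.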
Performance improvements:
Decoding UTF-8 into code points is now implemented as a finite state machine, which, among other benefits, improves performance: for example, ord("\x7fff") now requires 12% fewer instructions. Validation of UTF-8 character sequences is also implemented as a finite state machine and is faster.
Recursive calls have been removed from finalize_op().
Minor optimizations have been made to the code for character folding and character classes in regular expressions.
Conversions from signed to unsigned integers (IV to UV) have been optimized.
The algorithm for converting integers to strings has been sped up by processing two digits at a time instead of one.
Improvements have been made based on the results of analysis by LGTM.
The code in the files regcomp.c, regcomp.h, and regexec.c has been optimized.
Matching of regular expressions such as "qr/[^a]/" is significantly faster when "a" is an ASCII character (a non-ASCII "a" may also see a speedup, but only under certain conditions).
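The idea behind converting integers two digits at a time can be illustrated with a small sketch. This is not Perl's actual implementation, which is written in C inside the interpreter; the uint_to_string function and the @PAIRS table are hypothetical names, and the input is assumed to be a non-negative integer that fits in a native integer.

```perl
use strict;
use warnings;

# Lookup table with every two-digit group: "00", "01", ..., "99".
my @PAIRS = map { sprintf '%02d', $_ } 0 .. 99;

sub uint_to_string {
    my ($n) = @_;           # assumed: non-negative native integer
    use integer;            # integer division and modulus in this scope
    my $out = '';
    # Peel off two decimal digits per iteration instead of one.
    while ($n >= 100) {
        $out = $PAIRS[$n % 100] . $out;
        $n   = $n / 100;
    }
    # One or two digits remain; a single digit must not get a leading zero.
    $out = ($n < 10 ? chr(ord('0') + $n) : $PAIRS[$n]) . $out;
    return $out;
}

print uint_to_string($_), "\n" for 0, 7, 100, 1009, 12345;
```

Each loop iteration performs one division and one table lookup for two output digits, which roughly halves the number of divisions compared with a digit-by-digit loop.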