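As an example, take a string containing a unit of measurement (Н, м, кг and с are the Russian symbols for N, m, kg and s):
s = "Н*м^2/(кг*с^2)"
The goal is to turn it into a list of tuples:
res = [('Н',1.0), ('м',2.0), ('кг',-1.0), ('с',-2.0)]
Replacing division in s by multiplication, expanding the brackets and writing out the exponents of the units explicitly, we get: N * m ^ 2 / (kg * s ^ 2) = N ^ 1 * m ^ 2 * kg ^ -1 * s ^ -2. Each tuple in res contains the name of a unit of measurement and the power to which it must be raised. Between the tuples you can mentally put multiplication signs.
from pyparsing import *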
ph_unit = Word(alphas, alphas+'.')
Here the Word class takes two arguments. The first argument defines what the first character of the word may be; the second defines which other characters the word may contain. A unit of measurement necessarily begins with a letter, so the first argument is alphas. In addition to letters, a unit of measurement can contain a period (for example, мм.рт.ст, the Russian abbreviation for millimetres of mercury), so the second argument for Word is alphas + '.'.
However, alphas does not mean letters in general, only letters of the English alphabet, so we add the Russian letters ourselves:
rus_alphas = 'йцукенгшщзхъфывапролджэячсмитьбюЙЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮ'
ph_unit = Word(alphas+rus_alphas, alphas+rus_alphas+'.')
ph_unit.parseString("мм").asList() # Output: ['\xd0\xbc\xd0\xbc']
The Cyrillic string is printed as escaped bytes, which is hard to read. This is how the bprint (better print) function works:
def bprint(obj):
    print(obj.__repr__().decode('string_escape'))
bprint(ph_unit.parseString("мм").asList()) # Output: ['мм']
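The decode('string_escape') trick is specific to Python 2. On Python 3, strings are Unicode and a list of Cyrillic strings prints readably without a helper. The following is a minimal sketch under that assumption (Python 3 with pyparsing installed); cyr_alphas is a hypothetical stand-in for rus_alphas, built from the Russian alphabet.

```python
from pyparsing import Word, alphas

# Hypothetical stand-in for rus_alphas: lower- and upper-case Russian letters.
cyr_lower = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
cyr_alphas = cyr_lower + cyr_lower.upper()

# First character: a Latin or Russian letter; later characters may also be periods.
ph_unit = Word(alphas + cyr_alphas, alphas + cyr_alphas + ".")

parsed_mm = ph_unit.parseString("мм").asList()
parsed_mmhg = ph_unit.parseString("мм.рт.ст").asList()

print(parsed_mm)
print(parsed_mmhg)
```

Because the period is allowed only in the second argument, a unit such as мм.рт.ст is read as one token, while a string starting with a period would not match.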
Now let's parse a number, for example:
test_num = "-123.456e-3"
An integer is a word made of digits:
int_num = Word(nums)
A number may be preceded by a plus or minus sign. The plus sign does not need to appear in the result, so we drop it with Suppress():
pm_sign = Optional(Suppress("+") | Literal("-"))
Literal() means an exact match to the text string. Thus, the expression for pm_sign means: look for an optional + symbol, which is not output to the parsing result, or an optional minus symbol.
float_num = pm_sign + int_num + Optional('.' + int_num) + Optional('e' + pm_sign + int_num)
float_num.parseString('-123.456e-3').asList() # Output: ['-', '123', '.', '456', 'e', '-', '3']
The number comes back split into separate tokens. To merge them into a single string we use Combine():
float_num = Combine(pm_sign + int_num + Optional('.' + int_num) + Optional('e' + pm_sign + int_num))
float_num.parseString('-123.456e-3').asList() # Output: ['-123.456e-3']
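To get a feel for what this grammar accepts, it can be run on a few inputs. This is a small sketch, assuming Python 3 and pyparsing; the helper parse_number and the test strings are illustrative and not part of the article.

```python
from pyparsing import Combine, Literal, Optional, ParseException, Suppress, Word, nums

int_num = Word(nums)
pm_sign = Optional(Suppress("+") | Literal("-"))
float_num = Combine(
    pm_sign + int_num + Optional("." + int_num) + Optional("e" + pm_sign + int_num)
)


def parse_number(text):
    """Return the combined token, or None if the text does not start with a number."""
    try:
        return float_num.parseString(text).asList()[0]
    except ParseException:
        return None


# The leading "+" is suppressed; ".5" has no integer part, so it is rejected.
results = {text: parse_number(text) for text in ["-123.456e-3", "+7", "42", ".5"]}
print(results)
```

The grammar requires digits before the decimal point, so inputs such as ".5" would need an extra alternative if they should be supported.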
To convert the string into a number we use setParseAction():
float_num = Combine(pm_sign + int_num + Optional('.' + int_num) + Optional('e' + pm_sign + int_num)).setParseAction(lambda t: float(t.asList()[0]))
The parse action is an anonymous lambda function whose argument is t. First we get the result as a list (t.asList()). Since the resulting list has only one element, we can extract it immediately: t.asList()[0]. The float() function converts text to a floating-point number. If you work in Sage, you can replace float with RR, the constructor of the Sage real number class.
Now we can describe a unit of measurement with an optional exponent:
single_unit = ph_unit + Optional('^' + float_num)
bprint(single_unit.parseString("м^2").asList()) # Output: ['м', '^', 2.0]
To remove the ^ sign from the result we use Suppress(), and to convert the list into a tuple we use setParseAction():
single_unit = (ph_unit + Optional(Suppress('^') + float_num)).setParseAction(lambda t: tuple(t.asList()))
bprint(single_unit.parseString("м^2").asList()) # Output: [('м', 2.0)]
Units of measurement can also be written in brackets, including nested ones (for example, "(м^2/ (с^2 * кг))"). The possibility of nesting bracketed expressions inside one another is a source of recursion. Let's write this in pyparsing:
unit_expr = Suppress('(') + single_unit + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr))) + Suppress(")")
Optional contains the part of the string that may or may not be present. OneOrMore contains the part of the string that must appear in the text at least once. Here OneOrMore contains two "terms": first we look for a multiplication or division sign, then a unit of measurement or a nested expression.
This definition of unit_expr cannot be left as it is: unit_expr appears both to the left and to the right of the equals sign, which clearly indicates recursion. The problem is solved very simply: change the assignment sign to <<, and on the line before unit_expr assign an instance of the special class Forward():
unit_expr = Forward()
unit_expr << Suppress('(') + single_unit + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr))) + Suppress(")")
Thus, a recursive expression is defined with << and declared as an instance of the Forward() class on the line above.
bprint(unit_expr.parseString("(Н*м/с^2)").asList()) # Output: [('Н',), '*', ('м',), '/', ('с', 2.0)]
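The Forward() pattern can be tried on deeper nesting with a stripped-down grammar. This is a sketch assuming Python 3 and pyparsing; unit is a simplified stand-in for single_unit (Latin letters only, no exponent), and the `<<=` operator is used in place of `<<` because pyparsing also supports it and it avoids operator-precedence surprises.

```python
from pyparsing import Forward, Literal, OneOrMore, Optional, Suppress, Word, alphas

unit = Word(alphas)                    # simplified stand-in for single_unit
op = Literal("*") | Literal("/")

expr = Forward()                       # declared first so it can refer to itself
expr <<= Suppress("(") + unit + Optional(OneOrMore(op + (unit | expr))) + Suppress(")")

# Three levels of brackets; without Group() the result is one flat list.
flat = expr.parseString("(N*m/(kg*(s*s)))").asList()
print(flat)
```

Each time the parser meets an opening bracket where a unit is expected, it re-enters expr, which is why no limit on nesting depth has to be written into the grammar.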
The complete expression for a unit of measurement:
parse_unit = (unit_expr | single_unit) + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr)))
The expression has the form (a | b) + (c | d). The brackets are required and play the same role as in mathematics. Using parentheses, we indicate that the first term must be unit_expr or single_unit, and the second term is an optional expression. If you remove the brackets, then, because + binds more tightly than |, parse_unit becomes unit_expr or (single_unit + an optional expression), which is not what we intended. The same reasoning applies to the expression inside Optional().
The code so far:
from pyparsing import *
rus_alphas = 'йцукенгшщзхъфывапролджэячсмитьбюЙЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮ'
ph_unit = Word(rus_alphas+alphas, rus_alphas+alphas+'.')
int_num = Word(nums)
pm_sign = Optional(Suppress("+") | Literal("-"))
float_num = Combine(pm_sign + int_num + Optional('.' + int_num) + Optional('e' + pm_sign + int_num)).setParseAction(lambda t: float(t.asList()[0]))
single_unit = (ph_unit + Optional(Suppress('^') + float_num)).setParseAction(lambda t: tuple(t.asList()))
unit_expr = Forward()
unit_expr << Suppress('(') + single_unit + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr))) + Suppress(")")
parse_unit = (unit_expr | single_unit) + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr)))
print(s) # s = "Н*м^2/(кг*с^2)"
bprint(parse_unit.parseString(s).asList()) # Output: [('Н',), '*', ('м', 2.0), '/', ('кг',), '*', ('с', 2.0)]
The brackets have disappeared from the result, so the nesting is lost. To keep the bracketed part as a nested list we use Group(), which we apply to unit_expr:
unit_expr = Forward()
unit_expr << Group(Suppress('(') + single_unit + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr))) + Suppress(")"))
bprint(parse_unit.parseString(s).asList()) # Output: [('Н',), '*', ('м', 2.0), '/', [('кг',), '*', ('с', 2.0)]]
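The effect of Group() is easiest to see with more than one level of brackets. A sketch assuming Python 3 and pyparsing, again with a simplified stand-in unit in place of single_unit and `<<=` in place of `<<`:

```python
from pyparsing import Forward, Group, Literal, OneOrMore, Optional, Suppress, Word, alphas

unit = Word(alphas)                    # simplified stand-in for single_unit
op = Literal("*") | Literal("/")

expr = Forward()
# Group() wraps everything matched between one pair of brackets in its own list.
expr <<= Group(Suppress("(") + unit + Optional(OneOrMore(op + (unit | expr))) + Suppress(")"))

top = (expr | unit) + Optional(OneOrMore(op + (unit | expr)))

nested = top.parseString("N*m/(kg*(s*s))").asList()
print(nested)
```

Every pair of brackets in the input becomes one nested list in the output, so the structure needed to decide which units a division sign applies to is preserved.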
Units without an explicit exponent still produce one-element tuples. To fix this we use named results: we name the unit 'unit_name' and its exponent 'unit_degree'. In setParseAction() we write an anonymous lambda function that puts 1 where the user did not specify the exponent of the unit. In pyparsing:
single_unit = (ph_unit('unit_name') + Optional(Suppress('^') + float_num('unit_degree'))).setParseAction(lambda t: (t.unit_name, float(1) if t.unit_degree == "" else t.unit_degree))
bprint(parse_unit.parseString(s).asList()) # Output: [('Н', 1.0), '*', ('м', 2.0), '/', [('кг', 1.0), '*', ('с', 2.0)]]
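The default-exponent logic relies on a missing named result comparing equal to an empty string. A reduced sketch, assuming Python 3 and pyparsing; unit_name and unit_power are simplified stand-ins for ph_unit and float_num (Latin letters and unsigned integers only):

```python
from pyparsing import Optional, Suppress, Word, alphas, nums

unit_name = Word(alphas)                                          # stand-in for ph_unit
unit_power = Word(nums).setParseAction(lambda t: float(t[0]))     # stand-in for float_num

single_unit = (
    unit_name("unit_name") + Optional(Suppress("^") + unit_power("unit_degree"))
).setParseAction(
    # t.unit_degree is "" when the optional exponent was not present in the input.
    lambda t: (t.unit_name, 1.0 if t.unit_degree == "" else t.unit_degree)
)

with_power = single_unit.parseString("m^2").asList()
without_power = single_unit.parseString("kg").asList()
print(with_power, without_power)
```

Both cases now yield a two-element tuple, which is what the later flattening step expects.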
Instead of float(1) it would be possible to write just 1.0, but in Sage you would then get not the float type but Sage's own type for real numbers.
It remains to get rid of the * and / signs and of the nesting. For this we write the function transform_unit(), which we will use in setParseAction() for parse_unit:
def transform_unit(unit_list, k=1):
    res = []
    for v in unit_list:
        if isinstance(v, tuple):
            res.append((v[0], v[1]*k))
        elif v == "/":
            k = -k
        elif isinstance(v, list):
            res += transform_unit(v, k=k)
    return res
parse_unit = ((unit_expr | single_unit) + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr)))).setParseAction(lambda t: transform_unit(t.asList()))
bprint(parse_unit.parseString(s).asList()) # Output: [('Н', 1.0), ('м', 2.0), ('кг', -1.0), ('с', -2.0)]
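transform_unit() can be exercised without the parser by feeding it a hand-written nested list. The sketch below copies the function from the text and compares it with flatten_units, a hypothetical variant that is not from the article: it applies "/" only to the operand that immediately follows it. The two agree on the article's example, but differ on a chain such as m / s * kg, where the usual left-to-right reading gives m * kg / s.

```python
def transform_unit(unit_list, k=1):
    """Function from the text: "/" flips the sign for everything that follows it."""
    res = []
    for v in unit_list:
        if isinstance(v, tuple):
            res.append((v[0], v[1] * k))
        elif v == "/":
            k = -k
        elif isinstance(v, list):
            res += transform_unit(v, k=k)
    return res


def flatten_units(unit_list, k=1):
    """Hypothetical variant: "/" inverts only the next operand (left-to-right rule)."""
    res = []
    sign = k
    for v in unit_list:
        if v == "/":
            sign = -k
        elif v == "*":
            sign = k
        elif isinstance(v, tuple):
            res.append((v[0], v[1] * sign))
            sign = k
        elif isinstance(v, list):
            res += flatten_units(v, k=sign)
            sign = k
    return res


nested = [("N", 1.0), "*", ("m", 2.0), "/", [("kg", 1.0), "*", ("s", 2.0)]]
chain = [("m", 1.0), "/", ("s", 1.0), "*", ("kg", 1.0)]   # m / s * kg

same_on_nested = transform_unit(nested) == flatten_units(nested)
original_chain = transform_unit(chain)
variant_chain = flatten_units(chain)
print(same_on_nested, original_chain, variant_chain)
```

If inputs without brackets after a division sign are expected, the choice between the two behaviours matters; for inputs shaped like the article's example it does not.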
The transform_unit() function removes nesting. During the conversion all brackets are expanded. If there is a division sign in front of a bracket, the sign of the exponents of the units inside the brackets is reversed.
Next we define a database of units of measurement, grouped by physical quantity, and the set of all known unit names:
unit_db = {'Длина':{'м':1, 'дм':1/10, 'см':1/100, 'мм':1/1000, 'км':1000, 'мкм':1/1000000}, 'Сила':{'Н':1}, 'Мощность':{'Вт':1, 'кВт':1000}, 'Время':{'с':1}, 'Масса':{'кг':1, 'г':0.001}}
unit_set = set([t for vals in unit_db.values() for t in vals])
We write the check_unit function, which checks the unit of measurement, and attach it to ph_unit via setParseAction:
def check_unit(unit_name):
    if unit_name not in unit_set:
        raise ValueError("Invalid unit of measurement: " + unit_name)
    return unit_name
ph_unit = Word(rus_alphas+alphas, rus_alphas+alphas+'.').setParseAction(lambda t: check_unit(t.asList()[0]))
ph_unit.parseString("дюйм") # Output: Error in lines 1-1 Traceback (most recent call last): … File "", line 1, in <lambda> File "", line 3, in check_unit ValueError: Invalid unit of measurement: дюйм
Finally, in "from pyparsing import *" it is better to replace * with the classes actually used. The complete code:
from pyparsing import nums, alphas, Word, Literal, Optional, Combine, Forward, Group, Suppress, OneOrMore

def bprint(obj):
    print(obj.__repr__().decode('string_escape'))

# Database of units of measurement
unit_db = {'Длина':{'м':1, 'дм':1/10, 'см':1/100, 'мм':1/1000, 'км':1000, 'мкм':1/1000000}, 'Сила':{'Н':1}, 'Мощность':{'Вт':1, 'кВт':1000}, 'Время':{'с':1}, 'Масса':{'кг':1, 'г':0.001}}
unit_set = set([t for vals in unit_db.values() for t in vals])

# Name of a unit of measurement
rus_alphas = 'йцукенгшщзхъфывапролджэячсмитьбюЙЦУКЕНГШЩЗХЪФЫВАПРОЛДЖЭЯЧСМИТЬБЮ'
def check_unit(unit_name):
    """Check that the unit of measurement is in the database."""
    if unit_name not in unit_set:
        raise ValueError("Invalid unit of measurement: " + unit_name)
    return unit_name
ph_unit = Word(rus_alphas+alphas, rus_alphas+alphas+'.').setParseAction(lambda t: check_unit(t.asList()[0]))

# Number
int_num = Word(nums)
pm_sign = Optional(Suppress("+") | Literal("-"))
float_num = Combine(pm_sign + int_num + Optional('.' + int_num) + Optional('e' + pm_sign + int_num)).setParseAction(lambda t: float(t.asList()[0]))

# Unit of measurement with an optional exponent
single_unit = (ph_unit('unit_name') + Optional(Suppress('^') + float_num('unit_degree'))).setParseAction(lambda t: (t.unit_name, float(1) if t.unit_degree == "" else t.unit_degree))

# Bracketed (possibly nested) expression
unit_expr = Forward()
unit_expr << Group(Suppress('(') + single_unit + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr))) + Suppress(")"))

# Complete expression for a unit of measurement
def transform_unit(unit_list, k=1):
    """Remove nesting, expand the brackets and drop the * and / signs."""
    res = []
    for v in unit_list:
        if isinstance(v, tuple):
            res.append((v[0], v[1]*k))
        elif v == "/":
            k = -k
        elif isinstance(v, list):
            res += transform_unit(v, k=k)
    return res
parse_unit = ((unit_expr | single_unit) + Optional(OneOrMore((Literal("*") | Literal("/")) + (single_unit | unit_expr)))).setParseAction(lambda t: transform_unit(t.asList()))

# Test
s = "Н*м^2/(кг*с^2)"
bprint(parse_unit.parseString(s).asList())
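The validation step can also be tried in isolation, without pyparsing. This is a minimal sketch in Python 3; the database below is a reduced, hypothetical stand-in for unit_db with English names, and the error text is illustrative. The factors are written as floats because in plain Python 2 (outside Sage) an expression like 1/10 is integer division and evaluates to 0.

```python
# Reduced, hypothetical stand-in for unit_db (names are illustrative).
unit_db = {
    "length": {"m": 1.0, "cm": 1.0 / 100, "km": 1000.0},
    "time": {"s": 1.0},
    "mass": {"kg": 1.0, "g": 0.001},
}

# Every known unit name, regardless of the physical quantity it belongs to.
unit_set = {name for units in unit_db.values() for name in units}


def check_unit(unit_name):
    """Return the name unchanged if it is known, otherwise raise ValueError."""
    if unit_name not in unit_set:
        raise ValueError("Unknown unit of measurement: " + unit_name)
    return unit_name


accepted = check_unit("km")
try:
    check_unit("inch")
    rejected = None
except ValueError as err:
    rejected = str(err)

print(accepted, rejected)
```

Returning the name from check_unit matters when it is used as a parse action: the returned value replaces the matched token, so the rest of the grammar still receives the unit name.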
Source: https://habr.com/ru/post/241670/