Rule0 - simply answers the question "did it match?" without changing the contents of the stack.
Rule1 - pushes one object onto the stack of values.
Rule2 - pushes two objects onto the stack of values.
RuleN - pushes N objects onto the stack of values using the semantics of the Shapeless library. You do not need to know Shapeless to work with Parboiled2 (although it will be useful).
PopRule - retrieves values from the stack without placing new values there.
type Rule2[+A, +B] = RuleN[A :: B :: HNil]
The "Rule7 problem": the Rule8 class no longer exists, and putting eight items on the stack of values does not work, even if you really want to. There are various ways to work around this problem, and I'll tell you about one of them in the next article.
The capture function is used for this: it matches the input against the rule and, if successful, pushes the matched text onto the stack of values.
Take a rule of type Rule0 from which we want to get at least something:
def User: Rule0 = rule { FirstName ~ Separator ~ LastName }
def User: Rule2[String, String] = rule { capture(FirstName) ~ Separator ~ capture(LastName) }
The type of the rule is now not Rule0 but Rule2, since it captures two strings and pushes them onto the stack of values. The type does not have to be specified, however: the compiler will infer it on its own.
The ~> operator works with the values pushed onto the stack by the capture function. Depending on the type of the value the lambda returns, different forms of the ~> operator are used, which makes this operator simple and intuitive to use.
In Parboiled1, the capture was performed implicitly, which I find very inconvenient.
def UnsignedInteger: Rule1[Int] = rule { capture(Digit.+) ~> (numStr => numStr.toInt) }
def UnsignedInteger: Rule1[Int] = rule { capture(Digit.+) ~> (_.toInt) }
In both cases the lambda has the type (String => Int), which determines the type of our rule: Rule1[Int]. The ~> operator can also be applied to a typed rule. For example, the following rule matches an integer but pushes twice its value onto the stack:
def TwoTimesLarger = rule { UnsignedInteger ~> (i => i * 2) }
The type of the TwoTimesLarger rule remains Rule1[Int]; only the value on the stack is different.
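To see these two rules in action, here is a minimal sketch of a complete parser. It assumes the parboiled2 library is on the classpath; the class name NumberParser, the wrapper rule Input and the object NumberParserDemo are hypothetical names introduced only for this illustration, and Digit is replaced by the library's built-in CharPredicate.Digit.

```scala
import org.parboiled2._

// Assumption: parboiled2 is available as a dependency.
// NumberParser, Input and NumberParserDemo are illustrative names, not from the article.
class NumberParser(val input: ParserInput) extends Parser {

  // Captures one or more digits as a String and converts it to Int.
  def UnsignedInteger: Rule1[Int] = rule {
    capture(oneOrMore(CharPredicate.Digit)) ~> (_.toInt)
  }

  // Applies ~> to an already typed rule: the type stays Rule1[Int].
  def TwoTimesLarger: Rule1[Int] = rule { UnsignedInteger ~> (i => i * 2) }

  // Requires the whole input to be consumed.
  def Input: Rule1[Int] = rule { TwoTimesLarger ~ EOI }
}

object NumberParserDemo {
  // With the default delivery scheme, run() wraps the result in a scala.util.Try.
  val result = new NumberParser("21").Input.run()
}
```

Because the rule is a Rule1[Int], the value left on the stack is handed back directly as an Int inside the Try, without any HList wrapping.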
Explicitly specifying the types of the arguments of lambda functions is not the best idea (at least at the time of this writing). The Scala compiler has a very unpleasant bug that will prevent such code from compiling properly.
def UserWithLambda: Rule2[String, String] = rule { capture(FirstName) ~ Separator ~ capture(LastName) ~> ((firstName, lastName) => ...) }
def UserName = rule { User ~> ((firstName, lastName) => s"$firstName $lastName") }
The type of the User rule was Rule2[String, String]; by applying the lambda function to it, we created a new UserName rule of type Rule1[String].
(foo: Rule2[Int, String]) ~> (_.toDouble) // foo: Rule2[Int, Double].
(foo: Rule0) ~> (() => 42) // foo: Rule1[Int].
(foo: Rule1[Event]) ~> (e => e::DateTime.now()::"localhost"::HNil) // foo: RuleN[Event::DateTime::String::HNil]
If the lambda returns an HList, the type of the resulting rule will be RuleN[Event::DateTime::String::HNil].
A lambda may also return Unit. The type of the resulting rule, as you probably guessed, is Rule0:
(foo: Rule1[String]) ~> (println(_)) // foo: Rule0
case class Person(name: String, age: Int)
(foo: Rule2[String, Int]) ~> Person // foo: Rule1[Person]
This is a shorthand for ~> (Person(_, _)).
The ~> operator plays the role of ~~> from Parboiled1. There are other ways of applying ~>, but you will learn about them from the documentation rather than from me. I will only note that the ~> operator is implemented in the Parboiled2 code in a very non-trivial way, but however complicated its definition looks, it is a pleasure to use. It is perhaps the best technical decision made when the DSL was designed.
The run operator behaves exactly like ~>, with one small inconvenience: in the case of run the compiler does not infer the types automatically, so they must be specified explicitly. The operator is a very convenient tool for creating untestable side effects, for example:
def RuleWithSideEffect = rule { capture(EmailAddress) ~ run { address: String => send(address, subj, message) } ~ EOI }
The resulting rule has the type Rule0: the matched string is not needed by anyone afterwards and does not end up on the stack of values, which is sometimes exactly what is required. Parboiled1 users have probably noticed that in the context described above run behaves the same way as the ~% operator.
Warning: when using side effects, do not play around with the stack of values. Yes, you can get direct access to it, but for a number of reasons it is better not to.
The push function places data on the stack of values if the rule it belongs to matches. In practice I have not had to use it often, since most of the work can be done with the ~> operator, but there is one example in which push simply shines:
sealed trait Bool
case object True extends Bool
case object False extends Bool
def BoolMatch = rule { "true" ~ push(True) | "false" ~ push(False) }
Although this is not noted anywhere, push follows call-by-name semantics: its argument is re-evaluated every time the rule is executed. This usually hurts performance, so push is best used with constants, and only with constants.
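The cost of call-by-name evaluation can be illustrated in plain Scala, without Parboiled2. This is only an analogy under stated assumptions: ByNameDemo, expensiveValue and runThreeTimes are hypothetical names, and runThreeTimes merely stands in for a rule that is executed several times.

```scala
object ByNameDemo {
  // Counts how many times the argument expression has been evaluated.
  var evaluations = 0

  // Hypothetical "expensive" computation that records each evaluation.
  def expensiveValue(): Int = {
    evaluations += 1
    42
  }

  // `value` is a by-name parameter: the expression is evaluated again
  // on every use, not once at the call site.
  def runThreeTimes(value: => Int): Int = {
    var sum = 0
    for (_ <- 1 to 3) sum += value
    sum
  }

  // expensiveValue() is evaluated three times here, not once.
  val total = runThreeTimes(expensiveValue())
}
```

Passing a constant instead of expensiveValue() would make the repeated evaluation free, which is the reason for the advice above.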
As with run and ~>, the type of the value passed to push determines the contents of the stack and the type of the rule being created.
With capture and ~> we get a variable of string type as the parameter of the lambda function. After some operations on the string, we can feed it to a subparser, and so on. In practice I have not needed this, but you should know that the possibility exists.
sealed trait AstNode
case class KeyValueNode(key: String, value: String) extends AstNode
case class BlockNode(name: String, nodes: Seq[AstNode]) extends AstNode

sealed trait AstNode {
  def name: String
}
case class KeyValueNode(override val name: String, value: String) extends AstNode
case class BlockNode(override val name: String, nodes: Seq[AstNode]) extends AstNode
To build the nodes we will use the ~> operator. The capture will be done "on the spot" (in the rules for the key and the value). Let us start with the key:
def Key: Rule1[String] = rule { capture(oneOrMore(KeySymbol)) }
We wrap the rule in capture and that's it: Parboiled does the thinking for us, and the string is pushed onto the stack. Capturing values is more complicated, though. If we do the same thing as for the key, we will get a string together with its quotes. Do we need them? No, so we capture only the content of the string:
def QuotedString: Rule1[String] = rule { '"' ~ capture(QuotedStringContent) ~ '"' }
A capture needs to be done only once, and preferably in the rule where it belongs.
def KeyValuePair: Rule1[AstNode] = rule { Key ~ MayBeWS ~ "=" ~ MayBeWS ~ Value ~> KeyValueNode }
def Node: Rule1[AstNode] = rule { KeyValuePair | Block }
The Nodes rule requires no changes, apart from specifying the type of the value it places on the stack:
def Nodes: Rule1[Seq[AstNode]] = rule { MayBeWS ~ zeroOrMore(Node).separatedBy(NewLine ~ MayBeWS) ~ MayBeWS }
def BlockName: Rule1[String] = rule { capture(oneOrMore(BlockNameSymbol)) }
def Block: Rule1[AstNode] = rule { BlockName ~ MayBeWS ~ BlockBeginning ~ Nodes ~ BlockEnding ~> BlockNode }
def Root: Rule1[AstNode] = rule { Nodes ~ EOI ~> {nodes: Seq[AstNode] => BlockNode(RootNodeName, nodes)} }
The name of the root node can be anything, for example "$root", "!root!" or "%root%". Any of them will work. I prefer the empty string:
val RootNodeName = ""
/** Helpers for navigating the tree built by the parser. */
trait NodeAccessDsl { this: AstNode =>

  def isRoot = this.name == BkvParser.RootNodeName

  lazy val isBlockNode = this match {
    case _: KeyValueNode => false
    case _ => true
  }

  /** Key-value pairs that are direct children of this block. */
  def pairs: Seq[KeyValueNode] = this match {
    case BlockNode(_, nodes) => nodes collect { case node: KeyValueNode => node }
    case _ => Seq.empty
  }

  /** Blocks that are direct children of this block. */
  def blocks: Seq[BlockNode] = this match {
    case BlockNode(_, nodes) => nodes collect { case node: BlockNode => node }
    case _ => Seq.empty
  }

  /** The value, if this node is a key-value pair. */
  def getValue: Option[String] = this match {
    case KeyValueNode(_, value) => Some(value)
    case _ => None
  }
}
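The accessors above can be tried out on a hand-built tree, without running the parser at all. The following is a self-contained sketch under stated assumptions: AstNode is assumed to mix in NodeAccessDsl, the BkvParser object is a stand-in that holds only the RootNodeName constant, and the wrapper object NodeAccessDemo and the sample node names are hypothetical.

```scala
object NodeAccessDemo {
  // Stand-in for the real parser object; only the constant is needed here.
  object BkvParser { val RootNodeName = "" }

  trait NodeAccessDsl { this: AstNode =>
    def isRoot: Boolean = this.name == BkvParser.RootNodeName

    def pairs: Seq[KeyValueNode] = this match {
      case BlockNode(_, nodes) => nodes collect { case node: KeyValueNode => node }
      case _ => Seq.empty
    }

    def blocks: Seq[BlockNode] = this match {
      case BlockNode(_, nodes) => nodes collect { case node: BlockNode => node }
      case _ => Seq.empty
    }

    def getValue: Option[String] = this match {
      case KeyValueNode(_, value) => Some(value)
      case _ => None
    }
  }

  // Assumption: the AST nodes mix in the DSL trait.
  sealed trait AstNode extends NodeAccessDsl { def name: String }
  case class KeyValueNode(override val name: String, value: String) extends AstNode
  case class BlockNode(override val name: String, nodes: Seq[AstNode]) extends AstNode

  // A tree of the shape the Root rule would build: a root block
  // containing one nested block and one key-value pair.
  val root: AstNode = BlockNode(BkvParser.RootNodeName, Seq(
    BlockNode("server", Seq(KeyValueNode("host", "localhost"))),
    KeyValueNode("debug", "true")
  ))

  // Navigate: the first nested block, then the value of its first pair.
  val host: Option[String] = root.blocks.head.pairs.head.getValue
}
```

Calling pairs or blocks on a KeyValueNode simply returns an empty sequence, so chained navigation does not need explicit type checks.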
Source: https://habr.com/ru/post/270609/