Formal grammar

This document uses Pomsky syntax to describe Pomsky’s syntax. Here’s an incomplete summary, which is enough to read the grammar:

  • Variables are declared as let var_name = expression;. This means that var_name can be parsed by parsing expression.

  • Verbatim text is wrapped in double quotes ("") or single quotes ('').

  • A * after a rule indicates that it repeats 0 or more times.

  • A + after a rule indicates that it repeats 1 or more times.

  • A ? after a rule indicates that the rule is optional.

  • Rules can be grouped together by wrapping them in parentheses (()).

  • Alternative rules are each preceded by a vertical bar (|).
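
To see how these notation rules combine, here is a minimal Python sketch. The rules Digit and Number are made up for illustration and are not part of Pomsky's grammar, and the regular expression is a hand-written translation, not compiler output.

```python
import re

# Hypothetical rules written in the notation described above
# (not taken from Pomsky's grammar):
#
#   let Digit  = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9';
#   let Number = ('+' | '-')? Digit+ ('.' Digit+)?;
#
# Hand-written regex equivalent: ? marks an optional rule, + repeats
# one or more times, parentheses group, | separates alternatives.
NUMBER = re.compile(r"(?:\+|-)?[0-9]+(?:\.[0-9]+)?")


def is_number(text: str) -> bool:
    """Return True if the whole text can be parsed by the Number rule."""
    return NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["42", "-12.5", "1.", ""]:
        print(repr(sample), is_number(sample))
```

Reading the rule left to right: an optional sign, then at least one digit, then an optional group made of a dot and at least one more digit.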

Comments start with # and end at the end of the same line. Comments and whitespace are ignored; they can be added anywhere between tokens. Tokens are

  • identifiers (e.g. foo)
  • keywords and reserved words (e.g. lazy)
  • operators and punctuation (e.g. << or ;)
  • numbers (e.g. 30)
  • string literals (e.g. "foo")
  • codepoints

as documented here in detail.
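
The token categories above can be illustrated with a small tokenizer sketch in Python. This is a simplification for reading the grammar, not Pomsky's actual lexer: the keyword and operator sets are small illustrative subsets, string escapes are not handled, and the U+ form for codepoints is an assumption, since this page does not show one.

```python
import re

# Simplified token patterns (assumptions, not Pomsky's real lexer rules).
TOKEN_SPEC = [
    ("comment",    r"#[^\n]*"),              # from # to the end of the line
    ("whitespace", r"\s+"),
    ("codepoint",  r"U\+[0-9a-fA-F]+"),      # assumed form, e.g. U+1F4A9
    ("string",     r'"[^"]*"|\'[^\']*\''),   # escapes are not handled here
    ("number",     r"[0-9]+"),
    ("word",       r"[A-Za-z_][A-Za-z0-9_]*"),
    ("operator",   r"<<|>>|[;=|*+?()]"),     # illustrative subset
]
KEYWORDS = {"let", "lazy"}                   # illustrative subset
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into (kind, text) pairs, dropping comments and whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {source[pos]!r}")
        kind = match.lastgroup
        pos = match.end()
        if kind in ("comment", "whitespace"):
            continue  # ignored between tokens
        if kind == "word":
            kind = "keyword" if match.group() in KEYWORDS else "identifier"
        tokens.append((kind, match.group()))
    return tokens


if __name__ == "__main__":
    for token in tokenize("let foo = 'a'+ lazy; # a comment\n30"):
        print(token)
```

Because comments and whitespace are dropped before parsing, the grammar rules below never need to mention them.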

Even though this grammar is written using Pomsky syntax, it isn’t actually accepted by the Pomsky compiler, because it uses cyclic variables.

let Expression = Statement* Alternation;

See Alternation.

let Statement =
| LetDeclaration
| Modifier
| Test;

See LetDeclaration, Modifier, Test.

A FixExpression is an expression that can have a prefix or suffix.

let FixExpression =
| Lookaround
| Negation
| Repetition;

See Lookaround, Negation, Repetition.

let AtomExpression =
| String
| CodePoint
| Group
| CharacterSet
| InlineRegex
| Boundary
| Reference
| NumberRange
| Variable
| Dot
| Recursion;

See String, CodePoint, Group, CharacterSet, InlineRegex, Boundary, Reference, NumberRange, Variable, Dot, Recursion.