syntax token

Documentation for syntax token assembled from the following types:

language documentation Grammars

From Grammars

(Grammars) declarator token

The main ingredient of grammars is named regexes. While the syntax of Raku Regexes is outside the scope of this document, named regexes have a special syntax, similar to subroutine definitions: [1]

my regex number { \d+ [ \. \d+ ]? }

In this case, we have to specify that the regex is lexically scoped using the my keyword, because named regexes are normally used within grammars.

Being named gives us the advantage of being able to easily reuse the regex elsewhere:

say so "32.51" ~~ &number;                         # OUTPUT: «True␤» 
say so "15 + 4.5" ~~ /<number>\s* '+' \s*<number>/ # OUTPUT: «True␤» 
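Because named regexes are normally used within grammars, the same regex can also be declared inside one, where no my is needed. The following is a minimal sketch; the grammar name Addition is an illustrative assumption, not something defined elsewhere in this document:

```raku
# Hypothetical grammar, used only to show a named regex in its usual home.
grammar Addition {
    # TOP is the regex that .parse starts from.
    regex TOP    { <number> \s* '+' \s* <number> }
    # Same pattern as the lexically scoped example, declared without `my`.
    regex number { \d+ [ \. \d+ ]? }
}

my $match = Addition.parse('15 + 4.5');
say so $match;            # OUTPUT: «True␤»
say ~$match<number>[1];   # OUTPUT: «4.5␤»
```

Since number is called twice in TOP, its captures are collected in a list, which is why the second operand is available as $match<number>[1].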

regex isn't the only declarator for named regexes. In fact, it's the least common. Most of the time, the token or rule declarators are used. These are both ratcheting, which means that the match engine won't back up and try again if it fails to match something. This will usually do what you want, but isn't appropriate for all cases:

my regex works-but-slow { .+ q }
my token fails-but-fast { .+ q }
my $s = 'Tokens won\'t backtrack, which makes them fail quicker!';
say so $s ~~ &works-but-slow; # OUTPUT: «True␤»
say so $s ~~ &fails-but-fast; # OUTPUT: «False␤»
                              # the entire string is taken by the .+
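One possible way to keep the speed of a token and still match is to write the pattern so that it never needs to backtrack. The sketch below assumes it is acceptable to replace .+ with a negated character class; the token name is hypothetical:

```raku
# Hypothetical token: <-[q]> matches any character except 'q',
# so the quantified term stops on its own right before the 'q'.
my token no-backtrack-needed { <-[q]>+ q }

my $s = 'Tokens won\'t backtrack, which makes them fail quicker!';
say so $s ~~ &no-backtrack-needed;   # OUTPUT: «True␤»
```

Here the first term can never consume the q, so the ratcheting engine has nothing to give back.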

Note that ratcheting applies to individual terms: once a term has matched, the engine never backtracks into it, as the example below shows. If the match then fails and another alternative was introduced by | or ||, that alternative is still tried.

my token tok-a { .* d  };
my token tok-b { .* d | bd };
say so "bd" ~~ &tok-a;        # OUTPUT: «False␤» 
say so "bd" ~~ &tok-b;        # OUTPUT: «True␤» 

Rules

The only difference between the token and rule declarators is that the rule declarator causes :sigspace to go into effect for the Regex:

my token non-space-y { 'once' 'upon' 'a' 'time' }
my rule space-y { 'once' 'upon' 'a' 'time' }
say so 'onceuponatime'    ~~ &non-space-y; # OUTPUT: «True␤»
say so 'once upon a time' ~~ &non-space-y; # OUTPUT: «False␤»
say so 'onceuponatime'    ~~ &space-y;     # OUTPUT: «False␤» 
say so 'once upon a time' ~~ &space-y;     # OUTPUT: «True␤»
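To make the effect of :sigspace more concrete, the behaviour of space-y can be approximated with a token that calls the built-in ws rule explicitly between the literals. This is a sketch of the idea rather than the exact expansion the compiler performs, and the token name is hypothetical:

```raku
# Hypothetical token: <.ws> is called by hand wherever the rule
# had whitespace between two atoms. The leading dot means the
# whitespace match is not captured.
my token explicit-ws { 'once' <.ws> 'upon' <.ws> 'a' <.ws> 'time' }

say so 'onceuponatime'       ~~ &explicit-ws;   # OUTPUT: «False␤»
say so 'once upon a time'    ~~ &explicit-ws;   # OUTPUT: «True␤»
say so 'once   upon  a time' ~~ &explicit-ws;   # OUTPUT: «True␤»
```

The default ws requires at least one whitespace character when it sits between two word characters, which is why the run-together string is rejected while any amount of whitespace between the words is accepted.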