Global <skip> directive is quite literal
yary
2013-03-31 15:25:49 UTC
I've been trying to get a global <skip> directive (one placed before all
rule definitions) to work, and it seems its pattern must not be quoted or
contain any spaces. That seems to run counter to the Parse::RecDescent
documentation. Code follows; I would expect each variant to
parse, but only the first 2 are OK, and the last crashes with
"Internal error in generated parser code!". Is that intended, or
should I file a bug report?

-y

#!/usr/bin/env perl
use warnings;
use strict;
use Parse::RecDescent;

$::RD_WARN=1;

my $plain_grammar='words : (/\w+/)(s) /\Z/';

sub try_grammar {
    my ($name, $trace, $definition)=@_;
    print $name,": $definition = ";
    print $::RD_HINT=$::RD_TRACE=$trace if defined $trace;
    my $grammar = Parse::RecDescent->new("$definition\n$plain_grammar");
    print defined $grammar->words("Please work") ? "OK" : "Didn't parse";
    print "\n";
    # undef takes a single argument, so reset each variable separately
    undef $::RD_HINT;
    undef $::RD_TRACE;
}

try_grammar('no skip directive',undef,'');
try_grammar('bare skip',undef,"<skip:\\s*>");
try_grammar('bare skip, with a preceding space',undef,"<skip: \\s*>");
try_grammar('bare skip, plus sign',undef,"<skip:\\s+>");
try_grammar('qr skip',undef,"<skip: qr/\\s*/>");
try_grammar('apostrophe skip',1,"<skip: '\\s*'>");
j***@cpan.org
2013-04-02 03:03:56 UTC
Interesting. The global and in-rule skip directives are handled a bit
differently. The in-rule directives essentially go through an eval
step, where the text after the ':' and before the terminating '>' is
placed literally into the generated parser code as:
$code .= '$skip =' . $1;
So whatever whitespace and quoting structures you used get translated
directly into the global eval() of the generated parser.

The global skip directive feeds the extracted skip directive text into a
variable that is then put into the parser code as something like:
$code .= '$skip = \'$1\'';

Later the internal $skip variable is used in a regex something like:
if ($text =~ s/\A($skip)//) { SkipMatched(); ...}

So in instances where the skip directive contents could be quoted via
single quotes into something that would work as a variable interpolated
into a regex, the global directive works. Otherwise, you get odd
behavior, as you noticed.
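
To make the difference concrete, here is a rough core-Perl simulation of the two styles described above. It does not use Parse::RecDescent and is not the module's actual generated code; the helper names build_skip and first_word_after_skip are made up for illustration, and the "in-rule" and "global" branches only mimic the two code snippets quoted above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the two code-generation styles:
#   'in-rule' pastes the directive text in as Perl code,
#   'global'  wraps the directive text in single quotes first.
sub build_skip {
    my ($style, $directive_text) = @_;
    my $code = $style eq 'global'
        ? "\$skip = '$directive_text';"
        : "\$skip = $directive_text;";
    my $skip;
    my $ok = eval "$code 1";    # a syntax error leaves $ok undefined
    return $ok ? $skip : undef;
}

# Mimics "match $skip, then match a terminal" for the terminal /\w+/.
sub first_word_after_skip {
    my ($skip, $text) = @_;
    return unless defined $skip;
    return unless $text =~ s/\A($skip)//;
    return $text =~ /\A(\w+)/ ? $1 : undef;
}

for my $case (
    [ 'global',  '\s*'       ],
    [ 'global',  ' \s*'      ],
    [ 'global',  'qr/\s*/'   ],
    [ 'global',  q{'\s*'}    ],
    [ 'in-rule', 'qr/\s*/'   ],
    [ 'in-rule', q{'\s*'}    ],
) {
    my ($style, $directive_text) = @$case;
    my $skip = build_skip($style, $directive_text);
    my $word = first_word_after_skip($skip, "Please work");
    printf "%-8s <skip:%s> => %s\n", $style, $directive_text,
        defined $skip ? ($word // 'no match') : 'generated code failed to compile';
}
```

In this simulation the single-quoted wrapping turns qr/\s*/ into a literal pattern and turns '\s*' into invalid Perl, which lines up with the failures in the report.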

The correct thing to do is for P::RD to treat the global skip the same
as it does everything else, and require that you quote the contents of
the skip directive, and eval the result in the generated parser. I've
made these changes here:
https://github.com/jtbraun/Parse-RecDescent/tree/global_skip

I would appreciate it if you'd give them a try before I merge them
back into the main line and push an update to PAUSE (and a bug report
would be appreciated).

Additionally, one of your test cases will fail unexpectedly:

try_grammar('bare skip, plus sign',undef,"<skip:\\s+>");

The generated parser always attempts to match $skip before a terminal,
including the first one. There's no whitespace at the beginning of your
string, which will lead to a failure to parse.
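
A small core-Perl sketch of that point, using the same substitution form shown earlier. The helper name strip_skip is made up for illustration; it is not part of Parse::RecDescent.

```perl
use strict;
use warnings;

# Returns the text with the leading skip pattern removed,
# or undef when the skip pattern does not match at the start.
sub strip_skip {
    my ($skip, $text) = @_;
    return $text =~ s/\A($skip)// ? $text : undef;
}

for my $case (
    [ '\s+', "Please work"   ],
    [ '\s+', "  Please work" ],
    [ '\s*', "Please work"   ],
) {
    my ($skip, $text) = @$case;
    my $rest = strip_skip($skip, $text);
    printf "skip %s on [%s]: %s\n", $skip, $text,
        defined $rest ? "ok, rest is [$rest]" : "skip did not match";
}
```

With \s+ the skip step requires at least one whitespace character, so it cannot succeed at the very start of "Please work", while \s* can match the empty string there.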

Thanks for the report,

Jeremy