regex - Match the nth longest possible string in Perl
The pattern-matching quantifiers of a Perl regular expression are "greedy" (they match the longest possible string). To force the match to be "ungreedy", a ? can be appended to the pattern quantifier (*, +).

Here is an example:

    #!/usr/bin/perl
    $string = "111s11111s";

    #-- greedy match
    $string =~ /^(.*)s/;
    print "$1\n";    # prints 111s11111

    #-- ungreedy match
    $string =~ /^(.*?)s/;
    print "$1\n";    # prints 111

But how can one find the second, third, ... longest possible match in Perl? Feel free to make a simple example of your own if you need a better one.
You can utilize a conditional expression, a code expression, and backtracking control verbs.

    my $skips = 1;
    $string =~ /^(.*)s(?(?{$skips-- > 0})(*FAIL))/;

The above uses greedy matching, but causes the largest match to intentionally fail. If you wanted the 3rd largest, you would set the number of skips to 2.

This is demonstrated below:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = "111s11111s11111s";

    $string =~ /^(.*)s/;
    print "greedy match - $1\n";

    $string =~ /^(.*?)s/;
    print "ungreedy match - $1\n";

    # Skip the longest candidate once, so the second longest is returned
    my $skips = 1;
    $string =~ /^(.*)s(?(?{$skips-- > 0})(*FAIL))/;
    print "2nd greedy match - $1\n";

Outputs:

    greedy match - 111s11111s11111
    ungreedy match - 111
    2nd greedy match - 111s11111

When using such advanced features, it is important to have a full understanding of regular expressions in order to predict the results. This particular case works because the regex is fixed on one end with ^. That means we know that each subsequent match is shorter than the previous one. However, if both ends could shift, we could not predict the order.
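
The skip counter generalizes naturally to any n. As a minimal sketch, assuming the same anchored pattern as above, it could be wrapped in a helper; the name `nth_longest` and the choice to return undef when there are fewer than n candidates are assumptions for illustration, not part of the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the n-th longest prefix of $string that is
# immediately followed by "s", or undef if fewer than n candidates exist.
sub nth_longest {
    my ($string, $n) = @_;

    # A localized package variable, so the regex code block can see it
    # regardless of how the perl in use handles lexicals in code blocks.
    local our $skips = $n - 1;

    # Each time "s" matches after the capture, the condition runs: while
    # skips remain, (*FAIL) forces backtracking to the next shorter candidate.
    return $string =~ /^(.*)s(?(?{ $skips-- > 0 })(*FAIL))/ ? $1 : undef;
}

my $string = "111s11111s11111s";
print "1st: ", nth_longest($string, 1), "\n";    # prints 1st: 111s11111s11111
print "2nd: ", nth_longest($string, 2), "\n";    # prints 2nd: 111s11111
print "3rd: ", nth_longest($string, 3), "\n";    # prints 3rd: 111
```

Because the pattern is anchored with ^, running out of candidates simply makes the match fail, which the helper reports as undef.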

If both ends can shift, you can instead find all of the matches and sort them:

    use strict;
    use warnings;

    my $string = "111s11111s11111s";

    # Record every candidate, then force backtracking with (*FAIL)
    my @seqs;
    $string =~ /^(.*)s(?{push @seqs, $1})(*FAIL)/;

    my @sorted = sort {length $b <=> length $a} @seqs;

    use Data::Dump;
    dd @sorted;

Outputs:

    ("111s11111s11111", "111s11111", 111)

Note on Perl versions prior to v5.18

Perl v5.18 introduced a change: /(?{})/ and /(??{})/ have been heavily reworked, which enabled the scope of lexical variables to work properly in code expressions as utilized above. Before then, the above code would result in the following errors, as demonstrated in this subroutine version run under v5.16.2:

    Variable "$skips" will not stay shared at (re_eval 1) line 1.
    Variable "@seqs" will not stay shared at (re_eval 2) line 1.

The fix for older implementations of RE code expressions is to declare the variables with our and, as a further good coding practice, to localize them when they are initialized. This is demonstrated in this modified subroutine version run under v5.16.2, or as put below:

    local our @seqs;
    $string =~ /^(.*)s(?{push @seqs, $1})(*FAIL)/;
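
To illustrate the case where both ends can shift, here is a minimal sketch that combines the find-all-and-sort approach with `local our`. The unanchored pattern `(1.*)s`, the helper name `matches_by_length`, and the alphabetical tie-break in the sort are assumptions made for this example and do not come from the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect every substring that starts with "1" and is
# immediately followed by "s", then return the candidates longest first.
sub matches_by_length {
    my ($string) = @_;

    # Localized package variable, visible to the regex code block
    local our @seqs;

    # Record each candidate, then force backtracking with (*FAIL) so the
    # engine tries the next shorter candidate and then the next start position.
    $string =~ /(1.*)s(?{ push @seqs, $1 })(*FAIL)/;

    # Longest first; ties are broken alphabetically for a stable result
    my @sorted = sort { length $b <=> length $a or $a cmp $b } @seqs;
    return @sorted;
}

print join(", ", matches_by_length("111s11s")), "\n";
```

Since the start position is no longer fixed, candidates are not found in order of length, and the same text can be recorded more than once when it matches at different positions; sorting afterwards is what makes the n-th longest well defined.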