python - Group m groups with regex
I have a regex that replaces the letter n with (\w{1,}), meaning that a word can stand in for the letter n. I want to make a group out of m instances of (\w{1,}), i.e. add parens around m instances of (\w{1,}), like this:
"(" + "(\w{1,}), (\w{1,}), (\w{1,}) .... (\w{1,})" + ")", where (\w{1,}) occurs m times. How can I do that? I know it would be something like this (pseudocode):
re.sub((\w{1,}){2,}, inputstring, "(" + however many instances of the (\w{1,}) pattern it was able to match + ")")
How do I express, in a regex, the pattern that was matched m times? (So that I can replace it with itself, surrounded by parentheses.)
If I understand your question correctly, you're writing one regex to produce another regex. That is, you're using a regex replacement to build the pattern for a later regex search. Your input includes some kind of wildcard value (e.g. "n") that you need to replace to create the search pattern. In the search pattern, adjacent wildcard values should be combined into a single capturing group (so "n n bacon n" would give two capturing groups, one for the first two words and one more for the last). I think you can do this if you first capture each run of adjacent wildcards, then substitute the individual instances within the larger group.
Here's some code that does that:
    import re

    def make_pattern(template, wildcard="n"):
        # A run of one or more standalone wildcards separated by whitespace.
        replacement_pattern = r"\b{0}\b(?:\s+{0}\b)*".format(wildcard)

        def replacement_func(match):
            # The backslash is doubled because Python 3.7+ rejects r"\w+"
            # as a replacement template ("bad escape \w").
            return "(" + re.sub(wildcard, r"\\w+", match.group()) + ")"

        return re.sub(replacement_pattern, replacement_func, template)

The \b escape sequences in replacement_pattern are necessary to prevent occurrences of the wildcard from being treated as such if they're part of a larger word (like the "n" at the end of "bacon"). The closure replacement_func uses an additional regex replacement to swap out the wildcards while preserving the spacing between them (so a template with single spaces between the wildcards will match differently from one with wider gaps between them). I suppose you could do a regular string replacement (with str.replace) instead, if you wanted to. I just couldn't resist having three levels of regexing in one solution.
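To make the point about \b concrete, here is a small sketch comparing a search for the wildcard with and without word boundaries on the template from the answer. The variable names (naive, bounded, runs) are only illustrative and are not part of the original code.

```python
import re

template = "n n bacon n"

# Without \b, the "n" at the end of "bacon" is also counted as a wildcard.
naive = re.findall(r"n", template)

# With \b on both sides, only standalone "n" tokens are found.
bounded = re.findall(r"\bn\b", template)

# The full run pattern from the answer groups adjacent wildcards together.
runs = re.findall(r"\bn\b(?:\s+n\b)*", template)

print(len(naive))    # 4
print(len(bounded))  # 3
print(runs)          # ['n n', 'n']
```

Each entry in runs is what the replacement function would receive as match.group(), which is why the first two wildcards end up inside a single pair of parentheses.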
Here's an example run of make_pattern:
    >>> make_pattern("n n bacon n")
    '(\\w+ \\w+) bacon (\\w+)'
    >>> re.findall(make_pattern("n n bacon n"), "spam spam eggs bacon and spam")
    [('spam eggs', 'and')]
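The answer mentions that a plain str.replace could stand in for the inner regex substitution. A minimal sketch of that variant might look like the following. The name make_pattern_plain is hypothetical, the re.escape call is an addition not present in the original, and the sketch assumes the wildcard is made of word characters so that \b behaves as intended.

```python
import re

def make_pattern_plain(template, wildcard="n"):
    # Hypothetical variant of the answer's function, using str.replace inside.
    # re.escape is an added precaution in case the wildcard contains
    # regex metacharacters; the wildcard is assumed to be a word like "n".
    token = re.escape(wildcard)
    run_pattern = r"\b{0}\b(?:\s+{0}\b)*".format(token)

    def wrap_run(match):
        # A matched run holds only wildcards and whitespace, so a literal
        # string replacement cannot touch anything else.
        return "(" + match.group().replace(wildcard, r"\w+") + ")"

    return re.sub(run_pattern, wrap_run, template)

pattern = make_pattern_plain("n n bacon n")
print(pattern)
print(re.findall(pattern, "spam spam eggs bacon and spam"))
```

Because str.replace treats its arguments literally, there is no replacement-template escaping to worry about in the inner step, at the cost of one fewer level of regexing.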