ruby - How to find a word NOT inside <a> tag?
I need a regular expression. The task is similar to Twitter's hashtags: I have a string with words starting with #. For example:
foo #bar hello
I'm replacing the hashtags with links before the string is saved to the database, which gives strings like this:
foo <a href="bar">#bar</a>
Afterwards I need to re-parse the string, and I don't want to replace the #bar inside the <a> tag a second time. I need a regexp that finds a word beginning with # only when it is not placed inside <a ...> ... </a>.
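To make the problem concrete, here is a minimal sketch of what goes wrong with a naive pattern. The lambda name `naive` and the pattern /#(\w+)/ are assumptions for illustration and are not taken from the question:

```ruby
# Hypothetical naive replacement: wraps every hashtag, wherever it appears.
naive = ->(str) { str.gsub(/#(\w+)/, '<a href="\1">#\1</a>') }

first  = naive.call("foo #bar hello")
second = naive.call(first)

puts first   # foo <a href="bar">#bar</a> hello
puts second  # foo <a href="bar"><a href="bar">#bar</a></a> hello
```

The second pass wraps the already-linked #bar again, which is exactly the double replacement to avoid.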
To turn the input:
foo #bar hello
into the output:
foo <a href="bar">#bar</a> hello
idempotently, so that you can pass the output through the function again and it does not change, use this:
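str1 = "foo #bar hello"
str2 = 'foo <a href="bar">#bar</a> hello'

# Match a hashtag only if the next "<" after it does not start a closing tag,
# or if no "<" follows before the end of the line.
replace_func = ->(str) { str.sub(/#(\w+)(?=[^<]*?(?:<[^\/]|$))/, '<a href="\1">#\1</a>') }

replace_func[str1]
replace_func[str2]
# both return: "foo <a href=\"bar\">#bar</a> hello"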
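Note that `sub` replaces only the first match. If a string can contain several hashtags, the same pattern should work with `gsub`. The following is a sketch under that assumption; the constant `HASHTAG` and the method name `link_hashtags` are made up for the example:

```ruby
# Same lookahead as above: the next "<" must not begin a closing tag.
HASHTAG = /#(\w+)(?=[^<]*?(?:<[^\/]|$))/

# Hypothetical helper: link every hashtag that is not already linked.
def link_hashtags(str)
  str.gsub(HASHTAG, '<a href="\1">#\1</a>')
end

once  = link_hashtags("foo #bar and #baz")
twice = link_hashtags(once)

puts once           # foo <a href="bar">#bar</a> and <a href="baz">#baz</a>
puts once == twice  # true
```

This relies on each linked hashtag being immediately followed by </a>. A link whose text contains further markup after the hashtag (for example <a>#bar <b>x</b></a>) would still be matched, so the pattern only suits links in the simple shape shown here.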
Additionally, this can be done simply with Nokogiri:
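require 'nokogiri'

# your_string is the text to process.
doc = Nokogiri::XML('<p>' + your_string + '</p>')
doc.search('//p').each do |node|
  # The replacement argument is missing from the original post; this block
  # returns the hashtag unchanged and is only a placeholder for your own.
  node.content = node.content.sub(/#\w+/) { |tag| tag }
end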
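If pulling in a parser is more than the task needs, the same idea (leave existing links alone, touch only bare text) can be approximated with a single alternation: match whole <a>...</a> elements first and copy them through, and only link what is matched by the hashtag branch. This is a sketch, not the answer's method; `ANCHOR_OR_TAG` and `link_outside_anchors` are hypothetical names, and it assumes links are not nested and that attribute values contain no ">":

```ruby
# Either an entire <a ...>...</a> element (copied through unchanged)
# or a bare hashtag (its word captured in group 1).
ANCHOR_OR_TAG = /<a\b[^>]*>.*?<\/a>|#(\w+)/m

# Hypothetical helper: link hashtags that are outside existing <a> elements.
def link_outside_anchors(str)
  str.gsub(ANCHOR_OR_TAG) do |match|
    tag = Regexp.last_match(1)
    # Group 1 is nil when the anchor branch matched, so keep that text as-is.
    tag ? format('<a href="%s">#%s</a>', tag, tag) : match
  end
end

once  = link_outside_anchors('x #new <a href="bar">#bar</a> #baz')
twice = link_outside_anchors(once)

puts once           # x <a href="new">#new</a> <a href="bar">#bar</a> <a href="baz">#baz</a>
puts once == twice  # true
```

Because the anchor branch consumes the whole element, it does not matter what the link text looks like, which makes this variant less sensitive to the exact markup than a lookahead.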