Capturing a URL within text by using regex in XSLT code
This is my test input:
<license> <p>some text (http://creativecommons.org/licenses/by/3.0/) text.</p> </license>
Desired output:
<license xlink:href="http://creativecommons.org/licenses/by/3.0/"> <p>some text (http://creativecommons.org/licenses/by/3.0/) text.</p> </license>
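Basically, I am trying to copy the URL that sits inside the text. When a license element does not contain an xlink:href="http://******" attribute, I look in its child <license-p> (<p> in the test input) and copy the URL found there into an xlink:href attribute on the parent (license).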
And here is my XSLT:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    exclude-result-prefixes="xs" version="3.0">

  <xsl:output method="html" encoding="utf-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="license">
    <xsl:copy>
      <xsl:attribute name="xlink:href">
        <xsl:value-of select='replace(p,"[\s\S]*" ,"(\b(?:(?:https?|ftp):\/\/|www\.|ftp\.)(?:\([-a-z0-9+&amp;@#\/%=~_|$?!:,.]*\)|[-a-z0-9+&amp;@#\/%=~_|$?!:,.])*(?:\([-a-z0-9+&amp;@#\/%=~_|$?!:,.]*\)|[a-z0-9+&amp;@#\/%=~_|$]))")'/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p/@xlink:href"/>

</xsl:stylesheet>
The regex I am using is not working in Saxon. Is that owing to some of the characters in it?
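Two things in the stylesheet above look like probable culprits. XPath's replace() takes its arguments in the order (input, pattern, replacement), so the URL pattern there ends up in the replacement slot. In addition, the XPath regex dialect has no `\b` word boundary and does not accept `\/` as an escape. Below is a minimal sketch of a replace()-based version. It rests on two assumptions that are not in the question: the URL pattern `(https?|ftp)://[^\s()]+` is a deliberately simplified stand-in, and only license elements that have no xlink:href yet are rewritten.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    version="2.0">

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: only license elements lacking xlink:href are rewritten. -->
  <xsl:template match="license[not(@xlink:href)]">
    <!-- Simplified, hypothetical URL pattern: a scheme, "://", then any run
         of characters that are neither whitespace nor parentheses. -->
    <xsl:variable name="url" select="'(https?|ftp)://[^\s()]+'"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- replace() returns its input unchanged when nothing matches,
           so check with matches() before building the attribute. -->
      <xsl:if test="matches(string(.), $url)">
        <!-- Arguments are (input, pattern, replacement, flags). The 's' flag
             lets "." match newlines; $1 is the captured URL. -->
        <xsl:attribute name="xlink:href"
            select="replace(string(.), concat('^.*?(', $url, ').*$'), '$1', 's')"/>
      </xsl:if>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The surrounding `^.*?` and `.*$` make the pattern consume the whole string, so that replacing the match with `$1` leaves only the first URL. A URL followed directly by punctuation such as a full stop would keep that character, so the pattern would need tightening for real data.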
OK folks, I know the regex is far from perfect, but the following works for me:
<xsl:analyze-string select="$elvalue"
    regex="((https?|ftp|gopher|telnet|file):(()|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&amp;]*\w*.\w*\w\w*\w\w*\w\d.\d\w)">
  <xsl:matching-substring>
    <xsl:value-of select="regex-group(1)"/>
  </xsl:matching-substring>
</xsl:analyze-string>
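To show where an xsl:analyze-string call like this might sit, here is a sketch of a complete stylesheet built around it. Several things here are assumptions rather than part of the answer: `$elvalue` is not defined in the snippet, so the sketch uses `string(.)` on the license element instead; the URL pattern is a simplified stand-in; the variable name `urls` is made up; and the first URL found is the one that is used.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    exclude-result-prefixes="xs"
    version="2.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy everything that has no more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: only license elements lacking xlink:href are rewritten. -->
  <xsl:template match="license[not(@xlink:href)]">
    <!-- Collect every URL-like substring of the element's text.
         The pattern is a simplified, hypothetical one. -->
    <xsl:variable name="urls" as="xs:string*">
      <xsl:analyze-string select="string(.)"
          regex="(https?|ftp)://[^\s()]+">
        <xsl:matching-substring>
          <!-- Inside matching-substring, "." is the matched text. -->
          <xsl:sequence select="."/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Add the attribute only if at least one URL was found. -->
      <xsl:if test="exists($urls)">
        <xsl:attribute name="xlink:href" select="$urls[1]"/>
      </xsl:if>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the test input from the question, the pattern stops at the closing parenthesis, so the license element should come out with xlink:href="http://creativecommons.org/licenses/by/3.0/" and its text content unchanged. Note that the `regex` attribute is an attribute value template, so any literal curly braces in a pattern have to be doubled.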