javascript - Regex: Match string with substrings with the same pattern
I'm trying to match a string pattern that can contain substrings with the same pattern.
Here's an example string:
nicaragua [[note|s|note|congo member of iccrom 1999 , nicaragua 1971. both suspended iccrom general assembly in november 2013 having omitted pay contributions 6 consecutive calendar years (iccrom [[statutes|s|url|www.iccrom.org/about/statutes/]], article 9).]]. [[link|s|url|google.com]] might appear.
And here's the pattern:
[[display_text|code|type|content]]
So, I want the string within the brackets, and then any further strings that match the pattern within the top-level one.
And I want to match this:
1. [[note|s|note|congo member of iccrom 1999 , nicaragua 1971. both suspended iccrom general assembly in november 2013 having omitted pay contributions 6 consecutive calendar years (iccrom [[statutes|s|url|www.iccrom.org/about/statutes/]], article 9).]]
   1.1 [[statutes|s|url|www.iccrom.org/about/statutes/]]
2. [[link|s|url|google.com]]
I am using /(\[\[.*]])/ but it matches everything up to the last ]] in the string.
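To make the problem concrete, here is a small sketch that runs the pattern above, and its non-greedy variant, against a shortened stand-in for the example string. The sample text and the variable names are made up for illustration.

```javascript
// Shortened, hypothetical stand-in for the example string.
const sample =
  'a [[note|s|note|see [[statutes|s|url|example.org]], art. 9]] b [[link|s|url|google.com]] c';

// Greedy: .* runs on to the LAST "]]" in the string, swallowing both
// top-level tokens and the plain text between them.
const greedy = sample.match(/(\[\[.*]])/)[0];

// Non-greedy: .*? stops at the FIRST "]]", which closes the inner token,
// so the outer token is cut short.
const lazy = sample.match(/(\[\[.*?]])/)[0];

console.log(greedy);
// [[note|s|note|see [[statutes|s|url|example.org]], art. 9]] b [[link|s|url|google.com]]
console.log(lazy);
// [[note|s|note|see [[statutes|s|url|example.org]]
```

Neither result lines up with the token boundaries, which is why a single pattern is not enough for nested input.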
What I want is to be able to identify the matched strings and convert them to HTML elements, with |note| going into a blockquote tag and |url| into an a tag. So, a blockquote tag can have a link tag inside it.
By the way, I'm using CoffeeScript for that.
Thanks in advance.
In general, regex is not good at dealing with nested expressions. If you use greedy patterns, they'll match too much, and if you use non-greedy patterns, as @bjfletcher suggests, they'll match too little, stopping inside the outer content. The "traditional" approach here is a token-based parser, where you step through the characters one by one and build an abstract syntax tree (AST) that you can then reformat as desired.
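As a rough illustration of the token-based approach, here is a minimal recursive-descent sketch. The names parseMarkup, parseNodes and parseToken are hypothetical, and the sketch makes two assumptions: the first three fields never contain |, [ or ], and a token that is never closed simply runs to the end of the input.

```javascript
// Parse "[[display_text|code|type|content]]" markup into a nested tree.
// The result is an array of plain strings and token objects; each token's
// `content` is again such an array, so nesting is preserved.
function parseMarkup(input) {
  let pos = 0;
  // Sticky (y) regex: only matches a token header exactly at lastIndex.
  const header = /\[\[([^|\[\]]+)\|([^|\[\]]+)\|([^|\[\]]+)\|/y;

  // Read nodes until the input ends or, inside a token, until "]]".
  function parseNodes(insideToken) {
    const nodes = [];
    let text = '';
    while (pos < input.length) {
      if (insideToken && input.startsWith(']]', pos)) break;
      const token = input.startsWith('[[', pos) ? parseToken() : null;
      if (token) {
        if (text) { nodes.push(text); text = ''; }
        nodes.push(token);
      } else {
        text += input[pos++]; // ordinary character (or a stray bracket)
      }
    }
    if (text) nodes.push(text);
    return nodes;
  }

  // Try to read one token at pos; returns null (pos unchanged) if there
  // is no well-formed header here.
  function parseToken() {
    header.lastIndex = pos;
    const m = header.exec(input);
    if (!m) return null;
    pos = header.lastIndex;
    const content = parseNodes(true);
    if (input.startsWith(']]', pos)) pos += 2; // consume the closing "]]"
    return { display_text: m[1], code: m[2], type: m[3], content };
  }

  return parseNodes(false);
}

console.log(JSON.stringify(
  parseMarkup('a [[note|s|note|see [[statutes|s|url|example.org]], art. 9]] b'),
  null, 2));
```

Because each token consumes its own closing ]] before returning, the inner and outer tokens end at the right places without any lookahead tricks.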
One hacky approach I've used here is to convert the string to a JSON string, and let the JSON parser do the hard work of converting it to nested objects: http://jsfiddle.net/t09q783d/1/
function toPoorMansAST(s) {
    // Escape double quotes, since they'll cause problems otherwise. This
    // converts them to a literal \u0022 escape, which is safe for JSON parsing.
    s = s.replace(/"/g, "\\u0022");
    // Transform to a JSON string!
    s =
        // wrap in array delimiters
        ('["' + s + '"]')
        // replace token starts
        .replace(/\[\[([^\|]+)\|([^\|]+)\|([^\|]+)\|/g,
            '",{"display_text":"$1","code":"$2","type":"$3","content":["')
        // replace token ends
        .replace(/\]\]/g, '"]},"');
    return JSON.parse(s);
}
This gives you an array of strings and structured objects, which you can run through a formatter to spit out the HTML you'd like. The formatter is left as an exercise for the user :).
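For what it's worth, one possible shape for such a formatter is sketched below. It is only a sketch under assumptions not stated in the question: the names toHtml, toText and escapeHtml are hypothetical, a note renders its content inside a blockquote and ignores its display text, a url uses its content as the href and its display text as the link text, and any other type just renders its content.

```javascript
// Escape text so it is safe to place inside HTML content or attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Flatten a node array to plain text (used for the href of url tokens).
function toText(nodes) {
  return nodes
    .map(node => (typeof node === 'string' ? node : toText(node.content)))
    .join('');
}

// Walk the array of strings and token objects and emit HTML, recursing
// into `content` so a blockquote can contain links.
function toHtml(nodes) {
  return nodes.map(node => {
    if (typeof node === 'string') return escapeHtml(node);
    switch (node.type) {
      case 'note':
        return '<blockquote>' + toHtml(node.content) + '</blockquote>';
      case 'url':
        // Assumption: content is used as-is for href (no scheme is added).
        return '<a href="' + escapeHtml(toText(node.content)) + '">' +
          escapeHtml(node.display_text) + '</a>';
      default:
        return toHtml(node.content);
    }
  }).join('');
}

// Hand-written sample in the array-of-strings-and-objects shape described above.
const ast = [
  'nicaragua ',
  { display_text: 'note', code: 's', type: 'note', content: [
    'see ',
    { display_text: 'statutes', code: 's', type: 'url',
      content: ['www.iccrom.org/about/statutes/'] },
    ', article 9.'
  ] },
  '.'
];
console.log(toHtml(ast));
```

Empty strings in the array, which the JSON trick can produce next to token boundaries, simply render as nothing.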