javascript - Regex: Match string with substrings with the same pattern -


I'm trying to match a string pattern that can contain substrings with the same pattern.

Here's an example string:

nicaragua [[note|note|congo member of iccrom 1999 , nicaragua 1971. both suspended iccrom general assembly in november 2013 having omitted pay contributions 6 consecutive calendar years (iccrom [[statutes|s|url|www.iccrom.org/about/statutes/]], article 9).]]. [[link|url|google.com]] might appear.

And here's the pattern:

[[display_text|code|type|content]] 

So I want the string within the brackets, plus any further strings that match the pattern inside the top-level one.

And I want to match this:

  1. [[note|s|note|congo member of iccrom 1999 , nicaragua 1971. both suspended iccrom general assembly in november 2013 having omitted pay contributions 6 consecutive calendar years (iccrom [[statutes|s|url|www.iccrom.org/about/statutes/]], article 9).]]

1.1 [[statutes|s|url|www.iccrom.org/about/statutes/]]

  2. [[link|s|url|google.com]]

I was using /(\[\[.*]])/, but it matches everything up to the last ]].

What I want is to be able to identify the matched strings and convert them to HTML elements, with |note| becoming a blockquote tag and |url| becoming an a tag. So a blockquote tag can have a link tag inside it.

BTW, I'm using CoffeeScript for that.

Thanks in advance.

In general, regex is not good at dealing with nested expressions. If you use greedy patterns, they'll match too much, and if you use non-greedy patterns, as @bjfletcher suggests, they'll match too little, stopping inside the outer content. The "traditional" approach here is a token-based parser, where you step through the characters one by one and build an abstract syntax tree (AST) that you can then reformat as desired.

One hacky approach I've used here is to convert the string into a JSON string, and let the JSON parser do the hard work of converting it into nested objects: http://jsfiddle.net/t09q783d/1/
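For comparison, here is a minimal sketch of that token-based approach. The names `parseTokens`, `parseContent`, `readField` and `parseToken` are made up for illustration, and the sketch assumes every token has exactly four pipe-separated fields, as in the pattern from the question, with only the last field allowed to contain nested tokens.

```javascript
// Hypothetical helper: parses "[[display_text|code|type|content]]" tokens,
// where the content field may itself contain nested tokens.
// Returns an array of plain strings and token objects.
function parseTokens(input) {
  let pos = 0;

  // Collect text and tokens until "]]" (when inside a token) or end of input.
  function parseContent(nested) {
    const nodes = [];
    let text = '';
    while (pos < input.length) {
      if (input.startsWith('[[', pos)) {
        if (text) { nodes.push(text); text = ''; }
        pos += 2; // skip the opening "[["
        nodes.push(parseToken());
      } else if (nested && input.startsWith(']]', pos)) {
        break; // the enclosing parseToken() consumes the "]]"
      } else {
        text += input[pos++];
      }
    }
    if (text) nodes.push(text);
    return nodes;
  }

  // Read one field up to (and past) the next "|".
  function readField() {
    const end = input.indexOf('|', pos);
    if (end === -1) throw new Error('Unterminated field at position ' + pos);
    const field = input.slice(pos, end);
    pos = end + 1;
    return field;
  }

  // Parse one token; assumes the opening "[[" was already consumed.
  function parseToken() {
    const display_text = readField();
    const code = readField();
    const type = readField();
    const content = parseContent(true);
    if (!input.startsWith(']]', pos)) throw new Error('Missing closing ]]');
    pos += 2; // skip the closing "]]"
    return { display_text, code, type, content };
  }

  return parseContent(false);
}
```

Calling `parseTokens('a [[n|s|note|x [[l|s|url|u.com]] y]] b')` should give the string `'a '`, then a note object whose content holds the nested url object, then `' b'`. Unlike a regex, the recursion keeps track of which `]]` closes which `[[`.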

    function toPoorMansAst(s) {
        // Escape double quotes, since they'd break the JSON string otherwise.
        // This converts them to a Unicode escape, which is safe for JSON parsing.
        s = s.replace(/"/g, '\\u0022');
        // Transform into a JSON string!
        s =
            // Wrap in array delimiters
            ('["' + s + '"]')
            // Replace token starts
            .replace(/\[\[([^\|]+)\|([^\|]+)\|([^\|]+)\|/g,
                '",{"display_text":"$1","code":"$2","type":"$3","content":["')
            // Replace token ends
            .replace(/\]\]/g, '"]},"');

        return JSON.parse(s);
    }

This gives you an array of strings and structured objects, which you can run through a formatter to spit out the HTML you'd like. The formatter is left as an exercise for the user :).
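If it helps, a rough sketch of such a formatter might look like the following. The names `escapeHtml` and `toHtml` are hypothetical, and the mapping is an assumption based on the question: `note` becomes a blockquote around its content, and `url` becomes a link whose text is `display_text` and whose target is the content. Any other type just renders its content.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical formatter. `nodes` is an array of strings and
// { display_text, code, type, content } objects, where `content`
// is another array of the same shape.
function toHtml(nodes) {
  return nodes.map(function (node) {
    if (typeof node === 'string') {
      return escapeHtml(node);
    }
    if (node.type === 'note') {
      // Recurse, so that links inside a note are rendered too.
      return '<blockquote>' + toHtml(node.content) + '</blockquote>';
    }
    if (node.type === 'url') {
      // Assumption: the string parts of content form the link target.
      const href = node.content
        .filter(function (part) { return typeof part === 'string'; })
        .join('');
      return '<a href="' + escapeHtml(href) + '">' +
        escapeHtml(node.display_text) + '</a>';
    }
    // Unknown type: fall back to rendering the content only.
    return toHtml(node.content);
  }).join('');
}
```

Because `toHtml` calls itself on `content`, a link nested inside a note ends up inside the blockquote, which is the nesting the question asks for.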

