python - Does a regex search work left to right?


>>> import re
>>> test1 = "123 main street, slc, utah county, utah 84115"  # test string
>>> # 3 patterns joined with | in the order they should be found
>>> address_end_pattern3 = re.compile(r"\b((ut(ah)?)\,?[\s\n](84\d{3}(\-\d{4})?)|(84\d{3}(\-\d{4})?)|(ut(ah)?))\b", re.IGNORECASE)
>>> # 2 patterns, omitting the state-only pattern
>>> address_end_pattern2 = re.compile(r"\b((ut(ah)?)\,?[\s\n](84\d{3}(\-\d{4})?)|(84\d{3}(\-\d{4})?))\b", re.IGNORECASE)
>>> # first pattern (state, zip) alone
>>> address_end_pattern1 = re.compile(r"\b(ut(ah)?)\,?[\s\n](84\d{3}(\-\d{4})?)\b", re.IGNORECASE)
>>> address_end_pattern1.search(test1).group()  # finds the first pattern correctly when it is alone
'utah 84115'
>>> address_end_pattern3.search(test1).group()  # but not when the state-only pattern is included
'utah'
>>> address_end_pattern2.search(test1).group()  # finds the first pattern when combined with the zip pattern alone
'utah 84115'

After a previous question confirmed it, I believed that a regex searches both the string and the pattern from left to right... and then this happened. If it finds the first pattern correctly by itself, and also when it is combined with the zip pattern, why does it find the state pattern when that is the last option in the combined pattern? Can anyone explain this behavior?

Edit: for clarity, the pattern that is the best indicator of the end of an address is the first pattern:

r"\b(ut(ah)?)\,?[\s\n](84\d{3}(\-\d{4})?)\b" # re.IGNORECASE

I am trying to identify something like: ut, 84115 or utah, 84115-0001

If that doesn't occur, a zip code is the next best option to identify the end of the address:

r"\b(84\d{3}(\-\d{4})?)\b"

Which should match something like:

84115 or 84115-0011

Then finally, if neither matches, I want the state:

\b(ut(ah)?)\b

which should match: ut or utah

I want to find them in this order because the last two might either cut off information or, in some cases, consume a second address that might be listed, because addresses can be listed as:

1234 main st, slc ut , 1235 main st, slc ut 84115

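The regex matches "utah" in "utah county" because of the 3rd option in the 3rd pattern. A search tries each starting position in the string from left to right, and the order of the alternatives only matters among alternatives that can match at the same starting position. Since "utah county" comes before the desired "utah 84115", that is the first match, with "utah 84115" being the second. If you switch around "utah 84115" and "utah county", it works. https://regex101.com/r/zq4rj1/5 .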
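The behavior described above can be reproduced with `finditer`, which lists every match of the combined pattern together with its starting position. The sketch below uses the question's test string; the patterns are lightly simplified but equivalent (`\s` already covers `\n`, and `,` and `-` need no escaping there). The second half, a helper named `find_address_end` that tries the patterns one at a time in priority order, is only one possible workaround and is an assumption of this example, not something proposed in the original answer.

```python
import re

test1 = "123 main street, slc, utah county, utah 84115"

# The three patterns from the question, in priority order.
STATE_ZIP = r"\b(ut(ah)?),?\s(84\d{3}(-\d{4})?)\b"
ZIP_ONLY = r"\b(84\d{3}(-\d{4})?)\b"
STATE_ONLY = r"\b(ut(ah)?)\b"

# Combined pattern: alternation order only decides between alternatives
# that can match at the SAME starting position in the string.
combined = re.compile("|".join([STATE_ZIP, ZIP_ONLY, STATE_ONLY]), re.IGNORECASE)

# The leftmost position where any alternative matches wins.
matches = [(m.start(), m.group()) for m in combined.finditer(test1)]
print(matches)  # [(22, 'utah'), (35, 'utah 84115')]

# Hypothetical workaround: search with each pattern separately, so a
# lower-priority pattern is only used when the higher ones match nowhere.
PRIORITY = [re.compile(p, re.IGNORECASE) for p in (STATE_ZIP, ZIP_ONLY, STATE_ONLY)]

def find_address_end(text):
    """Return the match of the highest-priority pattern found anywhere in text."""
    for pattern in PRIORITY:
        m = pattern.search(text)
        if m:
            return m.group()
    return None

print(find_address_end(test1))  # utah 84115
```

With this approach, `find_address_end("123 main street, slc 84115")` falls back to the zip code and `find_address_end("123 main street, slc ut")` falls back to the state, because each fallback is only consulted after the earlier patterns have failed on the whole string.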

