If you put the Extract Information operator inside a Process Documents operator, it will add a column to your dataset with the results of the match. Turn on the "add meta information" option on the Process Documents operator.
Here's a simple regular expression to find a word near another word:
(word1\W+(?:\w+\W+){1,max}?word2)
This will produce a match if "word1" is followed by "word2" with at least one and no more than "max" words between them (replace "max" with a number). Example:
"The quick brown fox jumped over the lazy dog"
(quick\W+(?:\w+\W+){1,5}?lazy)
will match, but
(quick\W+(?:\w+\W+){1,5}?dog)
will not (it has 6 words in between).
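If you want to try the pattern outside RapidMiner first, here is a minimal sketch in Python. It assumes Python's `re` module treats this pattern the same way RapidMiner's regex engine does, which should hold for the simple constructs used here. The helper name `words_near` is made up for this illustration and is not part of RapidMiner.

```python
import re

def words_near(text, word1, word2, max_words):
    """Return the matched span if word2 follows word1 with 1..max_words
    words in between, otherwise None. (Hypothetical helper for illustration.)"""
    # Same shape as the pattern above: word1, then 1 to max_words
    # intervening words (matched lazily), then word2.
    pattern = r"(%s\W+(?:\w+\W+){1,%d}?%s)" % (
        re.escape(word1), max_words, re.escape(word2)
    )
    match = re.search(pattern, text)
    return match.group(1) if match else None

sentence = "The quick brown fox jumped over the lazy dog"

print(words_near(sentence, "quick", "lazy", 5))  # 5 words in between: matches
print(words_near(sentence, "quick", "dog", 5))   # 6 words in between: None
```

Note that the pattern has no word boundaries, so "dog" would also match the start of "dogma". Wrapping each word in \b is one way to tighten it if that matters for your data.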