Add the Text Processing -> Stemming -> Stem (Dictionary) operator, and choose your dictionary file (plain text).
The dictionary file should contain one rule per line, in this format:
stem:inflection
stem:inflection
For example:
fish:fished
will turn fished into fish.
You can also use wildcards:
fish:fish.*
will turn fished, fishes, fishing or anything beginning with fish into fish.
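You should put the rules for longer versions of similar words at the top, because a shorter pattern such as compute.* also matches the longer words (computer, computers and so on). For example, to stem these words correctly: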
computer, computerise, computerize, computerized, computerised, computers, compute, computed, computes
You should use
computer:computer.*
compute:compute.*
and not
compute:compute.*
computer:computer.*
assuming computer and compute are not the same stem.
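
To see why the order matters, here is a minimal Python sketch that imitates the dictionary lookup. It is not RapidMiner's implementation: it assumes that rules are tried from top to bottom, that the first matching rule wins, and that a pattern has to match the whole word. The names parse_rules and stem_word are made up for this illustration.

```python
import re

def parse_rules(text):
    """Parse 'stem:pattern' lines into (stem, compiled pattern) pairs, keeping file order."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        stem, pattern = line.split(":", 1)
        rules.append((stem, re.compile(pattern)))
    return rules

def stem_word(word, rules):
    """Return the stem of the first rule whose pattern matches the whole word.

    Assumption: first match wins, which is what the ordering advice implies.
    """
    for stem, pattern in rules:
        if pattern.fullmatch(word):
            return stem
    return word  # no rule matched: leave the word unchanged

words = ["computer", "computerise", "computerize", "computerized",
         "computerised", "computers", "compute", "computed", "computes"]

# Longer stem first (recommended) versus shorter stem first.
good = parse_rules("computer:computer.*\ncompute:compute.*")
bad = parse_rules("compute:compute.*\ncomputer:computer.*")

good_result = {w: stem_word(w, good) for w in words}
bad_result = {w: stem_word(w, bad) for w in words}

print(good_result)
print(bad_result)
```

Under these assumptions, the first ordering maps computer, computerise, computers and so on to computer, and compute, computed, computes to compute. With the second ordering every word ends up as compute, because compute.* is tried first and already matches all of them.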