Lemmatization

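Lemmatization is the process of reducing a word to its base or dictionary form (the lemma) by taking its context and part of speech into account and applying linguistic rules. Stemming, by contrast, only strips suffixes, so the comparison below runs the same three words through both.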
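To make the idea of "dictionary form plus part of speech plus rules" concrete, here is a minimal toy sketch. It is not how NLTK is implemented; the lexicon, exception table, suffix rules, and the function name toy_lemmatize are all made up for illustration. It only shows the general pattern: look up irregular forms first, then try suffix rules and accept a candidate only if it is a known word for that part of speech.

```python
# Toy data: every entry below is an illustrative assumption, not a real lexicon.
LEXICON = {
    "n": {"change", "mouse"},
    "v": {"change", "run"},
}
EXCEPTIONS = {
    "n": {"mice": "mouse"},
    "v": {"ran": "run"},
}
# (suffix to strip, replacement) pairs, tried in order for each part of speech.
SUFFIX_RULES = {
    "n": [("s", "")],
    "v": [("s", ""), ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
}


def toy_lemmatize(word, pos="n"):
    """Return a lemma for `word`, treating it as part of speech `pos`."""
    word = word.lower()
    # 1. Irregular forms are resolved by direct lookup.
    if word in EXCEPTIONS[pos]:
        return EXCEPTIONS[pos][word]
    # 2. A word that is already a dictionary entry is its own lemma.
    if word in LEXICON[pos]:
        return word
    # 3. Try suffix rules, keeping a candidate only if the lexicon knows it.
    for suffix, replacement in SUFFIX_RULES[pos]:
        if word.endswith(suffix):
            candidate = word[: -len(suffix)] + replacement
            if candidate in LEXICON[pos]:
                return candidate
    # 4. Nothing matched: return the word unchanged.
    return word


print(toy_lemmatize("changes", "n"))  # change
print(toy_lemmatize("changed", "n"))  # changed
print(toy_lemmatize("changed", "v"))  # change
```

Because a candidate is only accepted when it appears in the lexicon, this sketch never produces a non-word, and the result depends on which part of speech the caller assumes.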

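from nltk.stem import WordNetLemmatizer, PorterStemmer

# WordNetLemmatizer needs the WordNet corpus; run nltk.download('wordnet') once if it is missing.
stem = PorterStemmer()
lem = WordNetLemmatizer()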

print(stem.stem('change'))
print(stem.stem('changes'))
print(stem.stem('changed'))
Output:
chang
chang
chang

Stemming does not consider the meaning or context of a word. It simply strips the suffix, so 'change', 'changes', and 'changed' all become 'chang', which is not a real word.

print(lem.lemmatize('change'))
print(lem.lemmatize('changes'))
print(lem.lemmatize('changed'))
Output:
change
change
changed

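Only 'changes' was altered. By default, lemmatize() treats every word as a noun (pos='n'), so 'changes' is handled as a plural noun and mapped to its lemma 'change', while 'change' is already the lemma and stays as it is. 'changed' is a past-tense verb form, not a noun form, so with the default part of speech it is returned unchanged. Passing pos='v' reduces it to 'change' as well.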
