What are tokenization and lemmatization in NLP?

Question

Lemmatization reduces a word to its base or dictionary form, known as the lemma, using a vocabulary and morphological analysis of the word. In practice this usually means removing inflectional endings, such as plural or possessive markers, so that different forms of the same word map to a single entry.

Example: boy’s → boy, cars → car, colors → color.

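So, the main idea behind lemmatization, as with stemming, is to reduce the words in a sentence to a common base form so that related forms can be treated as the same word in later analysis. The difference is that stemming chops off endings with heuristic rules and may produce a non-word, while lemmatization aims to return a valid dictionary word.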
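To make the idea concrete, here is a minimal sketch of a toy lemmatizer in plain Python. It is only an illustration: the names TOY_LEMMAS and toy_lemmatize are made up for this example, the lookup table is tiny, and the suffix rules are deliberately crude. A real lemmatizer relies on a full vocabulary and proper morphological analysis.

```python
# Toy lemmatizer: a small hand-written lookup table plus a few suffix rules.
# TOY_LEMMAS and toy_lemmatize are hypothetical names used only for this sketch.
TOY_LEMMAS = {
    "children": "child",
    "mice": "mouse",
    "went": "go",
}


def toy_lemmatize(word: str) -> str:
    """Return a rough base form for a single English word."""
    w = word.lower()

    # Strip a possessive ending (straight or curly apostrophe).
    for suffix in ("'s", "’s"):
        if w.endswith(suffix):
            w = w[: -len(suffix)]
            break

    # Irregular forms come from the lookup table (the "vocabulary").
    if w in TOY_LEMMAS:
        return TOY_LEMMAS[w]

    # Very rough plural handling: "cities" -> "city", "cars" -> "car".
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]

    return w


for form in ["boy’s", "cars", "colors", "mice"]:
    print(form, "->", toy_lemmatize(form))
```

The lookup table stands in for the vocabulary, and the suffix rules stand in for the morphological analysis. Words that the rules do not cover (for example, plurals ending in "es") will come out wrong, which is why practical systems use a dictionary-backed lemmatizer instead.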

 

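Tokenization in NLP is the process of dividing text into smaller units called tokens. A token is often a word, though it can also be a punctuation mark, a number, or a piece of a word. Just as words make up a sentence, tokens make up the text that later processing steps work on. It is an important early step in NLP because it splits the text into minimal units.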
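As a rough illustration, a simple word-level tokenizer can be sketched with Python's standard re module. The pattern and the name tokenize are assumptions chosen for this example, not a standard. It keeps words (including an internal apostrophe, as in "boy's") together and emits each punctuation mark as its own token.

```python
import re

# Hypothetical pattern for this sketch:
#   \w+(?:['’]\w+)*  -> a word, optionally with internal apostrophes ("boy's")
#   [^\w\s]          -> any single character that is neither word nor whitespace
TOKEN_PATTERN = re.compile(r"\w+(?:['’]\w+)*|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("The boy's cars are red."))
# ['The', "boy's", 'cars', 'are', 'red', '.']
```

Whitespace is discarded and never appears as a token. Library tokenizers make different choices (for instance, splitting "boy's" into two tokens or working on subword pieces), so the output of this sketch should not be taken as the only correct segmentation.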

Asked by m_prth (Member), 3 years ago
