Aditya Rastogi
1 min readJun 3, 2020

--

I think it's because of the default token pattern of the tf-idf vectorizer that considers a word as a token only if it has atleast two alphanumeric characters.

--

--

Aditya Rastogi

Interested in learning about computations that make perception, reasoning and action possible.