I think it's because of the default token pattern of the tf-idf vectorizer that considers a word as…

Itaru Kishikawa

1 min readJun 3, 2020

I think it's because of the default token pattern of the tf-idf vectorizer that considers a word as a token only if it has atleast two alphanumeric characters.

Written by Aditya Rastogi

170 Followers

Interested in learning about computations that make perception, reasoning and action possible.

Help
Status
About
Careers
Press
Blog
Privacy
Terms
Text to speech
Teams