From the course: NLP with Python for Machine Learning Essential Training

Introducing vectorizing

- [Instructor] Up to this point, we've discussed several times how Python really only sees a string of characters when it looks at text data. So now that we've learned how to clean up the text data we'll be using to build the machine learning model, we have to learn how to get that text into a form that Python and a machine learning model can actually use for training. The process of converting text to that form is called vectorizing. It's defined as the process of encoding text as integers to create feature vectors. Now, if you don't have much machine learning experience, you may be wondering what a feature vector is. A feature vector is an n-dimensional vector of numerical features that represents some object. So in our context, that means we'll be taking an individual text message and converting it to a numeric vector that represents that text message. How exactly we do that is what we're…
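To make the idea of a feature vector concrete, here is a minimal sketch in plain Python. It assumes a simple word-count encoding, which is only one possible way to vectorize text and is not necessarily the method the course goes on to use. The sample messages and the `vectorize` helper are hypothetical, made up for illustration.

```python
from collections import Counter

# Hypothetical, already-cleaned text messages (not taken from the course data).
messages = [
    "free prize call now",
    "call me later",
    "free free prize",
]

# Build a fixed vocabulary: each unique word gets one position in the vector.
vocabulary = sorted({word for message in messages for word in message.split()})


def vectorize(message, vocabulary):
    """Encode one message as a list of integer word counts (illustrative helper)."""
    counts = Counter(message.split())
    # A word missing from the message contributes 0 at its position.
    return [counts[word] for word in vocabulary]


# One n-dimensional feature vector per message, where n = len(vocabulary).
feature_vectors = [vectorize(message, vocabulary) for message in messages]

print(vocabulary)
for message, vector in zip(messages, feature_vectors):
    print(vector, message)
```

Each message ends up as a list of integers with the same length as the vocabulary, so every message lives in the same n-dimensional space and can be passed to a model as a row of numeric features.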
