From the course: Build Three Real-World Python Applications

How to clean text data

- Text data is often very messy when you first bring it in for analysis. It may contain misspelled words, words you don't want to analyze like "the" and "and", or text like chapter numbers that doesn't provide any value. I will show you how to clean text data in three different ways for our analysis. Remember that from here onwards you will need to run the code at the top of each file for the code in each video to execute properly. For our word cloud, we will do a simple cleaning of our text to visualize our popular words. We will create a variable called word_cloud_text, set it equal to the great_expect variable we had earlier, and call the .lower method on it. This lowercases all the text so that a capitalized "Soon" is treated the same as a lowercase "soon". Now we will assign to our word_cloud_text variable again, setting it equal to code that removes any numbers and alphanumeric words that we don't need…
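The two cleaning steps described so far might look roughly like the sketch below. The transcript does not show the actual code, so this is an assumption: the sample string stands in for the great_expect variable loaded earlier in the course, and the regular expression is one common way to drop tokens that contain digits, not necessarily the one used in the video.

```python
import re

# Hypothetical stand-in for the great_expect variable from the course;
# in the video it holds the full text loaded earlier.
great_expect = "Chapter 1\nSoon after, I called myself Pip. soon the 2nd visitor came."

# Step 1: lowercase everything so "Soon" and "soon" count as the same word.
word_cloud_text = great_expect.lower()

# Step 2 (assumed pattern): remove any token that contains a digit,
# which covers bare numbers ("1") and alphanumeric words ("2nd").
word_cloud_text = re.sub(r"\w*\d\w*", "", word_cloud_text)

print(word_cloud_text)
```

Lowercasing first means later steps, such as counting word frequencies for the word cloud, only ever see one form of each word.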
