From the course: Build Three Real-World Python Applications

How to perform topic modeling

- [Instructor] Now that our dictionary and corpus are created, we can begin building our topic model. We will use the Latent Dirichlet Allocation topic model, also known as LDA. This algorithm will help us figure out which topics are common in our novel. To do this analysis, we need to determine the optimal number of topics. We will begin by setting the random seed with np.random.seed(1) so that, again, we get the same outcome each time. Now we will create our variable K_range to cover the values from 6 to 20, taking every second value. Remember, the range function does not include the last value, which in this case is 20, so it will actually look at topic counts from 6 to 18, going up by two. There is no exact science for picking the number of topics, so in this case we are making a personal choice not to have fewer than six topics, since we usually want to have more than…
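
As a minimal sketch of the two setup steps described above, the snippet below seeds NumPy and builds K_range. It assumes NumPy is imported as np, as the transcript implies. The names first and second are hypothetical, added only to show that re-seeding reproduces the same random draws; they are not part of the course code.

```python
import numpy as np

# Fix NumPy's global random seed so repeated runs give the same results.
np.random.seed(1)

# Candidate topic counts: start at 6, stop before 20, step by 2.
K_range = range(6, 20, 2)
print(list(K_range))  # [6, 8, 10, 12, 14, 16, 18]

# Illustration only (hypothetical names): re-seeding with the same value
# reproduces the same sequence of random numbers.
np.random.seed(1)
first = np.random.rand(3)
np.random.seed(1)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True
```

Because the stop value is excluded, 20 never appears in the list; writing range(6, 21, 2) would include it if that were wanted.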
