From the course: Python for Data Science and Machine Learning Essential Training Part 1
Intro to data preparation - Python Tutorial
Intro to data preparation
- [Instructor] Preparing your data for analysis is one of the most resource-intensive requirements in data science. In fact, the general consensus is that data scientists spend 80% of their time on data preparation. That means the better and more efficient you become at data preparation, the more likely it is that you'll be effective as a data scientist. Let's look at where data preparation falls within the typical data analytics project life cycle. The life cycle is pretty simple. It starts with evaluation, then you move into data preparation, then analysis and model building. Next comes implementation, and then communication. There are six main steps involved in data preparation: importing data, cleaning data, transforming data, processing data, logging data, and backing up data. The first step is always to import the data you want to work with into your programming environment or application. Step two is cleaning data, which involves removing…
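As a rough illustration of the first two steps named in the transcript (importing, then cleaning), here is a minimal sketch using only Python's standard library. Everything specific in it is an assumption made for illustration: the sample data, the function names `import_rows` and `clean_rows`, and above all the cleaning rule. The transcript is cut off before it says what cleaning removes, so dropping rows with empty fields and exact duplicate rows is a stand-in, not the instructor's method.

```python
import csv
import io

# Hypothetical sample data standing in for a real file on disk.
RAW_CSV = """name,age,city
Ada,36,London
Ada,36,London
Grace,,New York
Linus,28,Helsinki
"""


def import_rows(source):
    """Step 1 (importing): read CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(source))


def clean_rows(rows):
    """Step 2 (cleaning), under an ASSUMED rule not taken from the transcript:
    drop rows that have any empty field and drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Skip rows where any field is missing or blank.
        if any(value is None or value.strip() == "" for value in row.values()):
            continue
        # Skip rows already seen; the row's items serve as a hashable key.
        key = tuple(row.items())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = import_rows(io.StringIO(RAW_CSV))
cleaned = clean_rows(rows)
print(f"imported {len(rows)} rows, kept {len(cleaned)} after cleaning")
```

With a real file, the `io.StringIO` wrapper would be replaced by an open file handle, for example `open(path, newline="")`, where `path` is whatever location your data lives at.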