From the course: Python for Data Science and Machine Learning Essential Training Part 1
Transforming data set distributions
- [Instructor] The term data transformation refers to the practice of changing data from its original state into a different format. This often includes turning raw data into a format that is clean and ready for use. In this coding demonstration, we're going to explore a variety of beneficial data transformations and look into the scenarios in which they're necessary. We'll focus on two specific data transformation techniques: normalization and standardization. Normalization, also known as min-max scaling, is a method where data values are shifted and scaled to fall within a range of zero to one. This technique preserves the shape of the original distribution and the relative spacing between values; only the range changes. Standardization, on the other hand, is a technique that rescales data so that it has a mean of zero and a standard deviation of one. This centers and scales the data, but it doesn't change the shape of the distribution, so it won't make skewed data normally distributed. Keep in mind that in machine learning, not every data set necessitates normalization. It's only required…
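To make the two definitions above concrete, here is a minimal sketch in plain Python. It is not the code from the course demonstration; the function names `min_max_normalize` and `standardize`, the sample values, and the handling of constant input are all assumptions made for illustration. The standardization here uses the population standard deviation, which is also the convention followed by scikit-learn's `StandardScaler`.

```python
from statistics import fmean, pstdev


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption for this sketch: constant input has no spread,
        # so every value is mapped to 0.0 instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def standardize(values):
    """Rescale values to have mean 0 and (population) standard deviation 1."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # Same assumption as above for constant input.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical sample data, chosen only to keep the arithmetic easy to follow.
sample = [2.0, 4.0, 6.0, 8.0, 10.0]

normalized = min_max_normalize(sample)
standardized = standardize(sample)

print(normalized)    # [0.0, 0.25, 0.5, 0.75, 1.0]
print(standardized)  # symmetric around 0, since the sample is evenly spaced
```

Note that the evenly spaced sample stays evenly spaced after both transformations: each one is a shift followed by a scale, so the shape of the distribution is unchanged. In practice, scikit-learn's `MinMaxScaler` and `StandardScaler` do the same thing column by column on a feature matrix.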