From the course: Python for Data Science and Machine Learning Essential Training Part 1


Transforming data set distributions

- [Instructor] The term data transformation refers to the practice of changing data from its original state into a different format. This often includes turning raw data into a format that is clean and ready for use. In this coding demonstration, we're going to explore a variety of beneficial data transformations and look at the scenarios in which they're necessary. We'll focus on two specific data transformation techniques: normalization and standardization. Normalization, also known as min-max scaling, is a method where data values are rescaled to fall within a range of zero to one. This technique preserves the shape of the original distribution; only the scale of the values changes. Standardization, on the other hand, is a technique that rescales data so that it has a mean of zero and a standard deviation of one. This centers and scales the data, but it does not by itself make the distribution normal. Keep in mind that in machine learning, not every data set requires normalization. It's only required…
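
The two techniques described in the transcript can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the course demonstration: the function names `min_max_scale` and `standardize` and the sample values are assumptions made up for this example, and the population standard deviation is assumed for standardization.

```python
from statistics import fmean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined when all values are equal")
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        raise ValueError("standardization is undefined when all values are equal")
    return [(v - mean) / std for v in values]


# Hypothetical sample data, chosen only for illustration.
data = [2.0, 4.0, 6.0, 8.0, 10.0]

normalized = min_max_scale(data)    # [0.0, 0.25, 0.5, 0.75, 1.0]
standardized = standardize(data)    # centered on 0 with unit spread

print(normalized)
print(standardized)
```

Both functions apply a linear rescaling, so the relative spacing of the values is unchanged; only the location and scale differ. In practice, scikit-learn's `MinMaxScaler` and `StandardScaler` provide the same transformations for whole feature matrices.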
