From the course: Machine Learning and AI Foundations: Prediction, Causation, and Statistical Inference


Train/Test: What can go wrong?
- When folks are trained in statistics first and then transition into machine learning, they often ask, "Where are all the p-values?" P-values don't really work when it comes to machine learning. Although Jacob Cohen's critique was aimed at statistics rather than machine learning, it makes clear that p-values cannot protect us at scale. Once you have dozens or hundreds of variables, p-values simply break down: the risk of at least one false positive becomes so high as to be a near certainty. So we need another way, and if you've done any machine learning modeling, you're probably already familiar with the alternative. You randomly divide your data into two halves. Sometimes you'll meet folks who prefer three segments, but it's most often two, and this is called holdout validation. Then you build the model on one half…
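The two ideas in the passage can be sketched numerically. This is an illustrative sketch, not code from the course: it first shows why p-values fail at scale (the chance of at least one false positive among k independent tests at alpha = 0.05 is 1 - (1 - alpha)^k), then implements a minimal random holdout split using only the standard library.

```python
import random

# At significance level alpha, the chance of at least one false positive
# across k independent tests is 1 - (1 - alpha) ** k.
alpha = 0.05
for k in (1, 20, 100):
    p_any_false_positive = 1 - (1 - alpha) ** k
    print(f"{k:4d} variables -> P(at least one false positive) = {p_any_false_positive:.3f}")

def holdout_split(rows, train_frac=0.5, seed=42):
    """Randomly divide rows into a training half and a holdout (test) half."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = rows[:]              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

data = list(range(1000))            # stand-in for your dataset's rows
train, test = holdout_split(data)
print(len(train), len(test))        # two halves: 500 and 500
```

With 100 variables the false-positive probability exceeds 0.99, which is the "certainty" the transcript refers to; the holdout split sidesteps p-values entirely by checking whether the model's performance survives on data it never saw.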
