From the course: Intro to Snowflake for Devs, Data Scientists, Data Engineers

Snowpark ML modeling: Part 3

- Now we're ready to split our data into a training set and a test set and move on to our final objective: training our model and seeing how well it performs. For those of you who don't have a background in machine learning, splitting our data into a training set and a test set just means that we're holding back some of the data so that it's not used to train the model. The data we hold back is what we call the test set. This is important because a clean test set lets us check how good our model is by having it make predictions on the test set and seeing how correct those predictions are. We don't want to train the model on the same data we're going to test it on, because that's like letting it prepare for the test by giving it the answers in advance. So we're getting this random split from Snowpark ML, even though the underlying functionality was adopted from scikit-learn. We'll set aside 10% of the data, so two of our 20 years' worth of truck location data, for testing. Great, now we'll…
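
To make the idea of a held-back test set concrete, here is a minimal sketch of a 90/10 random split. It uses only Python's standard library and is not the Snowpark ML call used in the course. The helper name `train_test_split_rows`, the seed, and the list of 20 yearly records are all hypothetical stand-ins for the truck location data.

```python
import random


def train_test_split_rows(rows, test_fraction=0.1, seed=0):
    """Shuffle the rows and hold back `test_fraction` of them as a test set.

    Hypothetical helper for illustration only; not part of Snowpark ML.
    """
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


# Assumed stand-in data: one record per year, 20 years in total.
years = list(range(2004, 2024))

train_years, test_years = train_test_split_rows(years, test_fraction=0.1, seed=42)

print(len(train_years), len(test_years))  # 18 2
```

The model would be trained only on `train_years` and evaluated only on `test_years`, so no record it is tested on was seen during training. Fixing the seed is a design choice that makes the split reproducible from run to run.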
