# SECTION 2
* You need to split the data into a training set and a test set so the model is evaluated on data it has not seen (you train on the training set and test those assumptions on the test set).

? What is categorical data, and why would you use it?
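A minimal sketch of the train/test split from the note above, using only the Python standard library. The helper names `split_train_test` and `one_hot`, the 80/20 ratio, and the toy `rows` data are all assumptions made for illustration. The sketch also touches on the categorical-data question: categorical values are labels from a fixed set (here, colours) rather than numbers, and one common way to feed them to a model is to turn each label into 0/1 indicator columns.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                      # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def one_hot(value, categories):
    """Encode one categorical value as a list of 0/1 indicators."""
    return [1 if value == category else 0 for category in categories]


# Hypothetical toy data: (colour, label) pairs. "colour" is categorical.
rows = [
    ("red", 1), ("green", 0), ("blue", 0), ("red", 1), ("green", 0),
    ("blue", 1), ("red", 0), ("green", 1), ("blue", 0), ("red", 1),
]

# Train on `train` only; keep `test` aside for the final check.
train, test = split_train_test(rows, test_fraction=0.2, seed=42)

# Fix the category order once so every row is encoded the same way.
categories = sorted({colour for colour, _ in rows})
encoded_train = [(one_hot(colour, categories), label) for colour, label in train]
```

The key point is that the test rows never take part in training, so the score measured on them reflects how the model handles unseen data. In practice a library helper such as scikit-learn's `train_test_split` does the same job.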