You need lots of data. You split the data into two parts. One part is for training. The other part is for testing. The model learns patterns from the training data only. Then you compare its answers on the test data, which it never saw during training, with the correct answers. That tells you how well it works on new examples.
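Here is a small sketch of that split in plain Python. Everything in it is made up for illustration: the toy data, the helper names (`train_test_split`, `fit_threshold`, `accuracy`), the 80/20 split, and the very simple "model" that only learns one cut-off number. Real projects would normally use a library for this.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """'Learn' a cut-off: the midpoint between the two class means."""
    zeros = [x for x, label in train if label == 0]
    ones = [x for x, label in train if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, rows):
    """Fraction of rows where the predicted label matches the true label."""
    correct = sum(1 for x, label in rows if int(x >= threshold) == label)
    return correct / len(rows)

# Hypothetical toy data: the label is 1 when x is 50 or more.
data = [(x, int(x >= 50)) for x in range(100)]

# One part for training, the other part for testing.
train, test = train_test_split(data, test_fraction=0.2, seed=0)

# The model only ever sees the training part.
threshold = fit_threshold(train)

# The test part is used only to check the answers.
test_accuracy = accuracy(threshold, test)
print("train size:", len(train), "test size:", len(test))
print("test accuracy:", test_accuracy)
```

The shuffle matters: if the data were split in its original order, the test part here would hold only large values of `x` and would not look like the training part.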