The difference shows up when the data you pass as `x` is larger than one batch.

`predict` goes through all the data, batch by batch, predicting labels. It internally splits the input into batches and feeds the network one batch at a time.
`predict_on_batch`, on the other hand, assumes the data you pass in is exactly one batch and feeds it to the network as-is. It won't try to split it (which, depending on your setup, can be problematic for GPU memory if the array is very large).