How to get a non-shuffled train_test_split in sklearn

I’m not adding much to Psidom’s answer, just an easy-to-copy-paste function:

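import numpy as np

def non_shuffling_train_test_split(X, y, test_size=0.2):
    # Split point: the first i rows go to train, the rest to test.
    # The test set gets test_size * n rows, rounded up.
    i = X.shape[0] - int(np.ceil(test_size * X.shape[0]))
    X_train, X_test = np.split(X, [i])
    y_train, y_test = np.split(y, [i])
    return X_train, X_test, y_train, y_test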
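
For plain Python lists, or just to see the split-point arithmetic without NumPy, the same idea can be sketched with slicing. This is a minimal illustration, not part of sklearn: the name ordered_split is hypothetical, and it assumes the test set size is rounded up.

```python
import math

def ordered_split(seq, test_size=0.2):
    """Split seq into (train, test) without shuffling (hypothetical helper)."""
    n = len(seq)
    n_test = math.ceil(test_size * n)  # test set size, rounded up
    # Everything before the split point is train, the tail is test.
    return seq[:n - n_test], seq[n - n_test:]

days = list(range(10))  # e.g. ten time-ordered observations
train, test = ordered_split(days, test_size=0.2)
print(train)  # [0, 1, 2, 3, 4, 5, 6, 7]
print(test)   # [8, 9]
```

Because the test set is always the tail of the sequence, this suits time-ordered data where the model should be evaluated on the most recent observations.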

Update:
This is now built into scikit-learn: train_test_split accepts shuffle=False, so you can do:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
