You can use pandas DataFrames or Series, as Julien said, but if you want to restrict yourself to NumPy, you can pass an additional array of indices to train_test_split:
from sklearn.model_selection import train_test_split
import numpy as np
n_samples, n_features, n_classes = 10, 2, 2
data = np.random.randn(n_samples, n_features)  # 10 samples, 2 features each
labels = np.random.randint(n_classes, size=n_samples)  # one class label per sample
indices = np.arange(n_samples)  # original row positions, split alongside data and labels
(
data_train,
data_test,
labels_train,
labels_test,
indices_train,
indices_test,
) = train_test_split(data, labels, indices, test_size=0.2)
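Because train_test_split applies the same shuffle to every array it receives, indices_train and indices_test hold the original row positions of the rows in each split. As a quick sanity check, here is a minimal sketch that repeats the setup above with a fixed seed (the random_state=0 and the rng generator are my additions for reproducibility, not part of the original snippet) and verifies that indexing the original arrays with the returned indices gives back the splits:

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_samples, n_features, n_classes = 10, 2, 2
rng = np.random.default_rng(0)  # seeded only so the example is reproducible
data = rng.standard_normal((n_samples, n_features))
labels = rng.integers(n_classes, size=n_samples)
indices = np.arange(n_samples)

(
    data_train,
    data_test,
    labels_train,
    labels_test,
    indices_train,
    indices_test,
) = train_test_split(data, labels, indices, test_size=0.2, random_state=0)

# Indexing the original arrays with the returned indices reproduces each split.
train_matches = np.array_equal(data[indices_train], data_train)
test_matches = np.array_equal(data[indices_test], data_test)
labels_match = np.array_equal(labels[indices_test], labels_test)

print("original positions of the test rows:", indices_test)
print("splits match original rows:", train_matches and test_matches and labels_match)
```

With test_size=0.2 and 10 samples, indices_test contains 2 positions and indices_train the remaining 8, so you can map any prediction on data_test back to its row in the original array.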