You can cluster spatial latitude-longitude data with scikit-learn’s DBSCAN without precomputing a distance matrix:
db = DBSCAN(eps=2/6371., min_samples=5, algorithm='ball_tree', metric="haversine").fit(np.radians(coordinates))
This comes from a tutorial on clustering spatial data with scikit-learn’s DBSCAN. In particular, notice that the eps
value is still 2 km, but it’s divided by 6371 (the Earth’s mean radius in km) to convert it to radians. Also, notice that .fit()
takes the coordinates in radians, as [latitude, longitude] pairs, because that is the unit and order the haversine metric expects.
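To see why dividing by 6371 works, here is a minimal sketch of the haversine formula in plain Python. It is not scikit-learn's implementation; the function name haversine_radians and the two sample points are made up for illustration. The formula returns an angle in radians, so multiplying by the Earth's radius gives kilometres, and dividing a distance in kilometres by the radius gives an eps in radians.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, the same constant used for eps


def haversine_radians(p, q):
    """Great-circle angle (in radians) between two (lat, lon) points given in radians."""
    lat1, lon1 = p
    lat2, lon2 = q
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(a))


# Two hypothetical points 0.01 degrees of latitude apart, in (lat, lon) order.
a_deg = (48.8566, 2.3522)
b_deg = (48.8666, 2.3522)

# Convert degrees to radians, as np.radians(coordinates) does for DBSCAN.
a_rad = tuple(math.radians(v) for v in a_deg)
b_rad = tuple(math.radians(v) for v in b_deg)

dist_rad = haversine_radians(a_rad, b_rad)   # angle in radians
dist_km = dist_rad * EARTH_RADIUS_KM         # roughly 1.11 km

eps = 2 / EARTH_RADIUS_KM                    # 2 km expressed in radians
within_eps = dist_rad <= eps                 # True: the points are neighbours

print(f"{dist_km:.3f} km, within eps: {within_eps}")
```

With eps=2/6371., these two points would count as neighbours, since they are about 1.11 km apart, which is less than the 2 km radius.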