Compute pairwise distances in a batch without replicating the tensor in TensorFlow?

Question

You can use some linear algebra to turn this into matrix operations. What you need is the matrix D of squared distances, where a[i] is the ith row of your original matrix and

D[i,j] = (a[i]-a[j])(a[i]-a[j])'

You can rewrite that into

D[i,j] = r[i] - 2 a[i]a[j]' + r[j]

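where r[i] = a[i]a[i]' is the squared norm of the ith row of the original matrix. This follows from expanding the product: (a[i]-a[j])(a[i]-a[j])' = a[i]a[i]' - 2 a[i]a[j]' + a[j]a[j]'.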
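To sanity-check the identity before vectorizing it, here is a minimal pure-Python sketch that computes D both ways on the example matrix used later in this answer. The helper names (dot, sq_dist) and the variables direct and expanded are only illustrative and not part of any library.

```python
# Rows of the example matrix (same values as the TensorFlow snippet in this answer).
a = [[1, 1], [2, 2], [3, 3]]

def dot(u, v):
    """Inner product of two equal-length vectors (illustrative helper)."""
    return sum(x * y for x, y in zip(u, v))

def sq_dist(u, v):
    """Squared Euclidean distance, straight from the definition."""
    diff = [x - y for x, y in zip(u, v)]
    return dot(diff, diff)

n = len(a)

# r[i] is the squared norm of row i.
r = [dot(row, row) for row in a]

# Direct definition: D[i,j] = (a[i]-a[j])(a[i]-a[j])'
direct = [[sq_dist(a[i], a[j]) for j in range(n)] for i in range(n)]

# Expanded form: D[i,j] = r[i] - 2 a[i]a[j]' + r[j]
expanded = [[r[i] - 2 * dot(a[i], a[j]) + r[j] for j in range(n)]
            for i in range(n)]

print(direct == expanded)
```

Both forms agree on this input; the expanded form is the one that maps onto a single matrix product.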

In a system that supports standard broadcasting rules you can treat r as a column vector and write D as

D = r - 2 A A' + r'
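As an illustration of the broadcasting step, here is a small sketch in NumPy, which is an assumption of this example (the answer itself targets TensorFlow), chosen because it follows the same broadcasting rules. The column vector r has shape (n, 1), the product A A' has shape (n, n), and r' has shape (1, n), so the sum broadcasts to (n, n) without explicitly tiling A.

```python
import numpy as np

A = np.array([[1, 1], [2, 2], [3, 3]])

# Squared norm of each row, reshaped to a column vector of shape (3, 1).
r = np.sum(A * A, axis=1).reshape(-1, 1)

# (3, 1) - (3, 3) + (1, 3) broadcasts to (3, 3).
D = r - 2 * (A @ A.T) + r.T

print(D.shape)
print(D)
```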

In TensorFlow you could write this as

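import tensorflow as tf

A = tf.constant([[1, 1], [2, 2], [3, 3]])
# squared norm of each row
r = tf.reduce_sum(A*A, 1)

# turn r into column vector
r = tf.reshape(r, [-1, 1])
# r broadcasts across columns, tf.transpose(r) across rows
D = r - 2*tf.matmul(A, tf.transpose(A)) + tf.transpose(r)
# TensorFlow 1.x graph mode; under TensorFlow 2.x eager execution, use D.numpy() instead
sess = tf.Session()
sess.run(D)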

Result:

array([[0, 2, 8],
       [2, 0, 2],
       [8, 2, 0]], dtype=int32)
