Calculating Euclidean Distance with NumPy

In mathematics, the Euclidean distance is the shortest distance between two points, that is, the length of the straight line segment joining them. We can calculate it from the Cartesian coordinates of the points by applying the Pythagorean theorem, which is why the Euclidean distance is also occasionally called the Pythagorean distance. In this article, you will learn different ways of finding the Euclidean distance with the NumPy library, along with an alternative from Python's standard math module.
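
As a minimal sketch of the formula behind this definition: the distance between two points is the square root of the sum of their squared coordinate differences. The helper name euclidean below is hypothetical, chosen only for illustration, and it uses plain Python rather than NumPy.

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Differences are (0, 3, 4), so the squared sum is 0 + 9 + 16 = 25.
print(euclidean((2, 4, 6), (2, 1, 2)))  # 5.0
```

The methods that follow compute this same quantity, only with library functions doing the subtraction, squaring and summing.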

Different ways of Calculating Euclidean Distance:

Finding the Euclidean distance with a Python program is easy and saves time, and there is more than one way to do it. Here we will discuss four different approaches.

Method 1: Using the dot() method:

Python's NumPy module comes with many functions for mathematical and vector calculations. One of them is the dot() function, which computes the dot product of two NumPy arrays. If both arrays 'x' and 'y' are one-dimensional, dot() returns their inner product, which is what we need here. If both are two-dimensional, dot() performs matrix multiplication instead.
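
The difference between the one-dimensional and two-dimensional behaviour of dot() can be seen in a small sketch. The arrays u, v, a and b are made-up illustration values and do not come from the article.

```python
import numpy as np

# 1-D arrays: dot() returns the inner product as a scalar.
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
inner = np.dot(u, v)          # 1*4 + 2*5 + 3*6 = 32

# 2-D arrays: dot() performs matrix multiplication.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
product = np.dot(a, b)        # same result as a @ b

print(inner)
print(product)
```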

Program:

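import numpy as np

# Two points in 3-D space
firstPoint = np.array((2, 4, 6))
secPoint = np.array((2, 1, 2))

# Difference vector between the points
leng = firstPoint - secPoint

# Dot product of the difference with itself = sum of squared differences
sumSq = np.dot(leng.T, leng)
print('Euclidean Distance: ', np.sqrt(sumSq))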

Output:

Euclidean Distance:  5.0

Explanation:

Here, we import the NumPy module under the alias np. Next, we create two NumPy arrays (firstPoint and secPoint) with three values each. We then subtract secPoint from firstPoint and store the result in leng. After that, we call np.dot() with leng.T (the transpose) and leng as its two arguments to compute the dot product. Because leng is a one-dimensional array, the transpose leaves it unchanged, so this is the dot product of leng with itself, i.e. the sum of the squared differences.

We store the result of this dot product in the sumSq variable. Finally, we use np.sqrt() to calculate the square root of sumSq and display it using print().
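
The role of the transpose can be checked with a short sketch. It assumes the same difference vector as the program above; the name col is hypothetical and stands for an explicit column vector, where the transpose does matter.

```python
import numpy as np

leng = np.array([0, 3, 4])               # 1-D array, shape (3,)
assert leng.T.shape == leng.shape        # transposing a 1-D array changes nothing
same = np.dot(leng, leng)                # identical to np.dot(leng.T, leng)

col = leng.reshape(3, 1)                 # explicit column vector, shape (3, 1)
as_matrix = np.dot(col.T, col)           # (1, 3) times (3, 1) gives a 1x1 array

print(same, as_matrix.shape, as_matrix[0, 0])
```

With a 1-D array the result is a scalar. With a column vector it is a 1x1 array, from which the value has to be extracted.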

Method 2: Using linalg.norm() Method:

linalg.norm() is another NumPy function. Depending on the value of its ord parameter, it can return one of eight different matrix norms or one of an infinite number of vector norms.

The syntax is:

numpy.linalg.norm(x, ord=None, axis=None, keepdims=False), where
  • x: The input array.
  • ord: Short for "order"; it selects which norm is computed. For a 1-D input, the default (None) gives the 2-norm, which is the Euclidean length. For a 2-D input, some of the available values are:
      ord     Norm computed for a matrix
      None    Frobenius norm
      'fro'   Frobenius norm
      'nuc'   nuclear norm
      inf     max(sum(abs(x), axis=1))
      -inf    min(sum(abs(x), axis=1)), etc.
  • axis: If axis is an integer, it names the axis of x along which the vector norms are computed. If axis is a 2-tuple, it names the two axes that hold 2-D matrices, and the matrix norms of those matrices are computed. If axis is None, a vector norm is returned for a 1-D input and a matrix norm for a 2-D input.
  • keepdims: It accepts a Boolean value. If it is True, the axes that are normed over are kept in the result as dimensions of size one. Otherwise, those axes are removed from the result.
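
To illustrate how ord changes the result for a vector, here is a small sketch using the difference of the two points from this article. The variable names are hypothetical. The names Manhattan and Chebyshev for the 1-norm and the infinity-norm are standard terms that the article itself does not use.

```python
import numpy as np

diff = np.array([2, 4, 6]) - np.array([2, 1, 2])    # [0, 3, 4]

euclid = np.linalg.norm(diff)                 # default ord=None: 2-norm for 1-D input
manhattan = np.linalg.norm(diff, ord=1)       # sum of absolute differences
chebyshev = np.linalg.norm(diff, ord=np.inf)  # largest absolute difference

print(euclid, manhattan, chebyshev)
```

Only the default (or ord=2) gives the Euclidean distance; the other values measure the same difference vector in other ways.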

Program:

import numpy as np

firstPoint = np.array((2, 4, 6))
secPoint = np.array((2, 1, 2))

# With the default ord, norm() of a 1-D array is its Euclidean length
leng = np.linalg.norm(firstPoint - secPoint)
print('Euclidean Distance: ', leng)

Output:

Euclidean Distance:  5.0

Explanation:

Here, we import the NumPy module under the alias np. Next, we create two NumPy arrays (firstPoint and secPoint) with three values each. Then, we call np.linalg.norm() and pass it the difference between firstPoint and secPoint as its argument. We store the result in a variable named leng. Lastly, we print leng using the print() function.
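
The axis and keepdims parameters become useful when several pairs of points are stored as rows of 2-D arrays. The following is a sketch under that assumption. The arrays first and second are made-up illustration data, with one point per row.

```python
import numpy as np

# Each row is one point; both arrays have shape (3, 3).
first = np.array([[2, 4, 6], [0, 0, 0], [1, 1, 1]])
second = np.array([[2, 1, 2], [3, 4, 0], [1, 1, 1]])

# axis=1 computes one vector norm per row, i.e. one distance per pair.
dists = np.linalg.norm(first - second, axis=1)                 # shape (3,)

# keepdims=True keeps the normed axis as a dimension of size one.
kept = np.linalg.norm(first - second, axis=1, keepdims=True)   # shape (3, 1)

print(dists)
print(kept.shape)
```

Without axis=1, the 2-D difference would be treated as a matrix and a single Frobenius norm would be returned instead of one distance per row.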

Method 3: Using sum() and square() combined:

sum() and square() are two commonly used functions of the NumPy module. sum() adds up all the elements of the array passed to it, while square() squares each element of the array passed to it.

Program:

import numpy as np

firstPoint = np.array((2, 4, 6))
secPoint = np.array((2, 1, 2))

# Square each coordinate difference, then add the squares together
sumSq = np.sum(np.square(firstPoint - secPoint))
print('Euclidean Distance: ', np.sqrt(sumSq))

Output:

Euclidean Distance:  5.0

Explanation:

Here, we import the NumPy module under the alias np. Next, we create two NumPy arrays (firstPoint and secPoint) with three values each. Then, we call sum() and, inside it, square(), which receives the result of subtracting secPoint from firstPoint; the sum of the squared differences is stored in the sumSq variable. Finally, we use np.sqrt() to calculate the square root of sumSq and display it using print().
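
To make the nested call easier to follow, the same computation can be split into one step per line. This is only an unrolled sketch of the program above; the intermediate names diff, squared and distance are introduced here for illustration.

```python
import numpy as np

firstPoint = np.array((2, 4, 6))
secPoint = np.array((2, 1, 2))

diff = firstPoint - secPoint     # element-wise difference: [0 3 4]
squared = np.square(diff)        # element-wise square:     [0 9 16]
sumSq = np.sum(squared)          # single total:            25
distance = np.sqrt(sumSq)        # square root of the total

print(diff, squared, sumSq, distance)
```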

Method 4: Using the math.dist() function:

We can also use Python's built-in math module as an alternative to all the above techniques. The math module has the dist() function (available in Python 3.8 and later), which returns the Euclidean distance between two points, each given as a sequence of coordinates of the same length.

Program:

from math import dist  # available in Python 3.8 and later

firstPoint = (2, 4, 6)
secPoint = (1, 1, 1)

print('Euclidean Distance: ', dist(firstPoint, secPoint))

Output:

Euclidean Distance:  5.916079783099616

Explanation:

Here, we explicitly import dist from the math module. Then, we create two tuples with three values each. Next, we call the dist() function and pass it the two tuples, which are the two points whose Euclidean distance we want. Finally, we display that Euclidean distance using print().
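
Two properties of dist() are worth a quick sketch: it accepts lists as well as tuples, and both points must have the same number of coordinates. The mismatched second call is a deliberately wrong input added for illustration, and the name mismatch_error is hypothetical.

```python
from math import dist

# Lists work as well as tuples; the values mirror the program above.
print(dist([2, 4, 6], [1, 1, 1]))

# Points with a different number of coordinates are rejected.
try:
    dist((2, 4, 6), (1, 1))
except ValueError as err:
    mismatch_error = str(err)
    print('ValueError:', mismatch_error)
```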

Conclusion:

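Among all these, the fourth method is the simplest way of finding the Euclidean distance between a single pair of points: math.dist() works directly on tuples, so no third-party library and no array creation is needed, and the math module is easy to use. All four methods do work proportional to the number of coordinates, so none has a lower time complexity than the others; any speed difference on small inputs comes from overhead such as building NumPy arrays. When the points are already stored in NumPy arrays, or when many distances are needed at once, the NumPy techniques are the natural choice. Some statisticians, data analysts, and programmers prefer the dot() method (first technique), while others go with the second and third techniques.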
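
Readers who want to compare the four techniques on their own machine could start from a sketch like the one below. The methods dictionary and the repetition count of 10,000 are arbitrary choices made for illustration. Timings depend on hardware, Python and NumPy versions and input size, so no particular ranking is claimed here.

```python
import timeit
from math import dist

import numpy as np

p, q = (2, 4, 6), (2, 1, 2)
a, b = np.array(p), np.array(q)

# One callable per technique discussed in this article.
methods = {
    'dot': lambda: np.sqrt(np.dot(a - b, a - b)),
    'linalg.norm': lambda: np.linalg.norm(a - b),
    'sum/square': lambda: np.sqrt(np.sum(np.square(a - b))),
    'math.dist': lambda: dist(p, q),
}

for name, fn in methods.items():
    seconds = timeit.timeit(fn, number=10_000)
    print(f'{name:12s} result={fn():.1f}  time={seconds:.4f}s')
```

All four callables return the same distance for the same pair of points; only the measured time differs.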