I am not sure whether "norm" and "Euclidean distance" mean the same thing. Could you please help me with this distinction?
I have an n by m array a, where m > 3. I want to calculate the Euclidean distance between the second data point a[1,:] and all the other points (including itself). So I used np.linalg.norm on the difference between each pair of points. But I don't know if this is the right way of getting the Euclidean distances.
import numpy as np

a = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 3], [3, 5, 1, 5]])
N = a.shape[0]  # number of rows
pos = a[1, :]  # pick out the second data point
dist = np.zeros((N, 1), dtype=np.float64)
for i in range(N):
    # norm of the difference between row i and the second data point
    dist[i] = np.linalg.norm(a[i, :] - pos)
2 Answers
#1
16
A norm is a function that takes a vector as an input and returns a scalar value that can be interpreted as the "size", "length" or "magnitude" of that vector. More formally, norms are defined as having the following mathematical properties:
- They scale multiplicatively, i.e. Norm(a·v) = |a|·Norm(v) for any scalar a
- They satisfy the triangle inequality, i.e. Norm(u + v) ≤ Norm(u) + Norm(v)
- The norm of a vector is zero if and only if it is the zero vector, i.e. Norm(v) = 0 ⇔ v = 0
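The three properties above can be spot-checked numerically. This is only a sketch for the Euclidean norm on two random vectors, not a proof; the seed, the vector length and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
u = rng.standard_normal(5)
v = rng.standard_normal(5)
scalar = -3.0

# 1. Scaling: Norm(a*v) == |a| * Norm(v)
homogeneous = np.isclose(np.linalg.norm(scalar * v),
                         abs(scalar) * np.linalg.norm(v))

# 2. Triangle inequality: Norm(u + v) <= Norm(u) + Norm(v)
#    (tiny tolerance added to absorb floating-point rounding)
triangle = np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12

# 3. The zero vector has norm zero; a non-zero vector does not
zero_norm = np.linalg.norm(np.zeros(5))
nonzero_norm = np.linalg.norm(v)

print(homogeneous, triangle, zero_norm == 0.0, nonzero_norm > 0.0)
```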
The Euclidean norm (also known as the L² norm) is just one of many different norms - there is also the max norm, the Manhattan norm etc. The L² norm of a single vector is equivalent to the Euclidean distance from that point to the origin, and the L² norm of the difference between two vectors is equivalent to the Euclidean distance between the two points.
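As a small illustration of the paragraph above, the sketch below takes two rows from the question's array and compares the Euclidean, Manhattan and max norms of their difference. The ord values used (default, 1 and np.inf) are the ones np.linalg.norm documents for vectors; math.dist (Python 3.8+) is used only as an independent check of the distance to the origin.

```python
import math
import numpy as np

x = np.array([1.0, 1.0, 1.0, 1.0])  # row a[1] from the question
y = np.array([3.0, 5.0, 1.0, 5.0])  # row a[3] from the question
diff = x - y                         # [-2, -4, 0, -4]

euclidean = np.linalg.norm(diff)              # sqrt(4 + 16 + 0 + 16) = 6
manhattan = np.linalg.norm(diff, ord=1)       # 2 + 4 + 0 + 4 = 10
maximum = np.linalg.norm(diff, ord=np.inf)    # max(2, 4, 0, 4) = 4

# The L2 norm of a single vector is its Euclidean distance to the origin.
to_origin = np.linalg.norm(y)
check = math.dist(y.tolist(), [0.0, 0.0, 0.0, 0.0])

print(euclidean, manhattan, maximum, to_origin, check)
```

All three values are valid "lengths" of the same difference vector; only the first one is the Euclidean distance between the two points.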
As @nobar's answer says, np.linalg.norm(x - y, ord=2) (or just np.linalg.norm(x - y)) will give you the Euclidean distance between the vectors x and y.
Since you want to compute the Euclidean distance between a[1, :] and every other row in a, you could do this a lot faster by eliminating the for loop and broadcasting over the rows of a:
dist = np.linalg.norm(a[1:2] - a, axis=1)
It's also easy to compute the Euclidean distance yourself using broadcasting:
dist = np.sqrt(((a[1:2] - a) ** 2).sum(1))
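To make the broadcasting step more concrete, here is a sketch on the small array from the question that prints the intermediate shapes and compares both vectorized versions against the original loop. The variable names (row, diff, dist_loop and so on) are made up for this illustration.

```python
import numpy as np

a = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 3], [3, 5, 1, 5]])

row = a[1:2]    # shape (1, 4): slicing keeps the row as a 2-D array
diff = row - a  # (1, 4) broadcasts against (4, 4), giving (4, 4)
print(row.shape, diff.shape)

# One norm per row (axis=1), so the result has shape (4,)
dist_broadcast = np.linalg.norm(diff, axis=1)
dist_manual = np.sqrt((diff ** 2).sum(axis=1))

# Reference: the explicit loop from the question
dist_loop = np.array([np.linalg.norm(a[i] - a[1]) for i in range(a.shape[0])])

print(dist_broadcast)
```

The sign of the difference does not matter here, since every component is squared (or its absolute value taken) before summing.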
The fastest method is probably scipy.spatial.distance.cdist:
from scipy.spatial.distance import cdist
dist = cdist(a[1:2], a)[0]
Some timings for a (1000, 1000) array:
a = np.random.randn(1000, 1000)
%timeit np.linalg.norm(a[1:2] - a, axis=1)
# 100 loops, best of 3: 5.43 ms per loop
%timeit np.sqrt(((a[1:2] - a) ** 2).sum(1))
# 100 loops, best of 3: 5.5 ms per loop
%timeit cdist(a[1:2], a)[0]
# 1000 loops, best of 3: 1.38 ms per loop
# check that all 3 methods return the same result
d1 = np.linalg.norm(a[1:2] - a, axis=1)
d2 = np.sqrt(((a[1:2] - a) ** 2).sum(1))
d3 = cdist(a[1:2], a)[0]
assert np.allclose(d1, d2) and np.allclose(d1, d3)
#2
3
The concept of a "norm" is a generalized idea in mathematics which, when applied to vectors (or vector differences), broadly represents some measure of length. There are various ways to compute a norm; the one that gives the Euclidean distance is called the "2-norm". It raises each component to the power 2 (the "square"), sums the results, and then raises the sum to the power 1/2 (the "square root").
It's a bit cryptic in the docs, but you get the Euclidean distance between two vectors by setting the parameter ord=2.
sum(abs(x)**ord)**(1./ord)
becomes sqrt(sum(x**2)).
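The formula above can be written out by hand and compared against np.linalg.norm. The helper p_norm below is a hypothetical name introduced only for this sketch; it assumes a 1-D real input and a finite p >= 1.

```python
import numpy as np

def p_norm(x, p):
    """Hand-written p-norm of a 1-D array (hypothetical helper, finite p >= 1)."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([1.0, 1.0, 1.0, 2.0])

by_formula = p_norm(x, 2)             # (1 + 1 + 1 + 4) ** 0.5
by_sqrt = np.sqrt(np.sum(x ** 2))     # the simplified form for p = 2
by_numpy = np.linalg.norm(x, ord=2)

print(by_formula, by_sqrt, by_numpy)
```

For p = 2 the absolute value is redundant for real inputs because squaring already removes the sign, which is why the expression collapses to sqrt(sum(x**2)).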
Note: as pointed out by @Holt, the default value is ord=None, which is documented to compute the "2-norm" for vectors. This is, therefore, equivalent to ord=2 (Euclidean distance).
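One caveat worth checking: the equivalence between ord=None and ord=2 holds for 1-D input. For a 2-D array passed without an axis, np.linalg.norm treats the input as a matrix, where the default is the Frobenius norm and ord=2 is the largest singular value. The sketch below, using the array from the question, shows the difference and why axis=1 is needed to get one distance per row.

```python
import numpy as np

a = np.array([[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 3], [3, 5, 1, 5]], dtype=float)
diff = a - a[1]  # each row minus the second row

# 1-D input: the default (ord=None) and ord=2 agree
vec_default = np.linalg.norm(diff[3])
vec_two = np.linalg.norm(diff[3], ord=2)

# 2-D input, no axis: these are matrix norms and generally differ
mat_default = np.linalg.norm(diff)       # Frobenius norm of the whole matrix
mat_two = np.linalg.norm(diff, ord=2)    # largest singular value

# axis=1: each row is treated as a vector, giving one distance per row
row_dists = np.linalg.norm(diff, axis=1)

print(vec_default, vec_two, mat_default, mat_two, row_dists)
```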