匹配集合x,y指向另一个被缩放、旋转、翻译和缺少元素的集合。

(Why am I doing this? See explanation below)

(我为什么要这么做?)见下面的解释)

Consider two sets of points, A and B as shown below

考虑两个点集，A和B，如下所示

It might not look like it, but set A is "hidden" within set B. It can not be easily seen because points in B are scaled, rotated, and translated in (x, y) with respect to A. Even worse, some points that are present in A are missing in B, and B contains many points that are not in A.

看起来可能不喜欢它,但是设置一个“隐藏”在B .它不能轻易看到因为点B是缩放,旋转,和翻译(x,y)对A。更糟糕的是,一些点在B的人失踪,和B包含很多点都没有。

I need to find the appropriate scaling, rotation, and translation that must be applied to the B set in order to match it with set A. In the case shown above, the correct values are:

我需要找到必须应用于B集的适当的缩放、旋转和平移，以便与集合a匹配。在上面的例子中，正确的值是:

scale = 0.14, rot_angle = 0.0, x_transl = 35.0, y_transl = 2.0

which produces the (good enough) match

哪个产生(足够好的)匹配

(in red, only the matched B points are shown; these are located in the sector 1000<x<2000, y~2000 in the first figure to the right). Given the many degrees of freedom (DoF: scaling + rotation + 2D translation) I am aware of the possibility of a miss-match, but the coordinates of the points are not random (although they may look like it) so the probability of this happening is very small.

(用红色表示，只显示匹配的B点;它们位于第一个图右边的1000

The code I wrote (see below) uses brute force to loop through all possible DoF values taken from pre-defined ranges for each one. The core of the code is based on minimizing the distance of each point in A to any point in B

我编写的代码(参见下面)使用蛮力来遍历所有可能的DoF值，这些值都是从预定义的范围中提取出来的。代码的核心是最小化A点到B点的距离

The code works (it actually generated the solution mentioned above), but since the number of solutions (i.e., the combinations of accepted values for each DoF) scales with larger ranges, it can become unacceptably slow pretty quick (also it eats up all the RAM in my system)

代码可以工作(它实际上生成了上面提到的解决方案)，但是由于解决方案的数量(例如。，对于每个DoF的可接受值的组合)具有更大的范围，它会变得非常快，慢得令人无法接受(而且它会消耗我系统中的所有RAM)

How can I improve the performance of the code? I'm open to any solution including numpy and/or scipy. Perhaps something like Basing-Hopping to search for the best match (or a relatively close one) instead of the brute force method I currently employ?

如何改进代码的性能?我愿意接受任何解决方案，包括numpy和/或scipy。也许是像跳基底——寻找最好的匹配(或者一个相对接近的)而不是我目前使用的蛮力方法?

import numpy as np
from scipy.spatial import distance
import math


def scalePoints(B_center, delta_x, delta_y, scale):
    """
    Scales xy points.

    http://codereview.stackexchange.com/q/159183/35351
    """
    x_scale = B_center[0] - scale * delta_x
    y_scale = B_center[1] - scale * delta_y

    return x_scale, y_scale


def rotatePoints(center, x, y, angle):
    """
    Rotates points in 'xy' around 'center'. Angle is in degrees.
    Rotation is counter-clockwise

    http://*.com/a/20024348/1391441
    """
    angle = math.radians(angle)
    xy_rot = x - center[0], y - center[1]
    xy_rot = (xy_rot[0] * math.cos(angle) - xy_rot[1] * math.sin(angle),
              xy_rot[0] * math.sin(angle) + xy_rot[1] * math.cos(angle))
    xy_rot = xy_rot[0] + center[0], xy_rot[1] + center[1]

    return xy_rot


def distancePoints(set_A, x_transl, y_transl):
    """
    Find the sum of the minimum distance of points in set_A to points in set_B.
    """
    d = distance.cdist(set_A, zip(*[x_transl, y_transl]), 'euclidean')
    # Sum of all minimal distances.
    d_sum = np.sum(np.min(d, axis=1))

    return d_sum


def match_frames(
        set_A, B_center, delta_x, delta_y, tol, sc_min, sc_max, sc_step,
        ang_min, ang_max, ang_step, xmin, xmax, xstep, ymin, ymax, ystep):
    """
    Process all possible solutions in the defined ranges.
    """
    N = len(set_A)
    # Ranges
    sc_range = np.arange(sc_min, sc_max, sc_step)
    ang_range = np.arange(ang_min, ang_max, ang_step)
    x_range = np.arange(xmin, xmax, xstep)
    y_range = np.arange(ymin, ymax, ystep)
    print("Total solutions: {:.2e}".format(
          np.prod([len(_) for _ in [sc_range, ang_range, x_range, y_range]])))

    d_sum, params_all = [], []
    for scale in sc_range:
        # Scaled points.
        x_scale, y_scale = scalePoints(B_center, delta_x, delta_y, scale)
        for ang in ang_range:
            # Rotated points.
            xy_rot = rotatePoints(B_center, x_scale, y_scale, ang)
            # x translation
            for x_tr in x_range:
                x_transl = xy_rot[0] + x_tr
                # y translation
                for y_tr in y_range:
                    y_transl = xy_rot[1] + y_tr

                    # Find minimum distance sum.
                    d_sum.append(distancePoints(set_A, x_transl, y_transl))

                    # Store solutions.
                    params_all.append([scale, ang, x_tr, y_tr])

                    # Condition to break out if given tolerance for match
                    # is achieved.
                    if d_sum[-1] <= tol * N:
                        print("Match found:", scale, ang, x_tr, y_tr)
                        i_min = d_sum.index(min(d_sum))
                        return i_min, params_all

        # Print best solution found so far.
        i_min = d_sum.index(min(d_sum))
        print("d_sum_min = {:.2f}".format(d_sum[i_min]))

    return i_min, params_all


# Data.
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
         [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [
    [2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
     620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
    [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
     3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]

# This is necessary to apply the scaling.
x, y = np.asarray(set_B)
# Center of B points, defined as the center of the minimal rectangle that
# contains all points.
B_center = [(min(x) + max(x)) * .5, (min(y) + max(y)) * .5]
# Difference between the center coordinates and the xy points.
delta_x, delta_y = B_center[0] - x, B_center[1] - y

# Tolerance in pixels for match.
tol = 1.
# Ranges for each DoF.
sc_min, sc_max, sc_step = .01, .2, .01
ang_min, ang_max, ang_step = -30., 30., 1.
xmin, xmax, xstep = -150., 150., 1.
ymin, ymax, ystep = -150., 150., 1.

# Find proper scaling + rotation + translation for set_B.
i_min, params_all = match_frames(
    set_A, B_center, delta_x, delta_y, tol, sc_min, sc_max, sc_step,
    ang_min, ang_max, ang_step, xmin, xmax, xstep, ymin, ymax, ystep)

# Best match found
print(params_all[i_min])

Why am I doing this?

我为什么要这么做?

When an astronomer observes a star field, it also needs to observe what is called a "standard field of stars". This is needed to be able to transform the "instrumental magnitudes" (a logarithmic measure of the brightness) of the observed stars to a common universal scale, since these magnitudes depend on the optical system used (telescope + CCD array). In the example shown here, the standard field can be seen below to the left, and the observed one to the right.

当一个天文学家观察一个星场时，它也需要观察所谓的“标准星场”。这需要能够将观测到的恒星的“仪器亮度”(一种对亮度的对数测量)转换成一个通用的通用尺度，因为这些亮度取决于所使用的光学系统(望远镜+ CCD阵列)。在这里的示例中，可以看到标准字段在左边，而在右边则可以看到。

Notice that the points in the set A (used above) are the marked stars in the standard field, and the set B are those stars detected in the observed field (marked in red above)

注意集合A中的点(上面使用的)是标准场中的标记恒星，集合B是观测场中检测到的那些恒星(上面用红色标记)

The process of identifying those stars in the observed field that correspond to the stars marked in the standard field is done by-eye, even today. This is due to the complexity of the task.

即使是在今天，在观察到的与标准领域中标记的恒星对应的恒星的过程中，也会有一个过程。这是由于任务的复杂性。

In the observed image above, there's quite a bit of scaling, but no rotation and little translation. This constitutes a rather favorable scenario; it could be much worse. I'm trying to develop a simple algorithm to avoid having to manually identify stars in the observed field as stars in the standard field, one by one.

在上面观察到的图像中，有一些缩放，但是没有旋转和很少的平移。这是一种相当有利的情况;可能会更糟。我正在尝试开发一种简单的算法，以避免必须手动地将观测场中的恒星识别为标准场中的恒星。

Solution proposed by litepresence

litepresence提出的解决方案

This is a script I made following the answer by litepresence.

这是我根据litepresence的答案制作的脚本。

import itertools
import numpy as np
import matplotlib.pyplot as plt


def getTriangles(set_X, X_combs):
    """
    Inefficient way of obtaining the lengths of each triangle's side.
    Normalized so that the minimum length is 1.
    """
    triang = []
    for p0, p1, p2 in X_combs:
        d1 = np.sqrt((set_X[p0][0] - set_X[p1][0]) ** 2 +
                     (set_X[p0][1] - set_X[p1][1]) ** 2)
        d2 = np.sqrt((set_X[p0][0] - set_X[p2][0]) ** 2 +
                     (set_X[p0][1] - set_X[p2][1]) ** 2)
        d3 = np.sqrt((set_X[p1][0] - set_X[p2][0]) ** 2 +
                     (set_X[p1][1] - set_X[p2][1]) ** 2)
        d_min = min(d1, d2, d3)
        d_unsort = [d1 / d_min, d2 / d_min, d3 / d_min]
        triang.append(sorted(d_unsort))

    return triang


def sumTriangles(A_triang, B_triang):
    """
    For each normalized triangle in A, compare with each normalized triangle
    in B. find the differences between their sides, sum their absolute values,
    and select the two triangles with the smallest sum of absolute differences.
    """
    tr_sum, tr_idx = [], []
    for i, A_tr in enumerate(A_triang):
        for j, B_tr in enumerate(B_triang):
            # Absolute value of lengths differences.
            tr_diff = abs(np.array(A_tr) - np.array(B_tr))
            # Sum the differences
            tr_sum.append(sum(tr_diff))
            tr_idx.append([i, j])

    # Index of the triangles in A and B with the smallest sum of absolute
    # length differences.
    tr_idx_min = tr_idx[tr_sum.index(min(tr_sum))]
    A_idx, B_idx = tr_idx_min[0], tr_idx_min[1]
    print("Smallest difference: {}".format(min(tr_sum)))

    return A_idx, B_idx


# Data.
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
         [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [
    [2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
     620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
    [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
     3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]
set_B = zip(*set_B)

# All possible triangles.
A_combs = list(itertools.combinations(range(len(set_A)), 3))
B_combs = list(itertools.combinations(range(len(set_B)), 3))

# Obtain normalized triangles.
A_triang, B_triang = getTriangles(set_A, A_combs), getTriangles(set_B, B_combs)

# Index of the A and B triangles with the smallest difference.
A_idx, B_idx = sumTriangles(A_triang, B_triang)

# Indexes of points in A and B of the best match triangles.
A_idx_pts, B_idx_pts = A_combs[A_idx], B_combs[B_idx]
print 'triangle A %s matches triangle B %s' % (A_idx_pts, B_idx_pts)

# Matched points in A and B.
print "A:", [set_A[_] for _ in A_idx_pts]
print "B:", [set_B[_] for _ in B_idx_pts]

# Plot
A_pts = zip(*[set_A[_] for _ in A_idx_pts])
B_pts = zip(*[set_B[_] for _ in B_idx_pts])
plt.scatter(*A_pts, s=10, c='k')
plt.scatter(*B_pts, s=10, c='r')
plt.show()

The method is almost instantaneous and produces the correct match:

该方法几乎是瞬时的，并产生正确的匹配:

Smallest difference: 0.0314154749597
triangle A (0, 1, 4) matches triangle B (3, 4, 10)
A: [[2015.81, 1981.665], [1967.31, 1960.865], [1894.41, 1957.065]]
B: [(1051.3, 1837.3), (1929.49, 2017.52), (1580.77, 1868.33)]

2 个解决方案

#1

1) I would approach this problem by labeling all points and finding all possible combinations of 3 points from each set.

1)我将通过标记所有的点，并从每个集合中找出所有可能的3个点的组合来解决这个问题。

# normalize B data to same format as A
set_Bx, set_By = (set_B)
set_B = []
for i in range(len(set_Bx)):
    set_B.append([set_Bx[i],set_By[i]])
'''
set_B = [[2689.28, 2061.19], [3507.04, 3700.27], [2895.67, 2131.2], 
[1051.3, 1837.3], [1929.49, 2017.52], [1035.97, 80.96], [752.44, 
3524.61], [130.62, 3821.22], [620.06, 3711.53], [2769.06, 1812.12], 
[1580.77, 1868.33], [281.76, 3865.77], [224.54, 3273.77], [3848.3, 
2100.71]]
'''

list(itertools.combinations(range(len(set_A)), 3))
list(itertools.combinations(range(len(set_B)), 3))

How to generate all permutations of a list in Python

如何在Python中生成列表的所有排列

2) For each 3 point group, calculate the sides of the respective triangle; repeating the process for group A and group B.

2)每3个点组计算各自三角形的边;为A组和B组重复这个过程。

dist = sqrt( (x2 - x1)**2 + (y2 - y1)**2 )

How do I find the distance between two points?

如何求两点之间的距离?

3) Then reduce the ratio of sides for each such that each triangle's smallest side was then equal to 1; the other sides reduced in respective ratio.

3)然后减小各边的比例，使每个三角形最小的边等于1;另一方的比例相应降低。

In two similar triangles:

在两个相似三角形:

The perimeters of the two triangles are in the same ratio as the sides. The corresponding sides, medians and altitudes will all be in this same ratio.

这两个三角形的周长与边的比值相等。相应的边、中、高都是这个比例。

http://www.mathopenref.com/similartrianglesparts.html

4) Finally for each a triangle from group A, compare to each triangle from group B, with element-wise subtraction. Then sum the resulting elements and find the triangles from A and B with the smallest sum.

最后，对a组的每个三角形，与B组的每个三角形进行元素减法比较。然后将得到的元素相加，用最小的和从A和B中找到三角形。

list(numpy.array(list1)-numpy.array(list2))

Subtracting 2 lists in Python

用Python减去2个列表

5) Given matched triangles; finding the appropriate scaling, translation, and rotation should be relatively trivial in terms of CPU/RAM.

5)匹配三角形;就CPU/RAM而言，找到适当的缩放、平移和旋转应该是相对不重要的。

ETA1: rough draft script

ETA1:草稿脚本

ETA2: patched error discussed in comments: with sum(abs()) instead of abs(sum()). Now it works, fast too!

ETA2:在注释中讨论的补丁错误:用sum(abs())代替abs(sum())。现在它工作了，也很快!

'''
known correct solution

A = [[1894.41, 1957.065],[1967.31, 1960.865],[2015.81, 1981.665]]
B = [[1051.3, 1837.3],[1580.77, 1868.33],[1929.49, 2017.52]]

'''
import numpy as np
import itertools
import math
import operator

set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
        [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [[2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
        620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
        [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
        3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]

# normalize set B data to set A format
set_Bx, set_By = (set_B)
set_B = []
for i in range(len(set_Bx)):
    set_B.append([set_Bx[i],set_By[i]])

'''
set_B = [[2689.28, 2061.19], [3507.04, 3700.27], [2895.67, 2131.2], 
[1051.3, 1837.3], [1929.49, 2017.52], [1035.97, 80.96], [752.44, 3524.61], 
[130.62, 3821.22], [620.06, 3711.53], [2769.06, 1812.12], [1580.77, 1868.33], 
[281.76, 3865.77], [224.54, 3273.77], [3848.3, 2100.71]]
'''

print set_A
print set_B
print len(set_A)
print len(set_B)

set_A_tri = list(itertools.combinations(range(len(set_A)), 3))
set_B_tri = list(itertools.combinations(range(len(set_B)), 3))

print set_A_tri
print set_B_tri
print len(set_A_tri)
print len(set_B_tri)

'''
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365], 
[1964.91, 1994.565], [1894.41, 1957.065]]

set_A_tri = [(0, 1, 2), (0, 1, 3), (0, 1, 4), (0, 2, 3), (0, 2, 4), (0, 3, 4), 
(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
'''

def distance(x1,y1,x2,y2):
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2 )

def tri_sides(set_x, set_x_tri):

    triangles = []
    for i in range(len(set_x_tri)):

        point1 = set_x_tri[i][0]
        point2 = set_x_tri[i][1]
        point3 = set_x_tri[i][2]

        point1x, point1y = set_x[point1][0], set_x[point1][1]
        point2x, point2y = set_x[point2][0], set_x[point2][1]
        point3x, point3y = set_x[point3][0], set_x[point3][1] 

        len1 = distance(point1x,point1y,point2x,point2y)
        len2 = distance(point1x,point1y,point3x,point3y)
        len3 = distance(point2x,point2y,point3x,point3y)

        min_side = min(len1,len2,len3)
        len1/=min_side
        len2/=min_side
        len3/=min_side
        t=[len1,len2,len3]
        t.sort()
        triangles.append(t)

    return triangles

A_triangles = tri_sides(set_A, set_A_tri)
B_triangles = tri_sides(set_B, set_B_tri)

print A_triangles
'''
[[1.0, 5.0405616860744304, 5.822935502560814], 
[1.0, 1.5542012854321234, 1.5619803879976761], 
[1.0, 1.3832883678507584, 2.347214708755337], 
[1.0, 1.2141910838179129, 1.4096730529373076], 
[1.0, 1.1275138587537166, 2.0318412465223665], 
[1.0, 1.5207417600732074, 2.3589630093994876], 
[1.0, 3.2270326342163584, 4.13069930678442], 
[1.0, 6.565440477766354, 6.972550347780966], 
[1.0, 2.1606693015281997, 2.3635387983160885], 
[1.0, 1.589425903498476, 1.846471085870448]]
'''
print B_triangles

def list_subtract(list1,list2):

    return np.absolute(np.array(list1)-np.array(list2))

sums = []
threshold = 1
for i in range(len(A_triangles)):
    for j in range(len(B_triangles)):
        k = sum(list_subtract(A_triangles[i], B_triangles[j]))
        if k < threshold:
            sums.append([i,j,k])
# sort by smallest sum
sums = sorted(sums, key=operator.itemgetter(2))

print sums
print 'winner %s' % sums[0]
print sums[0][0]
print sums[0][1]
match_A = set_A_tri[sums[0][0]]
match_B = set_B_tri[sums[0][1]]
print 'triangle A %s matches triangle B %s' % (match_A, match_B)

match_A_pts = []
match_B_pts = []
for i in range(3):
    match_A_pts.append(set_A[match_A[i]])
    match_B_pts.append(set_B[match_B[i]])

print 'triangle A has points %s' % match_A_pts
print 'triangle B has points %s' % match_B_pts

'''
winner [2, 204, 0.031415474959738399]
2
204
triangle A (0, 1, 4) matches triangle B (3, 4, 10)
triangle A has points [[2015.81, 1981.665], [1967.31, 1960.865], [1894.41, 1957.065]]
triangle B has points [[1051.3, 1837.3], [1929.49, 2017.52], [1580.77, 1868.33]]
'''

#2

There is an algorithm called Multi Dimensional Scaling, or MDS (http://scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html) that finds the scale of such transforms. It's closely related to principal component analysis, but uses a vector of linear dissimilarities instead of covariances (which are a kind of squared dissimilarity).

有一种算法叫做多维缩放，或者MDS (http://scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html)，可以找到这种转换的规模。它与主成分分析密切相关，但它使用的是线性异同向量，而不是协方差(是一种平方异同)。

To recover the rotation and offset, you can use RANSAC (http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RANSACRegressor.html).

要恢复旋转和偏移，可以使用RANSAC (http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ransacrejssor.html)。

#1