Source: AMAZON INTERVIEW QUESTION
来源:亚马逊面试问题
Given a point P and other N points in two dimensional space, find K points out of the N points which are nearest to P.
在二维空间中给定一个点P和其他N点,找出最接近P的N点的K点。
What is the most optimal way to do this ?
最理想的方法是什么?
This Wiki page does not provide much of help in building a algorithm.Any ideas/approaches people.
这个Wiki页面在构建算法时没有提供太多帮助。任何想法/方法的人。
5 个解决方案
#1
26
Solution 1 make heap of size K and collect points by minimal distance O(NLogK) complexity.
解决方案1通过最小距离O(NLogK)复杂度进行堆大小K和收集点。
Solution 2: Take and array of size N and Sort by distance. Should be used QuickSort (Hoare modification). As answer take first K points. This is too NlogN complexity but it is possible optimize to approximate O(N). If skip sorting of unnecessary sub arrays. When you split array by 2 sub arrays you should take only array where Kth index located. complexity will be : N +N/2 +N/4 + ... = O(N).
解决方案2:取大小为N的数组,按距离排序。应该使用快速排序(Hoare修改)。作为答案取第K个点。这是太NlogN的复杂性,但它是可能的优化,以接近O(N)。如果跳过不必要的子数组的排序。当你用两个子数组分割数组时,你应该只使用第k个索引所在的数组。复杂度将是:N +N/2 +N/4 +…= O(N)。
Solution 3: search Kth element in result array and takes all point lesser then founded. Exists O(N) alghoritm, similar to search of median.
解决方案3:在结果数组中搜索第k个元素,并将所有的点降低。存在O(N) alghoritm,类似于中位数搜索。
Notes: better use sqr of distance to avoid of sqrt operations, it will be greater faster if point has integer coordinates.
注意:最好使用距离的sqr来避免sqrt操作,如果点有整数坐标,则会更快。
As interview answer better use Solution 2 or 3.
作为面试回答,最好使用解决方案2或3。
#2
6
For just a single query...
只需要一个查询…
Maintain a heap of size k
.
维护一个大小为k的堆。
For each point, calculate the distance to the point P
. Insert that distance into the heap and delete the maximum from the heap if the size of the heap is greater than k
.
对于每一个点,计算到点p的距离,如果堆的大小大于k,将距离插入到堆中并从堆中删除最大值。
Running time: O(n log k)
运行时间:O(n log k)
#3
3
You could use KD tree http://en.wikipedia.org/wiki/K-d_tree to partition space and given the point you will be able to gradually search for neighbors using binary search. Benefit of using this approach is that it easily scales up to online version when you receive points/queries in runtime one by one or in batches.
您可以使用KD树http://en.wikipedia.org/wiki/K-d_tree来划分空间,并且在此基础上,您将能够使用二进制搜索逐步搜索邻居。使用这种方法的好处是,当你在运行时一个接一个地收到点/查询时,它很容易扩展到在线版本。
#4
1
Solution 1
解决方案1
private List<Point> nearestKPoint_1(List<Point> list, final Point center, int k) {
List<Point> ans = new ArrayList<>();
PriorityQueue<Point> maxHeap = new PriorityQueue<>(k + 1, new Comparator<Point>() {
@Override
public int compare(Point o1, Point o2) {
return distance(center, o2) - distance(center, o1);
}
});
for (Point p : list) {
maxHeap.offer(p);
if (maxHeap.size() > k) {
maxHeap.poll();
}
}
Iterator<Point> i = maxHeap.iterator();
while (i.hasNext()) {
ans.add(i.next());
}
return ans;
}
public int distance(Point p1, Point p2) {
return (p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y);
}
static class Point {
int x;
int y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Point point = (Point) o;
if (x != point.x) return false;
return y == point.y;
}
@Override
public int hashCode() {
int result = x;
result = 31 * result + y;
return result;
}
}
Solution 2
解决方案2
private List<Point> nearestKPoint_2(List<Point> list, final Point center, int k) {
List<Point> ans = new ArrayList<>();
Distance[] nums = new Distance[list.size()];
for (int i = 0; i < nums.length; i++) {
nums[i] = new Distance(distance(center, list.get(i)), i);
}
quickSelect(nums, k);
for (int i = 0; i < k; i++) {
ans.add(list.get(nums[i].i));
}
return ans;
}
private void quickSelect(Distance[] nums, int k) {
int start = 0, end = nums.length - 1;
while (start < end) {
int p = partition(nums, start, end);
if (p == k) {
return;
} else if (p < k) {
start = p + 1;
} else {
end = p - 1;
}
}
}
private int partition(Distance[] nums, int start, int end) {
Distance pivot = nums[start];
int i = start, j = end + 1;
while (true) {
while (i < end && nums[++i].compareTo(pivot) < 0);
while (j > start && nums[--j].compareTo(pivot) > 0);
if (i >= j) {
break;
}
swap(nums, i, j);
}
swap(nums, start, j);
return j;
}
private void swap(Distance[] nums, int i, int j) {
Distance tmp = nums[i];
nums[i] = nums[j];
nums[j] = tmp;
}
class Distance implements Comparable<Distance> {
int d;
int i;
public Distance(int d, int i) {
this.d = d;
this.i = i;
}
@Override
public int compareTo(Distance o) {
return this.d - o.d;
}
}
#5
-1
What is wrong with below approach ?
下面的方法有什么问题?
1) Calculate distance from given point to other points.
1)计算给定点到其他点的距离。
2) Store the distance and index of that point to TreeMap<Double,Integer> map
2)将此点的距离和索引存储到TreeMap
3) Select top K elements from map. It's values will give index of Point element from points array.
3)从map中选择top K元素。它的值将给出点数组的点元素的索引。
The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time,
地图按其键的自然顺序排列,或按地图创建时间提供的比较器进行排序,
#1
26
Solution 1 make heap of size K and collect points by minimal distance O(NLogK) complexity.
解决方案1通过最小距离O(NLogK)复杂度进行堆大小K和收集点。
Solution 2: Take and array of size N and Sort by distance. Should be used QuickSort (Hoare modification). As answer take first K points. This is too NlogN complexity but it is possible optimize to approximate O(N). If skip sorting of unnecessary sub arrays. When you split array by 2 sub arrays you should take only array where Kth index located. complexity will be : N +N/2 +N/4 + ... = O(N).
解决方案2:取大小为N的数组,按距离排序。应该使用快速排序(Hoare修改)。作为答案取第K个点。这是太NlogN的复杂性,但它是可能的优化,以接近O(N)。如果跳过不必要的子数组的排序。当你用两个子数组分割数组时,你应该只使用第k个索引所在的数组。复杂度将是:N +N/2 +N/4 +…= O(N)。
Solution 3: search Kth element in result array and takes all point lesser then founded. Exists O(N) alghoritm, similar to search of median.
解决方案3:在结果数组中搜索第k个元素,并将所有的点降低。存在O(N) alghoritm,类似于中位数搜索。
Notes: better use sqr of distance to avoid of sqrt operations, it will be greater faster if point has integer coordinates.
注意:最好使用距离的sqr来避免sqrt操作,如果点有整数坐标,则会更快。
As interview answer better use Solution 2 or 3.
作为面试回答,最好使用解决方案2或3。
#2
6
For just a single query...
只需要一个查询…
Maintain a heap of size k
.
维护一个大小为k的堆。
For each point, calculate the distance to the point P
. Insert that distance into the heap and delete the maximum from the heap if the size of the heap is greater than k
.
对于每一个点,计算到点p的距离,如果堆的大小大于k,将距离插入到堆中并从堆中删除最大值。
Running time: O(n log k)
运行时间:O(n log k)
#3
3
You could use KD tree http://en.wikipedia.org/wiki/K-d_tree to partition space and given the point you will be able to gradually search for neighbors using binary search. Benefit of using this approach is that it easily scales up to online version when you receive points/queries in runtime one by one or in batches.
您可以使用KD树http://en.wikipedia.org/wiki/K-d_tree来划分空间,并且在此基础上,您将能够使用二进制搜索逐步搜索邻居。使用这种方法的好处是,当你在运行时一个接一个地收到点/查询时,它很容易扩展到在线版本。
#4
1
Solution 1
解决方案1
private List<Point> nearestKPoint_1(List<Point> list, final Point center, int k) {
List<Point> ans = new ArrayList<>();
PriorityQueue<Point> maxHeap = new PriorityQueue<>(k + 1, new Comparator<Point>() {
@Override
public int compare(Point o1, Point o2) {
return distance(center, o2) - distance(center, o1);
}
});
for (Point p : list) {
maxHeap.offer(p);
if (maxHeap.size() > k) {
maxHeap.poll();
}
}
Iterator<Point> i = maxHeap.iterator();
while (i.hasNext()) {
ans.add(i.next());
}
return ans;
}
public int distance(Point p1, Point p2) {
return (p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y);
}
static class Point {
int x;
int y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Point point = (Point) o;
if (x != point.x) return false;
return y == point.y;
}
@Override
public int hashCode() {
int result = x;
result = 31 * result + y;
return result;
}
}
Solution 2
解决方案2
private List<Point> nearestKPoint_2(List<Point> list, final Point center, int k) {
List<Point> ans = new ArrayList<>();
Distance[] nums = new Distance[list.size()];
for (int i = 0; i < nums.length; i++) {
nums[i] = new Distance(distance(center, list.get(i)), i);
}
quickSelect(nums, k);
for (int i = 0; i < k; i++) {
ans.add(list.get(nums[i].i));
}
return ans;
}
private void quickSelect(Distance[] nums, int k) {
int start = 0, end = nums.length - 1;
while (start < end) {
int p = partition(nums, start, end);
if (p == k) {
return;
} else if (p < k) {
start = p + 1;
} else {
end = p - 1;
}
}
}
private int partition(Distance[] nums, int start, int end) {
Distance pivot = nums[start];
int i = start, j = end + 1;
while (true) {
while (i < end && nums[++i].compareTo(pivot) < 0);
while (j > start && nums[--j].compareTo(pivot) > 0);
if (i >= j) {
break;
}
swap(nums, i, j);
}
swap(nums, start, j);
return j;
}
private void swap(Distance[] nums, int i, int j) {
Distance tmp = nums[i];
nums[i] = nums[j];
nums[j] = tmp;
}
class Distance implements Comparable<Distance> {
int d;
int i;
public Distance(int d, int i) {
this.d = d;
this.i = i;
}
@Override
public int compareTo(Distance o) {
return this.d - o.d;
}
}
#5
-1
What is wrong with below approach ?
下面的方法有什么问题?
1) Calculate distance from given point to other points.
1)计算给定点到其他点的距离。
2) Store the distance and index of that point to TreeMap<Double,Integer> map
2)将此点的距离和索引存储到TreeMap
3) Select top K elements from map. It's values will give index of Point element from points array.
3)从map中选择top K元素。它的值将给出点数组的点元素的索引。
The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time,
地图按其键的自然顺序排列,或按地图创建时间提供的比较器进行排序,