I am implementing a Flame clustering algorithm as a way of learning a bit more about graphs and graph traversal. One of the first steps is constructing a K-nearest-neighbors graph, and I'm wondering what the fastest way would be to run through a list of nodes and connect each one only to, say, its nearest five neighbors. My thought was to start at a node, iterate through the list of other nodes, and keep the closest ones in an array, making sure that everything past the top n is discarded. I could do this by sorting the whole list and keeping the top n entries, but I would much rather keep fewer things in memory. Is there a way to hold only the final array and update it as I iterate, or is there a more efficient way of generating a k-nearest-neighbors graph?
Also, please note, this is NOT a duplicate of K-Nearest Neighbour Implementation in Java. KNNG is distinct from KNN.
2 solutions
#1
1
Place the first n nodes in a List, sorted by distance from the current node. Then iterate through the rest of the nodes: if a node belongs in the current list (i.e. it is closer than one of the current top n), insert it at the corresponding position and discard the last node in the list. If it doesn't belong in the top n list, discard it.
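// Precondition: topNList already holds n nodes sorted by ascending distance.
for (Node neighborNode : remainingNodes) {   // remainingNodes: every node not yet in topNList
    double dist = distanceMetric(neighborNode, currentNode);   // compute once per candidate
    for (int i = 0; i < topNList.size(); i++) {
        // A smaller distance means a closer neighbor, so compare with '<'.
        if (dist < topNList.get(i).distance) {
            topNList.remove(topNList.size() - 1);   // drop the farthest of the current top n
            neighborNode.setDistance(dist);
            topNList.add(i, neighborNode);
            break;   // insert each candidate at most once
        }
    }
}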
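For reference, here is a minimal self-contained sketch of the same sorted-list idea. It assumes points are plain double[] coordinate arrays compared by squared Euclidean distance; the class and method names (TopNSortedList, nearest, squaredDistance) are made up for illustration and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;

final class TopNSortedList {
    /** Returns the indices of the n points closest to points[current], nearest first. */
    static List<Integer> nearest(double[][] points, int current, int n) {
        List<Integer> top = new ArrayList<>();     // neighbor indices, ascending by distance
        List<Double> topDist = new ArrayList<>();  // distances matching 'top'
        for (int j = 0; j < points.length; j++) {
            if (j == current) continue;            // a node is not its own neighbor
            double d = squaredDistance(points[current], points[j]);
            // Walk back from the end to find the position that keeps the list sorted.
            int pos = topDist.size();
            while (pos > 0 && topDist.get(pos - 1) > d) pos--;
            if (pos < n) {                         // otherwise the candidate is not in the top n
                top.add(pos, j);
                topDist.add(pos, d);
                if (top.size() > n) {              // keep at most n entries in memory
                    top.remove(top.size() - 1);
                    topDist.remove(topDist.size() - 1);
                }
            }
        }
        return top;
    }

    /** Squared Euclidean distance (assumed metric; it ranks neighbors the same as the true distance). */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }
}
```

Calling TopNSortedList.nearest(points, i, 5) for every index i gives the adjacency lists of the graph. Each insertion costs up to n list shifts, which is fine for a small n such as five.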
#2
1
I think the most efficient way would be to use a bounded priority queue, like https://github.com/tdebatty/java-graphs#bounded-priority-queue
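To illustrate the idea without depending on that library, here is a rough sketch that gets the same bounded behavior from the standard java.util.PriorityQueue used as a max-heap capped at k entries. It does not use the linked project's API; the names KnnGraphSketch, Neighbor, build and distance are assumptions made for this example, and points are assumed to be double[] coordinates under Euclidean distance.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class KnnGraphSketch {
    /** A candidate neighbor: index into the point array plus its distance. */
    static final class Neighbor {
        final int index;
        final double distance;
        Neighbor(int index, double distance) { this.index = index; this.distance = distance; }
    }

    /** Brute-force k-NN graph: result.get(i) lists the k nearest neighbors of point i, nearest first. */
    static List<List<Neighbor>> build(double[][] points, int k) {
        if (k <= 0) throw new IllegalArgumentException("k must be positive");
        // Max-heap on distance: the head is always the farthest neighbor kept so far.
        Comparator<Neighbor> byDistance = Comparator.comparingDouble((Neighbor nb) -> nb.distance);
        List<List<Neighbor>> graph = new ArrayList<>();
        for (int i = 0; i < points.length; i++) {
            PriorityQueue<Neighbor> heap = new PriorityQueue<>(byDistance.reversed());
            for (int j = 0; j < points.length; j++) {
                if (j == i) continue;                    // skip the node itself
                double d = distance(points[i], points[j]);
                if (heap.size() < k) {
                    heap.add(new Neighbor(j, d));        // still filling up to k entries
                } else if (d < heap.peek().distance) {
                    heap.poll();                         // evict the current farthest
                    heap.add(new Neighbor(j, d));
                }
            }
            List<Neighbor> neighbors = new ArrayList<>(heap);
            neighbors.sort(byDistance);                  // heap iteration order is not sorted
            graph.add(neighbors);
        }
        return graph;
    }

    /** Euclidean distance (assumed metric). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Only k candidates per node are held at any time, and each heap update costs O(log k), so the whole brute-force graph takes O(n² log k) time for n nodes.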