旅行推销员/车辆路径选择用例的最佳可能实现

时间:2021-12-13 15:51:29

I recently came across a case in interview when use case which was asked to be solved belongs to travelling salesman problem / vehicle routing problem. I was able to tell them what the actual problem is and what maths is involving in the problem. I did explained how below mentioned use case can also be solved using MapReduce paradigm part of Hadoop. ( explained how multiple map reduce jobs will be able to solve the problem ) using Graph algorithm mentioned in this book Data-Intensive Text Processing with MapReduce" by Jimmy Lin and Chris Dyer.

我最近在采访中遇到一个案例,当用例被要求解决属于旅行商问题/车辆路径问题。我能告诉他们实际的问题是什么,问题中涉及到什么数学。我已经解释了如何使用Hadoop的MapReduce范例部分来解决下面提到的用例。(解释了多幅地图如何减少工作将能够解决问题)使用在这本书中提到的图算法,用MapReduce进行数据密集型文本处理。

Out of curiosity I did some research on google and i can see lot of implementation and research has been done for this problem in different flavors. Problem i was asked has coordinates of city mentioned in (x,y) format and many solutions i saw on google consider some other factors like unit distance, negative/positive units of measurement and so on. So in short more i did research and reading i got more confused.

出于好奇,我对谷歌进行了一些研究,我可以看到很多针对这个问题的不同的实施和研究。我被问到的问题有(x,y)格式中提到的城市坐标,我在谷歌上看到的许多解决方案考虑了其他一些因素,如单位距离、测量的负/正单位等等。总之,我做了更多的研究和阅读,我变得更加困惑。

My question here is for below use case what can be possible solutions and what will be best solution among them. If some experienced person can put some lights on this it will be helpful to clear my confusion and understanding the solution in better way. or if someone can direct me to right direction ( so that i don't get more confused exploring whole ocean of solutions )

我的问题是下面的用例,什么是可能的解决方案,什么是最好的解决方案。如果一些有经验的人能对此有所了解,这将有助于我更好地理清我的困惑和理解解决方案。或者如果有人能指引我正确的方向(这样我就不会对探索整个解决方案的海洋感到困惑)

Use case asked in interview:

面试中问的用例:

A company is trying to find best possible optimal solution for servicing his customer base of 300 with 12 employee. They want a technology solution that tells how they will be able to meet customer requirement as business will grow and other changes like location of customer changes, new locations added and so on.

一家公司正试图找到最可能的最佳解决方案,为其300名员工和12名员工提供服务。他们想要一个技术解决方案,告诉他们如何在业务增长的同时满足客户的需求,以及其他的变化,如客户的位置变化,新增的位置等等。

Problem is basically a form of Travelling Salesman Problem ( TSP ) or Vehicle Routing problem ( VSP ). Following things need to be completed here.

问题基本上是旅行推销员问题(TSP)或车辆路径问题(VSP)的一种形式。以下内容需要在这里完成。

Starting coordinates are (0,0) and city coordinates example are mentioned below. Here are coordinates with which working solution is expected provided in a text file as input:

起始坐标为(0,0),城市坐标示例如下所示。以下是在文本文件中作为输入提供的工作解决方案的坐标:

X coordinate    Y Coordinate 
420 278 
421 40 
29  178 
350 47 
298 201 
417 186 
378 134 
447 239 
42  114 
45  199 
362 195 
381 243 
429 1 
338 209 
176 9 
364 26 
326 182 
500 129 
190 51 
489 103 
368 142 
132 260 
305 200 
446 137 
375 154 
440 190 
9   118 
437 32 
383 266 
  1. What can be right way to handle this NP-hard problem or if not right way what can be different approaches with their pros/cons.

    什么是处理这个np困难问题的正确方法,或者如果不是正确的方法,有什么方法可以用它们的优缺点。

  2. Since its more of analysis based problem can some kind of visualization be done to solve this. Like some graph or use of R/analytic tools

    由于其更多的基于分析的问题,可以通过一些可视化的方法来解决这个问题。像一些图表或R/分析工具的使用

Let me know if you need more details or if you can suggest where i can read and understand more.

如果你需要更多的细节,或者你能建议我在哪里阅读和了解更多,请告诉我。

Thanks in advance

谢谢提前

4 个解决方案

#1


3  

There's no right way of solving a NP problem. Since complexity is exponential it's going to take a very long time for anything other than trivial examples.

没有正确的方法来解决NP问题。由于复杂性是指数级的,所以除了平凡的例子外,任何事情都需要很长时间。

However, there are approximations that can get fairly close to the real answer and might be sufficiently good for your application (as in, it's not the shortest path, but it's within some relative range of it).

然而,有一些近似可以相当接近真实的答案,并且可能对您的应用程序非常有用(比如,它不是最短路径,但它在它的相对范围内)。

Check our the wikipedia page. They even have some examples.

查看我们的*页面。他们甚至有一些例子。

#2


2  

Indeed as Dmitry mentions this is a case of the multiple travelling salesman porblem. Being NP-hard naturally the interviewers are looking for you to suggest a heursitic optimisation algorithm.

的确,正如德米特里提到的,这是一个多重旅行推销员porblem的例子。面试官天生就对np困难,所以他们希望你能提出一种heursitic优化算法。

I think the key in this case is they are looking for an algorithm which is able to update in real time to changes in the number and location of destinations. Ant colony optimisaiton (a form of particle swarm optimisation) was actually initially formulated for the travelling salesman problem, see the paper and wikipedia:

我认为在这种情况下,关键是他们正在寻找一种能够实时更新到目的地数量和位置的算法。蚁群optimisaiton(粒子群优化的一种形式)最初是为旅行推销员问题而设计的,参见论文和*:

https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms

https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms

"M. Dorigo, V. Maniezzo, et A. Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man, and Cybernetics--Part B , volume 26, numéro 1, pages 29-41, 1996."

“M. Dorigo, V. Maniezzo, et a . Colorni,蚂蚁系统:一群合作代理的优化,IEEE系统、人、控制论的交易——第B部分,第26卷,编号1,第29-41页,1996。”

This has been generalised since to the multiple travelling saleman problem see for example this paper (opensource) for some nice work into it:

这已经被概括为多个旅行销售员的问题,例如本文(opensource)为它做了一些不错的工作:

http://www.researchgate.net/publication/263389346_Multi-type_ant_colony_system_for_solving_the_multiple_traveling_salesman_problem

http://www.researchgate.net/publication/263389346_Multi-type_ant_colony_system_for_solving_the_multiple_traveling_salesman_problem

In an interview situation, I would detail it has the Pros as: 1. being an efficient heuristic solution; 2. Able to update in real time to both changes in the graph; 3. For bonus points I mention that, once a reasonably efficient solution has been obtained in silico, drivers themselves could be assigned routes in a slightly probabilisitc way, subsequently optimisation driven by real data could be performed.

在面试的情况下,我会详细说明它的优点:1。是一种有效的启发式解决方案;2。能够实时更新图中的两个变化;3所示。另外,我还提到,一旦在silicon o中获得了一个合理有效的解决方案,就可以以一种略显概率的方式分配驱动程序本身的路由,然后就可以执行由真实数据驱动的优化。

Cons are that reasonably large amounts of proccessing power are likely required compared to say problems that first reduce the search space as Dmitry suggested. Secondly if they want you to actually draw up an alogirthm this could be quite challenging in the space of an interview.

反对的理由是,相对于德米特里建议的先减少搜索空间的问题而言,可能需要相当大的处理能力。第二,如果他们真的想让你写一篇很有挑战性的文章。

Interesting question :)

有趣的问题:)

#3


2  

If I would asked this question on interview - I will propose something like described in this paper, looks like best match for your task's formulation. In this paper you will find optimized approximate approach to solve multiple salesmen problem with all salesmen starting in one point. It can be adopted if we know where employees leave by solving each single travel salesman subproblem (clustering divides main problem to classic problems) with start at specific salesman's home/home office.

如果我在面试中问这个问题——我会提出一些类似本文中描述的问题,看起来最适合你的任务的提法。在本文中,您将找到一种优化的近似方法来解决所有销售人员从一点开始的多个销售人员问题。如果我们通过解决每个单独的旅行销售员子问题(聚类将主要问题划分为经典问题)来了解员工离开的地方,可以从特定的销售员的家/家办公室开始。

If we have graph of places as an input, not just coordinates - we can replace k-means with graph clustering algorithm like MCL.

如果我们有位置的图作为输入,而不仅仅是坐标,我们可以用像MCL这样的图形聚类算法来代替k-means。

#4


0  

I' am no expert but couldn't you just calculate the distance between the origin and all other points and find the nearest point, then repeat the process for that point until you have covered every point?

我不是专家,但你能不能计算出原点和其他点之间的距离找到最近的点,然后重复这个过程直到你覆盖了所有的点?

#1


3  

There's no right way of solving a NP problem. Since complexity is exponential it's going to take a very long time for anything other than trivial examples.

没有正确的方法来解决NP问题。由于复杂性是指数级的,所以除了平凡的例子外,任何事情都需要很长时间。

However, there are approximations that can get fairly close to the real answer and might be sufficiently good for your application (as in, it's not the shortest path, but it's within some relative range of it).

然而,有一些近似可以相当接近真实的答案,并且可能对您的应用程序非常有用(比如,它不是最短路径,但它在它的相对范围内)。

Check our the wikipedia page. They even have some examples.

查看我们的*页面。他们甚至有一些例子。

#2


2  

Indeed as Dmitry mentions this is a case of the multiple travelling salesman porblem. Being NP-hard naturally the interviewers are looking for you to suggest a heursitic optimisation algorithm.

的确,正如德米特里提到的,这是一个多重旅行推销员porblem的例子。面试官天生就对np困难,所以他们希望你能提出一种heursitic优化算法。

I think the key in this case is they are looking for an algorithm which is able to update in real time to changes in the number and location of destinations. Ant colony optimisaiton (a form of particle swarm optimisation) was actually initially formulated for the travelling salesman problem, see the paper and wikipedia:

我认为在这种情况下,关键是他们正在寻找一种能够实时更新到目的地数量和位置的算法。蚁群optimisaiton(粒子群优化的一种形式)最初是为旅行推销员问题而设计的,参见论文和*:

https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms

https://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms

"M. Dorigo, V. Maniezzo, et A. Colorni, Ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man, and Cybernetics--Part B , volume 26, numéro 1, pages 29-41, 1996."

“M. Dorigo, V. Maniezzo, et a . Colorni,蚂蚁系统:一群合作代理的优化,IEEE系统、人、控制论的交易——第B部分,第26卷,编号1,第29-41页,1996。”

This has been generalised since to the multiple travelling saleman problem see for example this paper (opensource) for some nice work into it:

这已经被概括为多个旅行销售员的问题,例如本文(opensource)为它做了一些不错的工作:

http://www.researchgate.net/publication/263389346_Multi-type_ant_colony_system_for_solving_the_multiple_traveling_salesman_problem

http://www.researchgate.net/publication/263389346_Multi-type_ant_colony_system_for_solving_the_multiple_traveling_salesman_problem

In an interview situation, I would detail it has the Pros as: 1. being an efficient heuristic solution; 2. Able to update in real time to both changes in the graph; 3. For bonus points I mention that, once a reasonably efficient solution has been obtained in silico, drivers themselves could be assigned routes in a slightly probabilisitc way, subsequently optimisation driven by real data could be performed.

在面试的情况下,我会详细说明它的优点:1。是一种有效的启发式解决方案;2。能够实时更新图中的两个变化;3所示。另外,我还提到,一旦在silicon o中获得了一个合理有效的解决方案,就可以以一种略显概率的方式分配驱动程序本身的路由,然后就可以执行由真实数据驱动的优化。

Cons are that reasonably large amounts of proccessing power are likely required compared to say problems that first reduce the search space as Dmitry suggested. Secondly if they want you to actually draw up an alogirthm this could be quite challenging in the space of an interview.

反对的理由是,相对于德米特里建议的先减少搜索空间的问题而言,可能需要相当大的处理能力。第二,如果他们真的想让你写一篇很有挑战性的文章。

Interesting question :)

有趣的问题:)

#3


2  

If I would asked this question on interview - I will propose something like described in this paper, looks like best match for your task's formulation. In this paper you will find optimized approximate approach to solve multiple salesmen problem with all salesmen starting in one point. It can be adopted if we know where employees leave by solving each single travel salesman subproblem (clustering divides main problem to classic problems) with start at specific salesman's home/home office.

如果我在面试中问这个问题——我会提出一些类似本文中描述的问题,看起来最适合你的任务的提法。在本文中,您将找到一种优化的近似方法来解决所有销售人员从一点开始的多个销售人员问题。如果我们通过解决每个单独的旅行销售员子问题(聚类将主要问题划分为经典问题)来了解员工离开的地方,可以从特定的销售员的家/家办公室开始。

If we have graph of places as an input, not just coordinates - we can replace k-means with graph clustering algorithm like MCL.

如果我们有位置的图作为输入,而不仅仅是坐标,我们可以用像MCL这样的图形聚类算法来代替k-means。

#4


0  

I' am no expert but couldn't you just calculate the distance between the origin and all other points and find the nearest point, then repeat the process for that point until you have covered every point?

我不是专家,但你能不能计算出原点和其他点之间的距离找到最近的点,然后重复这个过程直到你覆盖了所有的点?