绝对值的最小和。

Problem statement:

问题陈述:

There are 3 arrays A,B,C all filled with positive integers, and all the three arrays are of the same size.

有3个数组A,B,C都是正整数，所有的三个数组都是相同大小的。

Find min(|a-b|+|b-c|+|c-a|) where a is in A, b is in B, c is in C.

找到最小值(|a-b|+|b-c|+|c-a|)， a在a, b在b, c在c。

I worked on the problem the whole weekend. A friend told me that it can be done in linear time. I don't see how that could be possible.

我整个周末都在研究这个问题。一个朋友告诉我可以在线性时间内完成。我不明白这怎么可能。

How would you do it ?

你会怎么做?

2 个解决方案

#1

Well, I think I can do it in O(n log n). I can only do O(n) if the arrays are initially sorted.

嗯，我想我可以在O(n log n)中做，如果数组最初被排序，我只能做O(n)

First, observe that you can permute a,b,c however you like without changing the value of the expression. So let x be the smallest of a,b,c; let y be the middle of the three; and let z be the maximum. Then note that the expression just equals 2*(z-x). (Edit: This is easy to see... Once you have the three numbers in order, x < y < z, the sum is just (y-x) + (z-y) + (z-x) which equals 2*(z-x))

首先，观察你可以在不改变表达式的值的情况下对a、b、c进行排序。所以x是最小的a b c;让y是三的中间部分;让z为最大值。然后注意表达式等于2*(z-x)。(编辑:这很容易看到……)一旦你有了3个数字，x < y < z，和就是(y-x) + (z-y) + (z-x) = 2*(z-x))

Thus, all we are really trying to do is find three numbers such that the outer two are as close together as possible, with the other number "sandwiched" between them.

因此，我们真正要做的是找到三个数字，这样外部的两个就尽可能地紧密相连，另一个数字“夹在中间”。

So start by sorting all three arrays in O(n log n). Maintain an index into each array; call these i, j, and k. Initialize all three to zero. Whichever index points to the smallest value, increment that index. That is, if A[i] is smaller than B[j] and C[k], increment i; if B[j] is smallest, increment j; if C[k] is smallest, increment k. Repeat, keeping track of |A[i]-B[j]| + |B[j]-C[k]| + |C[k]-A[i]| the whole time. The smallest value you observe during this march is your answer. (When the smallest of the three is at the end of its array, stop because you are done.)

At each step, you add one to exactly one index; but you can only do this n times for each array before hitting the end. So this is at most 3*n steps, which is O(n), which is less than O(n log n), meaning the total time is O(n log n). (Or just O(n) if you can assume the arrays are sorted.)

在每个步骤中，您将一个索引添加到一个索引;但你只能对每个数组执行n次，然后才能到达终点。所以这是最多3*n阶的，也就是O(n)小于O(n log n)，也就是说总时间是O(n log n)如果你可以假设数组是排序的，那么就是O(n)

Sketch of a proof that this works: Suppose A[I], B[J], C[K] are the a, b, c that form the actual answer; i.e., they have the minimum |a-b|+|b-c|+|c-a|. Suppose further that a > b > c; the proof for the other cases is symmetric.

一个证明的草图:假设a [I]， B[J]， C[K]是a, B, C，构成了实际的答案;即。他们有| - |+| - |+|c-a|。假设> b > c;其他情况的证明是对称的。

Lemma: During our march, we do not increment j past J until after we increment k past K. Proof: We always increment the index of the smallest element, and when k <= K, B[J] > C[k]. So when j=J and k <= K, B[j] is not the smallest element, so we do not increment j.

引理:在我们的进行期，我们不增加j过去的j，直到我们增加k过去k。证明:我们总是增加最小元素的索引，当k <= k, B[j] > C[k]。所以当j= j和k <= k时，B[j]不是最小的元素，所以我们不增加j。

Now suppose we increment k past K before i reaches I. What do things look like just before we perform that increment? Well, C[k] is the smallest of the three at that moment, because we are about to increment k. A[i] is less than or equal to A[I], because i < I and A is sorted. Finally, j <= J because k <= K (by our Lemma), so B[j] is also less than A[I]. Taken together, this means our sum-of-abs-diff at this moment is less than 2*(c-a), which is a contradiction.

现在假设我们在到达i之前增加k个k，那么在我们执行这个增量之前，情况是怎样的呢?C[k]是3的最小值，因为我们要增加k, A小于等于A[i]，因为i < i和A是有序的。最后，j <= j，因为k <= k(由我们的引理)，所以B[j]也小于A[I]。综上所述，这意味着我们在这一时刻的-abs-diff小于2*(c-a)，这是一个矛盾。

Thus, we do not increment k past K until i reaches I. Therefore, at some point during our march i=I and k=K. By our Lemma, at this point j is less than or equal to J. So at this point, either B[j] is less than the other two and j will get incremented; or B[j] is between the other two and our sum is just 2*(A[i]-C[k]), which is the right answer.

因此，在我到达i之前，我们不增加k的过去。因此，在我们的3月i= i和k= k的某个时刻。我们的引理，在这个点j小于等于j，所以在这一点上，B[j]小于其他2,j会增加;或者B[j]在另外两个和我们的和之间是2*(A[i]-C[k])，这是正确的答案。

This proof is sloppy; in particular, it fails to explicitly account for the case where one or more of a,b,c are equal. But I think that detail can be worked out pretty easily.

这个证明是草率;特别是，它不能明确地说明一个或多个a、b、c是相等的情况。但我认为细节可以很容易地解决。

#2

I would write a really simple program like this:

我要写一个非常简单的程序:

#!/usr/bin/python
import sys, os, random
A = random.sample(range(100), 10)
B = random.sample(range(100), 10)
C = random.sample(range(100), 10)
minsum = sys.maxint
for a in A:
 for b in B:
  for c in C:
   print 'checking with a=%d b=%d c=%d' % (a, b, c)
   abcsum = abs(a - b) + abs(b - c) + abs(c - a)
   if abcsum < minsum:
    print 'found new low sum %d with a=%d b=%d c=%d' % (abcsum, a, b, c)
    minsum = abcsum

And test it over and over until I saw some pattern emerge. The pattern I found here is what would be expected: the numbers that are closest together in each set, regardless of whether the numbers are "high" or "low", are those that produce the smallest minimum sum. So it becomes a nearest-number problem. For whatever that's worth, probably not much.

然后反复测试，直到我看到一些模式出现。我在这里找到的模式是期望的:在每个集合中最接近的数字，不管这些数字是“高”还是“低”，都是那些产生最小最小和的值。所以它变成了一个最接近的问题。不管它的价值是多少，也许并不多。

#1