两个整数的唯一对[重复]

时间:2023-02-07 12:48:07

Possible Duplicate:
Mapping two integers to one, in a unique and deterministic way

可能的重复:以唯一和确定的方式将两个整数映射为一个

I'm trying to create unique identificator for pair of two integers (Ruby) :

我正在为两个整数(Ruby)创建唯一标识符:

f(i1,i2) = f(i2, i1) = some_unique_value

So, i1+i2, i1*i2, i1^i2 -not unique as well as (i1>i2) ? "i1" + "i2" : "i2" + "i1".

i1和i2,i1 * i2 i1和i2——独特以及^(i1和i2)> ?“i1”+“i2”:“i2”+“i1”。

I think following solution will be ok:

我认为以下的解决方案可以:

(i1>i2) ? "i1" + "_" + "i2" : "i2" + "_" + "i1"

but:

但是:

  1. I have to save result in DB and index it. So I prefer it to be an integer and as small as it possible.
  2. 我必须将结果保存在DB中并进行索引。所以我希望它是一个尽可能小的整数。
  3. Is Zlib.crc32(f(i1,i2)) can guaranty uniqueness?
  4. Zlib.crc32(f(i1,i2))是否可以保证唯一性?

Thanks.

谢谢。

UPD:

乌利希期刊指南:

Actually, I'm not sure the result MUST be integer. Maybe I can convert it to decimal: (i1>i2) ? i1.i2 : i2.i1

实际上,我不确定结果一定是整数。或许我可以把它转换成十进制(i1>i2) ?i1。i2:i2.i1

?

吗?

5 个解决方案

#1


5  

What you're looking for is called a Pairing function.

你要找的是一个配对函数。

The following illustration from the German wikipedia page clearly shows how it works:

下面来自德国*页面的插图清楚地显示了它是如何工作的:

两个整数的唯一对[重复]

Implemented in Ruby:

在Ruby中实现:

def cantor_pairing(n, m)
    (n + m) * (n + m + 1) / 2 + m
end

(0..5).map do |n|
  (0..5).map do |m|
    cantor_pairing(n, m)
  end
end
=> [[ 0,  2,  5,  9, 14, 20],
    [ 1,  4,  8, 13, 19, 26],
    [ 3,  7, 12, 18, 25, 33],
    [ 6, 11, 17, 24, 32, 41],
    [10, 16, 23, 31, 40, 50],
    [15, 22, 30, 39, 49, 60]]

Note that you will need to store the result of this pairing in a datatype with as many bits as both your input numbers put together. (If both input numbers are 32-bit, you will need a 64-bit datatype to be able to store all possible combinations, obviously.)

请注意,您将需要将这种配对的结果存储在一个数据类型中,该数据类型的位数与您的输入数字加在一起的位数相同。(如果两个输入数字都是32位,则需要64位数据类型才能存储所有可能的组合。)

#2


2  

No, Zlib.crc32(f(i1,i2)) is not unique for all integer values of i1 and i2.

不,Zlib.crc32(f(i1,i2))对于i1和i2的所有整数值不是唯一的。

If i1 and i2 are also 32bit numbers then there are many more combinations of them than can be stored in a 32bit number, which is returned by CRC32.

如果i1和i2也是32位的数字,那么它们的组合要比存储在32位的数字(CRC32返回)中多得多。

#3


1  

CRC32 is not unique, and wouldn't be good to use as a key. Assuming you know the maximum value of your integers i1 and i2:

CRC32并不是唯一的,也不适合用作键。假设已知整数i1和i2的最大值:

unique_id = (max_i2+1)*i1 + i2

If your integers can be negative, or will never be below a certain positive integer, you'll need the max and min values:

如果你的整数可以是负的,或者永远不会低于某个正整数,你将需要最大值和最小值:

(max_i2-min_i2+1) * (i1-min_i1) + (i2-min_i2)

This will give you the absolute smallest number possible to identify both integers.

这将给你确定两个整数的最小的数。

#4


1  

Well, no 4-byte hash will be unique when its input is an arbitrary binary string of more than 4 bytes. Your strings are from a highly restricted symbol set, so collisions will be fewer, but "no, not unique".

当它的输入是一个大于4字节的任意二进制字符串时,4字节的哈希不会是唯一的。您的字符串来自高度受限的符号集,因此冲突会更少,但是“不,不是唯一的”。

There are two ways to use a smaller integer than the possible range of values for both of your integers:

对于两个整数,有两种方法可以使用比可能的值范围更小的整数:

  1. Have a system that works despite occasional collisions
  2. 是否有一个系统,尽管偶尔发生碰撞?
  3. Check for collisions and use some sort of rehash
  4. 检查冲突并使用某种重新哈希

The obvious way to solve your problem with a 1:1 mapping requires that you know the maximum value of one of the integers. Just multiply one by the maximum value and add the other, or determine a power of two ceiling, shift one value accordingly, then OR in the other. Either way, every bit is reserved for one or the other of the integers. This may or may not meet your "as small as possible" requirement.

用1:1映射来解决问题的明显方法是,您需要知道其中一个整数的最大值。只需将一个值与最大值相乘,再加上另一个值,或者确定两个上限的幂,相应地移动一个值,然后在另一个值中移动。无论哪种方式,每个位都保留给一个或另一个整数。这可能满足你的“尽可能小的”要求,也可能不满足。

Your ###_### string is unique per pair; if you could just store that as a string you win.

您的### ### #字符串每对都是唯一的;如果你可以把它存储为一个字符串你就赢了。

#5


0  

Here's a better, more space efficient solution:. My answer on it here

这里有一个更好、更有效的解决方案:。我的答案在这里。

#1


5  

What you're looking for is called a Pairing function.

你要找的是一个配对函数。

The following illustration from the German wikipedia page clearly shows how it works:

下面来自德国*页面的插图清楚地显示了它是如何工作的:

两个整数的唯一对[重复]

Implemented in Ruby:

在Ruby中实现:

def cantor_pairing(n, m)
    (n + m) * (n + m + 1) / 2 + m
end

(0..5).map do |n|
  (0..5).map do |m|
    cantor_pairing(n, m)
  end
end
=> [[ 0,  2,  5,  9, 14, 20],
    [ 1,  4,  8, 13, 19, 26],
    [ 3,  7, 12, 18, 25, 33],
    [ 6, 11, 17, 24, 32, 41],
    [10, 16, 23, 31, 40, 50],
    [15, 22, 30, 39, 49, 60]]

Note that you will need to store the result of this pairing in a datatype with as many bits as both your input numbers put together. (If both input numbers are 32-bit, you will need a 64-bit datatype to be able to store all possible combinations, obviously.)

请注意,您将需要将这种配对的结果存储在一个数据类型中,该数据类型的位数与您的输入数字加在一起的位数相同。(如果两个输入数字都是32位,则需要64位数据类型才能存储所有可能的组合。)

#2


2  

No, Zlib.crc32(f(i1,i2)) is not unique for all integer values of i1 and i2.

不,Zlib.crc32(f(i1,i2))对于i1和i2的所有整数值不是唯一的。

If i1 and i2 are also 32bit numbers then there are many more combinations of them than can be stored in a 32bit number, which is returned by CRC32.

如果i1和i2也是32位的数字,那么它们的组合要比存储在32位的数字(CRC32返回)中多得多。

#3


1  

CRC32 is not unique, and wouldn't be good to use as a key. Assuming you know the maximum value of your integers i1 and i2:

CRC32并不是唯一的,也不适合用作键。假设已知整数i1和i2的最大值:

unique_id = (max_i2+1)*i1 + i2

If your integers can be negative, or will never be below a certain positive integer, you'll need the max and min values:

如果你的整数可以是负的,或者永远不会低于某个正整数,你将需要最大值和最小值:

(max_i2-min_i2+1) * (i1-min_i1) + (i2-min_i2)

This will give you the absolute smallest number possible to identify both integers.

这将给你确定两个整数的最小的数。

#4


1  

Well, no 4-byte hash will be unique when its input is an arbitrary binary string of more than 4 bytes. Your strings are from a highly restricted symbol set, so collisions will be fewer, but "no, not unique".

当它的输入是一个大于4字节的任意二进制字符串时,4字节的哈希不会是唯一的。您的字符串来自高度受限的符号集,因此冲突会更少,但是“不,不是唯一的”。

There are two ways to use a smaller integer than the possible range of values for both of your integers:

对于两个整数,有两种方法可以使用比可能的值范围更小的整数:

  1. Have a system that works despite occasional collisions
  2. 是否有一个系统,尽管偶尔发生碰撞?
  3. Check for collisions and use some sort of rehash
  4. 检查冲突并使用某种重新哈希

The obvious way to solve your problem with a 1:1 mapping requires that you know the maximum value of one of the integers. Just multiply one by the maximum value and add the other, or determine a power of two ceiling, shift one value accordingly, then OR in the other. Either way, every bit is reserved for one or the other of the integers. This may or may not meet your "as small as possible" requirement.

用1:1映射来解决问题的明显方法是,您需要知道其中一个整数的最大值。只需将一个值与最大值相乘,再加上另一个值,或者确定两个上限的幂,相应地移动一个值,然后在另一个值中移动。无论哪种方式,每个位都保留给一个或另一个整数。这可能满足你的“尽可能小的”要求,也可能不满足。

Your ###_### string is unique per pair; if you could just store that as a string you win.

您的### ### #字符串每对都是唯一的;如果你可以把它存储为一个字符串你就赢了。

#5


0  

Here's a better, more space efficient solution:. My answer on it here

这里有一个更好、更有效的解决方案:。我的答案在这里。