Excel 2007 - Generate a unique ID based on text?

Time: 2022-11-25 11:30:47

I have a sheet with a list of names in column B and an ID column in A. Is there a formula that can take the value in column B of a row and generate an ID based on that text? Each name is unique and is never repeated.

Ideally I would not have to use VBA, but if I have to, so be it.

6 Solutions

#1


3  

Sorry, I didn't find a formula-only solution. This thread might help (it tries to calculate the points in a Scrabble game), but I didn't find a way to be sure the generated hash would be unique.

Still, here is my solution, based on a UDF (User-Defined Function):

Put the code in a module:

Public Function genId(ByVal sName As String) As Long
'Builds a numeric hash by summing each character's ASCII value weighted by
'its position. Collisions are possible, so uniqueness is not guaranteed.
    Dim i As Long 'Long rather than Integer so Asc(...) * i cannot overflow
    For i = 1 To Len(sName)
        genId = Asc(Mid$(sName, i, 1)) * i + genId
    Next i
End Function

And call it from the worksheet like any formula, pointing it at the name cell (column B in the question):

=genId(B1)

[EDIT] Added the * i so that character order is taken into account. It passes my unit tests.
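To see why the position-weighted sum cannot guarantee uniqueness, the same arithmetic can be modeled outside Excel. The sketch below is a Python model of the UDF, not part of the original answer: the name gen_id is an illustrative stand-in, and it assumes plain ASCII input so that ord matches VBA's Asc. It shows two different strings that produce the same ID.

```python
def gen_id(name: str) -> int:
    """Model of the genId UDF: sum of character code times 1-based position."""
    return sum(ord(ch) * i for i, ch in enumerate(name, start=1))


# "ab" -> 97*1 + 98*2 = 293 and "ca" -> 99*1 + 97*2 = 293: a collision.
print(gen_id("ab"), gen_id("ca"))
```

If collisions matter for your list, it is worth checking the generated IDs for duplicates before relying on them.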

#2


2  

Solution without VBA.

The logic is based on the first 8 characters plus the number of characters in the cell.

=CODE(cell) returns the code number of the first character.

=CODE(MID(cell,2,1)) returns the code number of the second character.

=IFERROR(CODE(MID(cell,9,1)),0) returns 0 if the 9th character does not exist.

=LEN(cell) returns the number of characters in the cell.

Concatenate the first 8 codes and append the length of the text at the end.

If 8 characters are not enough, add further codes for the following characters in the string.

Final function:

=CODE(B2)&IFERROR(CODE(MID(B2,2,1)),0)&IFERROR(CODE(MID(B2,3,1)),0)&IFERROR(CODE(MID(B2,4,1)),0)&IFERROR(CODE(MID(B2,5,1)),0)&IFERROR(CODE(MID(B2,6,1)),0)&IFERROR(CODE(MID(B2,7,1)),0)&IFERROR(CODE(MID(B2,8,1)),0)&LEN(B2)
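As a rough cross-check of what the formula produces, here is a Python model of it. This is only a sketch: code_concat_id and its width parameter are hypothetical names, and it assumes ASCII input, for which ord gives the same number as CODE.

```python
def code_concat_id(name: str, width: int = 8) -> str:
    """Model of the formula: codes of the first `width` characters, then the length."""
    if not name:
        # The formula's first CODE(B2) is not wrapped in IFERROR.
        raise ValueError("CODE() of an empty cell is an error in Excel")
    codes = [str(ord(ch)) for ch in name[:width]]
    codes += ["0"] * (width - len(codes))  # IFERROR(..., 0) for missing characters
    return "".join(codes) + str(len(name))


print(code_concat_id("Smith"))  # 831091051161040005
```

The result is text rather than a number, which is also true of the & concatenation in the worksheet formula.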

#3


0  

This may be over the top (OTT) for your needs, but you can call CoCreateGuid to get a real GUID:

Private Declare Function CoCreateGuid Lib "ole32" (ID As Any) As Long

Function GUID() As String
    Dim ID(0 To 15) As Byte
    Dim i As Long

    If CoCreateGuid(ID(0)) = 0 Then
        For i = 0 To 15
            GUID = GUID & Right$("0" & Hex$(ID(i)), 2) 'zero-pad each byte to two hex digits
        Next
    Else
        GUID = "Error while creating GUID!"
    End If

End Function

Test using:

Sub testGUID()
    MsgBox GUID
End Sub

How best to implement this depends on your needs. One way would be to write a macro that generates a GUID for each row and populates a column wherever a name exists. (Note: using it as a UDF as-is is no good, since it returns a new GUID every time the sheet recalculates.)
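The "generate once, then keep" idea behind the macro can be sketched independently of Excel. The Python below is only an illustration under assumptions: assign_guids is a hypothetical helper, and uuid.uuid4 stands in for CoCreateGuid. Names that already have an ID keep it, and only new names receive a fresh one.

```python
import uuid


def assign_guids(names, existing=None):
    """Return a name -> GUID mapping, creating IDs only for names that lack one."""
    ids = dict(existing or {})
    for name in names:
        if name and name not in ids:
            ids[name] = uuid.uuid4().hex.upper()  # 32 hex digits, like the VBA output
    return ids


first = assign_guids(["Ann", "Bob"])
second = assign_guids(["Ann", "Bob", "Cy"], existing=first)
print(second["Ann"] == first["Ann"])  # existing IDs are kept
```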

EDIT: See this answer for creating a SHA1 hash of a string.
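For comparison, a hash-based ID is deterministic: the same name always maps to the same value, so it survives recalculation. A minimal Python sketch using the standard hashlib module follows. It is not the linked answer's code; sha1_id and its length parameter are illustrative names, and UTF-8 encoding of the name is an assumption.

```python
import hashlib


def sha1_id(name: str, length: int = 40) -> str:
    """Hex SHA-1 digest of the UTF-8 encoded name, optionally truncated."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()[:length]


print(sha1_id("abc"))
```

Truncating the digest gives shorter IDs at the cost of a higher chance of two names sharing one.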

#4


0  

Do you just want an incrementing numeric ID column to sit next to your values? If so, and if your values will always be unique, you can do this very easily with formulas.

If your values are in column B, starting in B2 underneath your headers for example, type the formula =IF(B2="","",1+MAX(A$1:A1)) in A2. Copy it down as far as your data extends, and it will assign an incrementing numeric identifier to each row whose column B cell isn't blank.

If you need anything more complicated, like identifying and re-identifying repeated values, or making identifiers 'freeze' once they're populated, let me know. Currently, when you clear or add values in your list, the identifiers below the change shift up or down, so be careful if your data changes.
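The shifting behavior described above is easy to see in a small model. The Python sketch below imitates the fill-down formula (incremental_ids is a hypothetical name; blank cells are represented by empty strings) and shows how clearing one name renumbers the rows after it.

```python
def incremental_ids(names):
    """Model of =IF(B2="","",1+MAX(A$1:A1)) filled down column A."""
    ids = []
    for name in names:
        if name == "":
            ids.append("")
        else:
            previous = [v for v in ids if v != ""]  # MAX ignores text cells
            ids.append(1 + max(previous, default=0))
    return ids


print(incremental_ids(["Ann", "", "Bob", "Cy"]))  # [1, '', 2, 3]
print(incremental_ids(["", "", "Bob", "Cy"]))     # ['', '', 1, 2]
```

In the second call "Ann" has been cleared, and Bob's ID drops from 2 to 1, which is why these IDs should not be treated as permanent.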

#5


0  

A unique identifier based on the number of occurrences of specific characters in the text. I used an identifier based on vowels and digits. The formula below uses ; as the argument separator; replace it with , if your locale expects commas.

=LEN($J$14)-LEN(SUBSTITUTE($J$14;"a";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"e";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"i";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"j";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"o";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"u";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"y";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"1";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"2";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"3";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"4";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"5";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"6";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"7";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"8";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"9";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"0";""))
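A compact way to read that formula: for each character in a fixed list, count how often it occurs, then join the counts. The Python sketch below models this. The names CHARS and count_id are illustrative, and the counting is case-sensitive, as SUBSTITUTE is. It also shows that words built from the same letters get the same identifier, so uniqueness depends on the data.

```python
CHARS = "aeijouy1234567890"  # the characters counted by the formula, in order


def count_id(text: str) -> str:
    """Concatenate how often each character in CHARS occurs (case-sensitive)."""
    return "".join(str(text.count(ch)) for ch in CHARS)


print(count_id("anna1"))
print(count_id("dale") == count_id("lead"))  # anagrams share an ID
```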

#6


0  

You say you are confident that there are no duplicate values among your words. To push it further, are you confident that the first 8 characters of any word would be unique?

If so, you can use the formula below. It takes each character's ASCII code minus 40 [assuming normal characters, this puts digits between 8 and 17, and lowercase letters between 57 and 82], and multiplies that code by a power of 10 set by the character's position in the word (10^0, 10^2, 10^4, ...). In effect, it takes each character code [-40] and concatenates it onto the next as a two-digit group.

EDIT: Note that this formula no longer requires the word to have at least 8 characters to avoid an error, because eight "0"s are prepended to the word before its rightmost 8 characters are taken. For a word longer than 8 characters, that means the last 8 characters are the ones encoded.

=TEXT(SUM((CODE(MID(LOWER(RIGHT(REPT("0",8)&A3,8)),{1,2,3,4,5,6,7,8},1))-40)*10^{0,2,4,6,8,10,12,14}),"#")

Note that because this uses the ASCII values of the characters, the ID could be used to recover the name directly. This does not really create anonymity; it just turns 8 unique characters into a unique number. It is obfuscated by the -40, but not really 'safe' in that sense. The -40 only serves to bring normal letters and digits into the 2-digit range, so that multiplying by 10^0, 10^2, 10^4, etc. gives each character its own 2-digit slot in the resulting code.
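The two-digits-per-character packing can be checked with a small model. The Python below is a sketch under assumptions: last8_id is a hypothetical name, and the input is limited to the "normal" letters and digits the answer describes. Python integers are exact, whereas Excel documents 15 digits of number precision, so a 16-digit result like the one here may not keep its last digit in the sheet.

```python
def last8_id(name: str) -> str:
    """Model of the formula: two decimal digits per character of the last 8."""
    padded = ("0" * 8 + name)[-8:].lower()       # RIGHT(REPT("0",8)&A3,8), LOWER
    total = sum((ord(ch) - 40) * 10 ** (2 * k)   # weights 10^0, 10^2, ..., 10^14
                for k, ch in enumerate(padded))
    return str(total)


print(last8_id("ab"))  # 5857080808080808
```

Reading the output from the right, each pair of digits is one character: six 08 groups for the padding zeros, then 57 for "a" and 58 for "b".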

EDIT: ALTERNATIVE METHOD

I had previously attempted to do this by looking at each letter of the alphabet, counting the number of times it appears in the word, and multiplying that count by 10^[that letter's position in the alphabet]. The problem (see the comment below for the formula) is that it requires numbers up to 10^26-1, which is beyond Excel's floating-point precision. However, I have a modified version of that method:

By limiting the alphabet to 15 allowed characters, we can cap the maximum possible total at 10^15-1, which Excel can calculate exactly. The formula looks like this:

=RIGHT(REPT("0",15)&TEXT(SUM(LEN(A3)*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}-LEN(SUBSTITUTE(A3,MID(Alphabet,{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15},1),""))*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}),"#"),15)

[The RIGHT(REPT("0",15)&... portion of the formula pads every code to the same number of characters.]

Note that here, Alphabet is a named string that holds the characters "abcdehilmnorstu". For example, using the formula above on the word "asdf" counts the instances of a, s, and d, but not f, which isn't in my reduced alphabet. The code for "asdf" would be:

001000000001001

This only works under the following assumptions:

The letters not listed (and digits / special characters) are not needed to make each name unique. For example, asdf and asd get the same code with the method above.

And,

The order of the letters is not needed to make each name unique. For example, asd and dsa get the same code with the method above.
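Both assumptions can be confirmed with a short model of the reduced-alphabet formula. The Python below is a sketch: letter_count_id is a hypothetical name, ALPHABET mirrors the named string, and counting is case-sensitive like SUBSTITUTE. It reproduces the code for "asdf" and shows that "asd" and "dsa" collide with it.

```python
ALPHABET = "abcdehilmnorstu"  # the 15 letters held in the named string


def letter_count_id(name: str) -> str:
    """One decimal digit per ALPHABET letter: its count, weighted by 10^position."""
    total = sum(name.count(ch) * 10 ** k for k, ch in enumerate(ALPHABET))
    return str(total).zfill(15)[-15:]


for word in ("asdf", "asd", "dsa"):
    print(word, letter_count_id(word))  # all three give 001000000001001
```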
