How to sort strings in JavaScript

Time: 2021-02-18 15:59:29

I have a list of objects I wish to sort based on a field attr of type string. I tried using the - operator:

list.sort(function (a, b) {
    return a.attr - b.attr
})

but found that - doesn't appear to work with strings in JavaScript. How can I sort a list of objects based on an attribute with type string?

8 solutions

#1


352  

Use String.prototype.localeCompare, as per your example:

list.sort(function (a, b) {
    return ('' + a.attr).localeCompare(b.attr);
})

We force a.attr to be a string to avoid exceptions. localeCompare has been supported since Internet Explorer 6 and Firefox 1. You may also see the following code used that doesn't respect a locale:

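list.sort(function (item1, item2) {
    // Plain relational operators compare UTF-16 code units and ignore the locale.
    if (item1.attr < item2.attr)
        return -1;
    if (item1.attr > item2.attr)
        return 1;
    return 0;
})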
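
To make "doesn't respect a locale" concrete, here is a small sketch that sorts the same array both ways. The sample data and the helper name compareCodeUnits are made up for illustration. The localeCompare order depends on the runtime's collation data, so it is printed without a claimed result.

```javascript
// Hypothetical sample data, not taken from the question.
const words = ['b', 'a', 'B', 'A'];

// < and > compare UTF-16 code units, so every uppercase ASCII
// letter sorts before every lowercase one.
function compareCodeUnits(x, y) {
    if (x < y) return -1;
    if (x > y) return 1;
    return 0;
}

// slice() copies the array so the original order is left untouched.
const byCodeUnit = words.slice().sort(compareCodeUnits);
console.log(byCodeUnit); // [ 'A', 'B', 'a', 'b' ]

// localeCompare uses locale-aware collation; the exact order is
// implementation dependent, so it is only printed here.
const byLocale = words.slice().sort(function (x, y) {
    return x.localeCompare(y);
});
console.log(byLocale);
```

If the list only ever holds plain ASCII identifiers, both approaches are likely to be acceptable; the difference starts to matter with mixed case and accented characters.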

#2


107  

An updated answer (October 2014)

I was really annoyed about this string natural sorting order so I took quite some time to investigate this issue. I hope this helps.

Long story short

localeCompare() character support is badass, just use it. As pointed out by Shog9, the answer to your question is:

return item1.attr.localeCompare(item2.attr);

Bugs found in all the custom javascript "natural string sort order" implementations

There are quite a bunch of custom implementations out there that try to do a string comparison more precisely called "natural string sort order".

When "playing" with these implementations, I always noticed some strange "natural sorting order" choice, or rather mistakes (or omissions in the best cases).

Typically, special characters (space, dash, ampersand, brackets, and so on) are not processed correctly.

You will then find them appearing mixed up in different places, typically that could be:

  • some will be between the uppercase 'Z' and the lowercase 'a'
  • some will be between the '9' and the uppercase 'A'
  • some will be after lowercase 'z'

One would have expected special characters to all be "grouped" together in one place, except perhaps for the space character (which would always come first). That is, either all before numbers, or all between numbers and letters (lowercase & uppercase being "together" one after another), or all after letters.

My conclusion is that they all fail to provide a consistent order when I start adding barely unusual characters (i.e. characters with diacritics, or characters such as dash, exclamation mark and so on).

Research on the custom implementations:

Browsers' native "natural string sort order" implementations via localeCompare()

The oldest implementation of localeCompare() (without the locales and options arguments) is supported by IE6+, see http://msdn.microsoft.com/en-us/library/ie/s4esdbwz(v=vs.94).aspx (scroll down to the localeCompare() method). The built-in localeCompare() method does a much better job at sorting, even with international & special characters. The only problem with the localeCompare() method is that "the locale and sort order used are entirely implementation dependent". In other words, when calling stringOne.localeCompare(stringTwo), Firefox, Safari, Chrome & IE may each use a different sort order for strings.
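
As a sketch of the locales and options arguments mentioned above: newer engines also expose the same collation through Intl.Collator. The locale 'en', the option values and the sample names below are illustrative assumptions, not something the answer prescribes. Pinning the locale removes the dependence on the user's default locale, although engines can still ship different collation data.

```javascript
// Assumption: the runtime supports the ECMAScript Internationalization API (Intl).
// numeric: true compares digit runs as numbers; sensitivity: 'base' ignores case and accents.
const collator = new Intl.Collator('en', { numeric: true, sensitivity: 'base' });

// Hypothetical sample data.
const names = ['item10', 'item2', 'Item1'];

// collator.compare is already bound to the collator, so it can be passed directly.
const sortedNames = names.slice().sort(collator.compare);
console.log(sortedNames); // [ 'Item1', 'item2', 'item10' ]

// The same settings can be passed to localeCompare itself.
console.log('item2'.localeCompare('item10', 'en', { numeric: true }) < 0); // true
```

Reusing one collator object is usually preferable for large arrays, since localeCompare with arguments has to resolve the locale and options on every call.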

Research on the browser-native implementations:

Difficulty of "string natural sorting order"

Implementing a solid algorithm (meaning: consistent but also covering a wide range of characters) is a very tough task. Unicode defines more than 100,000 characters & covers more than 120 scripts (writing systems). Fortunately, there is a specification for this task, called the "Unicode Collation Algorithm", which can be found at http://www.unicode.org/reports/tr10/ . You can find more information about this in this question I posted: https://softwareengineering.stackexchange.com/questions/257286/is-there-any-language-agnostic-specification-for-string-natural-sorting-order

Final conclusion

So considering the current level of support provided by the javascript custom implementations I came across, we will probably never see anything getting close to supporting all these characters & scripts (languages). Hence I would rather use the browsers' native localeCompare() method. Yes, it does have the downside of being inconsistent across browsers, but basic testing shows it covers a much wider range of characters, allowing solid & meaningful sort orders.

So as pointed out by Shog9, the answer to your question is:

return item1.attr.localeCompare(item2.attr);
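
For completeness, that one-liner could be wrapped into a reusable comparator along these lines. This is only a sketch: the helper name compareByString and the sample list are hypothetical, and the String() coercion is an added guard for values that are not strings.

```javascript
// Hypothetical helper: builds a comparator for a string-valued property.
function compareByString(key) {
    return function (item1, item2) {
        // String() guards against non-string values such as numbers or undefined,
        // which would otherwise throw when localeCompare is looked up.
        return String(item1[key]).localeCompare(String(item2[key]));
    };
}

// Hypothetical sample data.
const list = [{ attr: 'pear' }, { attr: 'apple' }, { attr: 'fig' }];

// Array.prototype.sort sorts in place.
list.sort(compareByString('attr'));
console.log(list.map(function (item) { return item.attr; })); // [ 'apple', 'fig', 'pear' ]
```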

Further reading:

Thanks to Shog9's nice answer, which put me in the "right" direction, I believe.

#3


9  

Simplest Answer with ECMAScript 2016

list.sort((a, b) => (a.attr > b.attr) - (a.attr < b.attr))

Or

list.sort((a, b) => +(a.attr > b.attr) || -(a.attr < b.attr))
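
Both one-liners rely on booleans being converted to numbers (true becomes 1, false becomes 0). A small sketch with the comparison pulled out into named functions may make that easier to follow; the function names are made up for illustration.

```javascript
// (a > b) - (a < b): at most one of the two comparisons is true,
// so the result is 1 - 0, 0 - 1 or 0 - 0.
function compareStrings(a, b) {
    return (a > b) - (a < b);
}

console.log(compareStrings('b', 'a')); // 1
console.log(compareStrings('a', 'b')); // -1
console.log(compareStrings('a', 'a')); // 0

// +(a > b) || -(a < b): if the first term is 0, fall through to the second.
// For equal strings this yields -0, which sort() treats the same as 0.
function compareStringsAlt(a, b) {
    return +(a > b) || -(a < b);
}

console.log(compareStringsAlt('b', 'a')); // 1
console.log(compareStringsAlt('a', 'b')); // -1
```

Like any comparison built on > and <, both variants order strings by UTF-16 code units rather than by locale.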

#4


6  

You should use > or < and == here. So the solution would be:

list.sort(function(item1, item2) {
    var val1 = item1.attr,
        val2 = item2.attr;
    if (val1 == val2) return 0;
    if (val1 > val2) return 1;
    if (val1 < val2) return -1;
});

#5


4  

I had been bothered by this for a long time, so I finally researched it, and here is the long-winded reason why things are the way they are.

From the spec:

Section 11.9.4   The Strict Equals Operator ( === )

The production EqualityExpression : EqualityExpression === RelationalExpression
is evaluated as follows: 
- Let lref be the result of evaluating EqualityExpression.
- Let lval be GetValue(lref).
- Let rref be the result of evaluating RelationalExpression.
- Let rval be GetValue(rref).
- Return the result of performing the strict equality comparison 
  rval === lval. (See 11.9.6)

So now we go to 11.9.6

11.9.6   The Strict Equality Comparison Algorithm

The comparison x === y, where x and y are values, produces true or false. 
Such a comparison is performed as follows: 
- If Type(x) is different from Type(y), return false.
- If Type(x) is Undefined, return true.
- If Type(x) is Null, return true.
- If Type(x) is Number, then
...
- If Type(x) is String, then return true if x and y are exactly the 
  same sequence of characters (same length and same characters in 
  corresponding positions); otherwise, return false.

That's it. The triple equals operator applied to strings returns true iff the arguments are exactly the same strings (same length and same characters in corresponding positions).

So === will work in the cases when we're trying to compare strings which might have arrived from different sources, but which we know will eventually have the same values - a common enough scenario for inline strings in our code. For example, if we have a variable named connection_state, and we wish to know which one of the following states ['connecting', 'connected', 'disconnecting', 'disconnected'] is it in right now, we can directly use the ===.

But there's more. Just above 11.9.4, there is a short note:

NOTE 4     
  Comparison of Strings uses a simple equality test on sequences of code 
  unit values. There is no attempt to use the more complex, semantically oriented
  definitions of character or string equality and collating order defined in the 
  Unicode specification. Therefore Strings values that are canonically equal
  according to the Unicode standard could test as unequal. In effect this 
  algorithm assumes that both Strings are already in normalized form.

Hmm. What now? Externally obtained strings can, and most likely will, be weird unicodey, and our gentle === won't do them justice. In comes localeCompare to the rescue:

15.5.4.9   String.prototype.localeCompare (that)
    ...
    The actual return values are implementation-defined to permit implementers 
    to encode additional information in the value, but the function is required 
    to define a total ordering on all Strings and to return 0 when comparing
    Strings that are considered canonically equivalent by the Unicode standard. 

We can go home now.

tl;dr;

To compare strings in javascript, use localeCompare; if you know that the strings have no non-ASCII components because they are, for example, internal program constants, then === also works.
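
The note about canonically equal strings can be illustrated with a short sketch. The example character is my own choice, and String.prototype.normalize comes from ECMAScript 2015, which is newer than the spec sections quoted above. The localeCompare result is printed rather than stated, since it comes from the engine's collation.

```javascript
// Two canonically equivalent spellings of "é".
const precomposed = '\u00e9'; // single code point U+00E9
const decomposed = 'e\u0301'; // 'e' followed by U+0301 COMBINING ACUTE ACCENT

// === compares code unit sequences, which differ here.
console.log(precomposed === decomposed); // false

// normalize() converts both to the same form (NFC by default).
console.log(precomposed.normalize() === decomposed.normalize()); // true

// localeCompare is specified to return 0 for canonically equivalent strings.
console.log(precomposed.localeCompare(decomposed));
```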

#6


0  

In your initial question, you are performing the following operation:

item1.attr - item2.attr

So, assuming those are numeric strings (e.g. item1.attr = "1", item2.attr = "2"), you may still use the "===" operator (or other strict evaluators) provided that you ensure the type, for example by parsing the strings into numbers first. The following should work:

return parseInt(item1.attr, 10) - parseInt(item2.attr, 10);

If they are alphanumeric, then do use localeCompare().
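
A minimal sketch of why the subtraction in the question fails and how the parsing approach behaves, using made-up sample data:

```javascript
// Subtracting non-numeric strings yields NaN, which sort() cannot use
// to decide an order.
console.log('apple' - 'banana'); // NaN

// Hypothetical data: attr holds numeric strings.
const items = [{ attr: '10' }, { attr: '9' }, { attr: '100' }];

// Parsing first gives a numeric order instead of a character-by-character one.
const sortedItems = items.slice().sort(function (item1, item2) {
    return parseInt(item1.attr, 10) - parseInt(item2.attr, 10);
});

console.log(sortedItems.map(function (item) { return item.attr; })); // [ '9', '10', '100' ]
```

The explicit radix of 10 is a defensive habit so that the parse never depends on how the string happens to be prefixed.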

#7


0  

list.sort(function(item1, item2){
    return +(item1.attr > item2.attr) || +(item1.attr === item2.attr) - 1;
}) 

Samples showing how it works:

+('aaa'>'bbb')||+('aaa'==='bbb')-1
+(false)||+(false)-1
0||0-1
-1

+('bbb'>'aaa')||+('bbb'==='aaa')-1
+(true)||+(false)-1
1||0-1
1

+('aaa'>'aaa')||+('aaa'==='aaa')-1
+(false)||+(true)-1
0||1-1
0

#8


0  

<!doctype html>
<html>
<body>
<p id = "myString">zyxtspqnmdba</p>
<p id = "orderedString"></p>
<script>
var myString = document.getElementById("myString").innerHTML;
orderString(myString);
function orderString(str) {
    var i = 0;
    var myArray = str.split("");
    while (i < str.length){
        var j = i + 1;
        while (j < str.length) {
            if (myArray[j] < myArray[i]){
                var temp = myArray[i];
                myArray[i] = myArray[j];
                myArray[j] = temp;
            }
            j++;
        }
        i++;
    }
    var newString = myArray.join("");
    document.getElementById("orderedString").innerHTML = newString;
}
</script>
</body>
</html>
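
The page above sorts the characters of a single string with a hand-written exchange sort. As a shorter sketch of the same idea, the built-in Array.prototype.sort can do the work. This version reuses the name orderString but, unlike the original, returns the result instead of writing it to the DOM.

```javascript
// Sorts the characters of a string by UTF-16 code unit order.
function orderString(str) {
    // split('') produces one element per code unit, sort() with no
    // comparator orders them as strings, and join('') glues them back.
    return str.split('').sort().join('');
}

console.log(orderString('zyxtspqnmdba')); // abdmnpqstxyz
```

Because split('') works on code units, this is only suitable for strings without surrogate pairs or combining marks.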
