I am sure this must be quite a common problem, so I would guess that Microsoft has already solved it; my Googling skills are just not up to scratch. I have a varchar field that I want to order by, for example:
- Q
- Num 10
- Num 1
- A
- Num 9
- Num 2
- F
Now I would expect the result to be
- A
- F
- Num 1
- Num 2
- Num 9
- Num 10
- Q
But it is not. It is as follows (notice that Num 10 comes after Num 1, and not after Num 9 as expected):
- A
- F
- Num 1
- Num 10
- Num 2
- Num 9
- Q
Now, I know the reason for this, so you don't need to explain :) But I can't remember how to solve it, or whether there is a nice flag or command that I can use to get it right.
EDIT:
The values above are just an example. The column could contain any value: any combination of letters and digits. Is there a way to sort this in human alphabetical order instead of by ASCII value?
EDIT 2: Thanks for the answers so far. I am talking about ANY arbitrary data. If the number were in a fixed position or preceded by something, it would be easy and I wouldn't be asking. I am asking for a general solution to this problem with ANY arbitrary data. No patterns, no rules, no nothing.
4 answers
#1
2
This is the age-old problem of ASCII sort order vs. natural sort order.
See http://www.codinghorror.com/blog/archives/001018.html for further details.
#2
1
You added
The column could contain any value. Any combination of letters and digits
So, where do you want "foo1bar" and "foo10bar" for example? Or "foo10bar11" and "foo10bar1"? Or "Foo Two" and "Foo Three"?
There is no sensible solution without sensible data. You have random data. Define "human readable".
#3
1
"I am asking for a general solution to this problem with ANY arbitrary data. No patterns, no rules, no nothing."
The problem is, programming is all about finding patterns, deriving rules from the patterns and applying solutions based on those rules. So without those prerequisites your question is pretty tough.
Essentially, what you have to do is tokenize your sort string into chunks of pure letters and chunks of pure digits, and apply a different sort order to each category. That is doable provided you have some kind of pattern, e.g.
AAA999AA
A9AAAAA
A999A
but it would require a bespoke solution for each pattern. A general solution for any arbitrary arrangement of data is a big ask.
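The tokenizing idea described above can be sketched outside the database. This is only an illustration in Python, not T-SQL and not anything posted in the thread; the helper name `natural_key` and the choice to sort digit runs before text at the same position are assumptions made for the example.

```python
import re

# Capturing group so re.split keeps the digit runs in the result.
_DIGITS = re.compile(r"([0-9]+)")


def natural_key(text):
    """Hypothetical helper: turn a string into a list of comparable chunks."""
    key = []
    # With one capturing group, odd positions are digit runs, even ones are text.
    for position, chunk in enumerate(_DIGITS.split(text)):
        if not chunk:
            continue  # empty text chunk at the start or end of the string
        if position % 2 == 1:
            key.append((0, int(chunk), ""))  # digit run: compare as a number
        else:
            key.append((1, 0, chunk.casefold()))  # text: compare ignoring case
    return key


values = ["Q", "Num 10", "Num 1", "A", "Num 9", "Num 2", "F"]
print(sorted(values, key=natural_key))
# ['A', 'F', 'Num 1', 'Num 2', 'Num 9', 'Num 10', 'Q']
```

Under these rules "foo1bar" sorts before "foo10bar" and "foo10bar1" before "foo10bar11", but "Foo Three" still sorts before "Foo Two", since spelled-out numbers are ordinary text to the tokenizer. That is the kind of decision answer #2 asks the poster to define.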
#4
1
If the field always ends with the number, preceded by a space and possibly one word, you could use CHARINDEX/SUBSTRING to solve this.
Here is an example:
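-- Sample data from the question, built inline as a derived table
select *
from (
    select 'Q' x
    union
    select 'Num 10'
    union
    select 'Num 1'
    union
    select 'A'
    union
    select 'Num 9'
    union
    select 'Num 2'
    union
    select 'F'
) a
order by
    -- First key: the text before the first space, or the whole value if there is none
    case
        when CHARINDEX(' ', x) <> 0 then LEFT(x, CHARINDEX(' ', x) - 1)
        else x
    end,
    -- Second key: the text after the first space, compared as an integer
    -- (an empty string casts to 0 in SQL Server)
    cast(case
        when CHARINDEX(' ', x) <> 0 then substring(x, CHARINDEX(' ', x) + 1, LEN(x) - CHARINDEX(' ', x))
        else ''
    end as int)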
The output from this is:
A
F
Num 1
Num 2
Num 9
Num 10
Q
Edit:
Since your data is not consistent enough for a hard-coded approach, the solution calls for more drastic measures. I have experimented with T-SQL based functions that give a form of natural sort, but found them far too slow to be usable. Instead, I wrote a CLR-based function, and it performs very well. The function returns a scalar value that you can sort on. You'll find the code and installation instructions over here.
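The linked code is not reproduced in the thread, so the following is not that function. It is a minimal Python sketch of one common way to get a scalar value that sorts naturally: left-pad every run of digits with zeros so that an ordinary string comparison orders the numbers correctly. The name `padded_sort_key` and the default `width` of 10 are assumptions for illustration.

```python
import re


def padded_sort_key(text, width=10):
    """Hypothetical helper: zero-pad each run of ASCII digits to `width` characters."""
    return re.sub(r"[0-9]+", lambda match: match.group().zfill(width), text)


values = ["Q", "Num 10", "Num 1", "A", "Num 9", "Num 2", "F"]
print(padded_sort_key("Num 10", width=4))
# Num 0010
print(sorted(values, key=padded_sort_key))
# ['A', 'F', 'Num 1', 'Num 2', 'Num 9', 'Num 10', 'Q']
```

The trade-off is that the key is only correct while no digit run is longer than `width`, and values that differ only in leading zeros (such as "7" and "007") will not be told apart when `width` is large enough to pad both.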