I have an array containing Roman numerals (as strings of course). Like this:
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII');
I'd like to sort them according to the numeric values of these numerals, so the results should be something like:
$sorted_a = array('III', 'V', 'XIII', 'XIX', 'LII', 'MCCXCIV');
So my question is: what is the best way to sort an array of Roman numerals? I know how to use PHP's array sorting functions; what I'm interested in is the logic that goes inside the comparison function.
EDIT: For simplicity, I'm only looking for a way that deals with strings constructed from the basic numerals in the standard way (no CCCC, for example):
I, V, X, L, C, D, M
TEST RESULTS
I took the time to test all the posted code examples extensively. I ran two tests: one with a random array of 20 Roman numerals, and a second with an array of 4000. Everything ran on the same machine over many iterations, the times were averaged, and the whole procedure was repeated several times. Of course this is nothing official, just my own tests.
TEST WITH 20 NUMERALS:
- hakre, bazmegakapa - around 0.0005 s
- anemgyenge, Andrea, Dirk McQuickly - around 0.0010 s
- Joe Nelson - around 0.0050 s
- Rob Hruska - around 0.0100 s
TEST WITH 4000 NUMERALS:
- hakre, bazmegakapa - around 0.13 s
- anemgyenge - around 1.4 s
- Dirk McQuickly, Andrea - around 1.8 s
- Rob Hruska - around 2.8 s
- Joe Nelson - around 15 s (surprise, checked several more times)
I have a hard time awarding the bounty. hakre and I made the fastest versions, following the same route, but his is a variation of mine, which in turn was based on borrible's idea. So I will accept hakre's solution, because it is the quickest and nicer than mine (IMO). But I will award the bounty to anemgyenge, because I love his version and a lot of effort seems to have been put into it.
10 Solutions
#1
26
Using your class that converts Roman numerals to integers, a user-defined sort callback can sort the array:
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII');
$bool = usort($a, function($a, $b) {
    return RomanNumber::Roman2Int($a) - RomanNumber::Roman2Int($b);
});
var_dump($a);
So here you find the logic inside the comparison function: if both values have the same weight, return 0. If the first is lower than the second, return a value < 0 (e.g. -1); otherwise the first is larger than the second, so return a value > 0 (e.g. 1).
Naturally, any other function that returns the decimal value of a Roman numeral would work as well.
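To make the callback concrete, here is a minimal self-contained sketch. The function roman_to_int() is a hypothetical stand-in for RomanNumber::Roman2Int, which is not shown in this thread, and it assumes standard numerals built only from I, V, X, L, C, D and M.

```php
<?php
// Hypothetical stand-in for RomanNumber::Roman2Int (not from the original answer).
// Scans right to left: a symbol smaller than its right neighbour is subtracted.
function roman_to_int(string $roman): int
{
    $values = ['I' => 1, 'V' => 5, 'X' => 10, 'L' => 50, 'C' => 100, 'D' => 500, 'M' => 1000];
    $total = 0;
    $previous = 0;
    for ($i = strlen($roman) - 1; $i >= 0; $i--) {
        $value = $values[$roman[$i]];
        $total += ($value < $previous) ? -$value : $value;
        $previous = $value;
    }
    return $total;
}

$a = ['XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII'];

// The spaceship operator yields -1, 0 or 1, which is what usort() expects.
usort($a, function (string $x, string $y): int {
    return roman_to_int($x) <=> roman_to_int($y);
});

print_r($a); // III, V, XIII, XIX, LII, MCCXCIV
```

Using <=> instead of a subtraction is a design choice of this sketch; both give a negative, zero or positive result for integers.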
Edit:
As you commented, you do not want to run the conversion for each pair. That's fine: with the help of an additional array that contains all the converted values, you can run the sort on the decimal values and apply the same ordering to the Roman numerals as well (Demo):
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII');
$b = array_map('RomanNumber::Roman2Int', $a);
array_multisort($b, $a);
var_dump($a);
array_multisort (PHP Manual) does most of the magic here.
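To see what array_multisort does in that snippet, the sketch below writes the converted values out by hand instead of calling a converter. That hard-coding is an assumption made only to keep the example self-contained.

```php
<?php
$romans   = ['XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII'];
// The values a converter would return for $romans, written out by hand.
$decimals = [19, 52, 5, 1294, 3, 13];

// Sorts $decimals ascending and applies the same reordering to $romans.
array_multisort($decimals, $romans);

print_r($decimals); // 3, 5, 13, 19, 52, 1294
print_r($romans);   // III, V, XIII, XIX, LII, MCCXCIV
```

Each numeral is converted once, so the number of conversions grows with the array size rather than with the number of comparisons.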
#2
10
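function sortRomanNum($a, $b) {
    if($a == $b) return 0;

    // Symbols in ascending order; "0" is the padding character and ranks lowest.
    $str = "0IVXLCDM";
    $len = 0;

    // Pad the shorter numeral with zeros so both have the same length.
    if(strlen($a) >= strlen($b)) {
        $len = strlen($a);
        $b .= str_repeat("0", $len - strlen($b));
    }
    else {
        $len = strlen($b);
        $a .= str_repeat("0", $len - strlen($a));
    }

    for($i = 0; $i < $len - 1; $i++) {
        $a1 = $a[$i]; $b1 = $b[$i]; $a2 = $a[$i+1]; $b2 = $b[$i+1];

        // Subtractive patterns (IX vs V, XC vs L, CM vs D), checked in both directions.
        if( strpos($str, $a1.$b1.$a2) !== false ) return 1;
        if( strpos($str, $b1.$a1.$b2) !== false ) return -1;

        // Otherwise the first differing symbol decides.
        if($a1 != $b1) return strpos($str, $a1) > strpos($str, $b1) ? 1 : -1;
    }

    // Compare the last symbols too.
    if($a[$i] != $b[$i]) return strpos($str, $a[$i]) > strpos($str, $b[$i]) ? 1 : -1;
    return 0;
}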
Given two numbers (Roman strings) $a and $b: if the numbers contained no subtractions (IV, IX, XC etc.), the solution would be trivial:
for all $i in $a and $b
    if $a[$i] > $b[$i] then return 1; //($a is greater than $b)
    if $a[$i] < $b[$i] then return -1; //($a is lower than $b)
return 0 //equality
Because these subtractive parts can occur, the comparison is more complex. The solution is to look for the following patterns:
a: IX | XC | CM
b: V | L | D
These are the only patterns that can break the trivial solution. If you find one of them, then $a is greater than $b.
Note that Roman numerals, unlike Arabic ones, have no zero. Here we introduce one anyway and use it to fill the positions where a digit is missing.
So here is how the function works:
- if $a == $b, return 0 (equality)
- create a string that defines the order of the Roman numerals (strpos will give the right index)
- determine the length of the loop (take the longer string), and pad the shorter numeral with zeros at the end
- run the loop, and check:
  1. if one of the patterns above is found, return the comparison result accordingly (1 or -1)
  2. otherwise do the trivial check (compare each numeral)
- check the last numerals too.
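The role of the ordering string in the steps above can be seen in isolation. This is only an illustrative sketch; the variable name $order is an assumption and does not appear in the function.

```php
<?php
$order = "0IVXLCDM";

// strpos() doubles as a rank lookup: a larger index means a larger symbol,
// and the padding character "0" ranks below every real numeral.
var_dump(strpos($order, "0")); // int(0)
var_dump(strpos($order, "I")); // int(1)
var_dump(strpos($order, "M")); // int(7)

// Pattern check for IX against V: first symbol of $a, first symbol of $b,
// second symbol of $a. "IVX" is a substring of $order, so IX is the larger one.
var_dump(strpos($order, "I" . "V" . "X") !== false); // bool(true)

// XC against L gives "XLC", CM against D gives "CDM"; both are substrings too.
var_dump(strpos($order, "XLC") !== false); // bool(true)
var_dump(strpos($order, "CDM") !== false); // bool(true)

// IV against V gives "IVV", which is not a substring, so the plain
// symbol-by-symbol comparison decides: I ranks below V, so IV is smaller.
var_dump(strpos($order, "IVV")); // bool(false)
```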
#3
4
Some people have suggested converting Roman numerals to integers, sorting, and mapping back. There is an easier way. All that we really need to do is compare any two arbitrary Roman numerals and let usort do the rest. Here is the code, and I will explain its design below.
$base = array( 'I' => 0, 'V' => 1, 'X' => 2, 'L' => 3,
               'C' => 4, 'D' => 5, 'M' => 6 );

function single($a) { global $base; return $base[$a]; }

// Returns -1, 0 or 1, as usort expects.
function compare($a, $b) {
    global $base;
    // Terminal states of the recursion: an exhausted numeral is the smaller one.
    if(strlen($a) == 0) { return strlen($b) == 0 ? 0 : -1; }
    if(strlen($b) == 0) { return 1; }
    // The numeral containing the larger symbol wins.
    $maxa = max(array_map('single', str_split($a)));
    $maxb = max(array_map('single', str_split($b)));
    if($maxa != $maxb) {
        return $maxa < $maxb ? -1 : 1;
    }
    // Same largest symbol: the first symbols decide.
    if($base[$a[0]] != $base[$b[0]]) {
        return $base[$a[0]] < $base[$b[0]] ? -1 : 1;
    }
    // Still tied: strip the first symbol and repeat.
    return compare(substr($a, 1), substr($b, 1));
}

$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII');
usort($a, 'compare');
print_r($a);
First we create a lookup array that assigns a "magnitude" to each single-digit Roman numeral. Notice that this isn't the decimal value, just a number assigned in such a way that bigger numerals get bigger values. Then we create a helper function, single, which array_map uses to retrieve the magnitudes.
OK, now to the meat of the algorithm. It is the compare function, which sometimes has to call itself recursively when it needs to break a tie. For this reason, we start with some tests to see whether it has reached a terminal state in the recursion. Disregard those for now and look at the first interesting test. It checks whether either numeral being compared has a digit in it that dwarfs every digit of the other. For instance, if one of them has X in it and the other only has I and V, then the one with X wins. This relies on the convention that certain Roman numerals are not valid, like VV or VIIIII or IIIIIIIII. At least I have never seen them written that way, so I count them as invalid.
To make this check, we map the digits to magnitudes and compare the maximums. This test may not decide the issue. In that case it is safe to compare the first digits of each number, since we no longer have to deal with confusing cases like V < IX, where the first digits don't suggest the truth. Those confusing cases have already been taken care of by comparing the largest digits.
Finally, if the first digits are equal, strip them off and repeat. At some point one of the numerals will be reduced to an empty string, and those initial tests we were temporarily disregarding will take care of that.
This method has passed all the tests I threw at it, but let me know if you find a bug or an optimization.
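One way to test a direct comparator like this more systematically is to check it against plain integer order. The sketch below is an assumption-laden illustration: int_to_roman(), compare_by_symbols() and find_comparator_failures() are hypothetical names, the comparator is an iterative adaptation of the idea described above rather than the answer's exact code, and only a sample of the pairs from 1 to 3999 is checked.

```php
<?php
// Hypothetical helper (not from the answer): standard numeral for 1..3999.
function int_to_roman(int $n): string
{
    $table = [
        1000 => 'M', 900 => 'CM', 500 => 'D', 400 => 'CD',
        100 => 'C', 90 => 'XC', 50 => 'L', 40 => 'XL',
        10 => 'X', 9 => 'IX', 5 => 'V', 4 => 'IV', 1 => 'I',
    ];
    $out = '';
    foreach ($table as $value => $symbol) {
        while ($n >= $value) {
            $out .= $symbol;
            $n -= $value;
        }
    }
    return $out;
}

// Iterative adaptation of the idea: largest symbol first, then first symbol.
function compare_by_symbols(string $a, string $b): int
{
    $rank = ['I' => 0, 'V' => 1, 'X' => 2, 'L' => 3, 'C' => 4, 'D' => 5, 'M' => 6];
    $toRank = function (string $symbol) use ($rank): int {
        return $rank[$symbol];
    };
    while ($a !== '' && $b !== '') {
        $maxA = max(array_map($toRank, str_split($a)));
        $maxB = max(array_map($toRank, str_split($b)));
        if ($maxA !== $maxB) {
            return $maxA <=> $maxB;
        }
        if ($a[0] !== $b[0]) {
            return $rank[$a[0]] <=> $rank[$b[0]];
        }
        $a = substr($a, 1);
        $b = substr($b, 1);
    }
    return strlen($a) <=> strlen($b); // the numeral that ran out first is smaller
}

// Returns every pair [x, y] where the comparator disagrees with x <=> y.
function find_comparator_failures(callable $compare, array $numbers): array
{
    $failures = [];
    foreach ($numbers as $x) {
        foreach ($numbers as $y) {
            $sign = $compare(int_to_roman($x), int_to_roman($y)) <=> 0;
            if ($sign !== ($x <=> $y)) {
                $failures[] = [$x, $y];
            }
        }
    }
    return $failures;
}

// All of 1..100 plus a coarse sample up to 3999, to keep the run short.
$sample = array_merge(range(1, 100), range(101, 3999, 37));
$failures = find_comparator_failures('compare_by_symbols', $sample);
echo count($failures), " disagreeing pairs\n";
```

Any other comparator with the usort signature can be passed to find_comparator_failures() in place of 'compare_by_symbols'. An empty result only covers the sampled pairs, so it is evidence rather than proof.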
#4
2
There would seem to be three approaches, namely:
- Convert the numbers, sort using a standard integer sort, and convert back. (Or keep the converted versions with the roman numerals and sort the structures, to avoid the double conversion.)
- Write a sort function that takes the strings, at that point calls a conversion function and does the appropriate comparison.
- Write a sort function that can compare Roman numerals directly, without necessarily involving a full conversion. Since Roman numerals have their higher components first (Ms, then D/Cs, then L/Xs, then I/Vs), such a function might be able to short-circuit early.
The first will obviously involve additional storage overhead. The second will involve additional conversion overhead (since the same number may be converted many times). The third might involve some unnecessary conversion overhead (again, the same number may be converted several times) but saves some work through short-circuiting. If storage overhead is not an issue, the first is likely to be the best.
#5
2
I got quite interested in @borrible's first approach, so I decided to give it a try:
function sortRomanArray($array) {
    $combined=array_combine($array, array_map('roman2int', $array));
    asort($combined);
    return array_keys($combined);
}
This basically converts all the Roman numerals in the array into integers using array_map() and a function called roman2int() (which can be any implementation). Then it creates an array where the keys are the Roman numerals and the values are the integers. This array is then sorted with asort(), which preserves key associations, and the keys are returned as an array. This array will contain the sorted Roman numerals.
I like this method because it runs the conversion function only as many times as there are elements in the array (6 with my example array), and there is no need to convert back.
The conversion would certainly run many more times if we put it in the comparison function (twice for every comparison).
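One thing to keep in mind with the array_combine() approach is that array keys are unique, so a numeral that occurs twice in the input appears only once in the result. If repeated numerals matter, a variant that sorts (value, numeral) pairs keeps them. The sketch below is only an illustration: sort_roman_array_keep_duplicates() is a hypothetical name, and the converter used in the demo is a stand-in lookup that only knows the three numerals of the example.

```php
<?php
// Sorts Roman numerals by value while keeping repeated entries.
// $toInt is any converter; which one to use is left open, as in the answer.
function sort_roman_array_keep_duplicates(array $numerals, callable $toInt): array
{
    $pairs = [];
    foreach ($numerals as $numeral) {
        $pairs[] = [$toInt($numeral), $numeral]; // convert each entry once
    }
    usort($pairs, function (array $x, array $y): int {
        return $x[0] <=> $y[0];
    });
    return array_column($pairs, 1);
}

// Stand-in converter for the demo: a lookup that only knows these three numerals.
$known = ['I' => 1, 'III' => 3, 'V' => 5];
$toInt = function (string $numeral) use ($known): int {
    return $known[$numeral];
};

$input = ['V', 'III', 'V', 'I'];

// array_combine() keeps one entry per distinct key, so the second 'V' is lost.
$combined = array_combine($input, array_map($toInt, $input));
asort($combined);
print_r(array_keys($combined)); // I, III, V

print_r(sort_roman_array_keep_duplicates($input, $toInt)); // I, III, V, V
```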
#6
1
I think you'll have to either:
- Wrap the strings into a RomanNumeral class, that has a sorting method OR
- Write a method to calculate the value of each element in the array, and sort on that
- See if someone has already written a RomanNumeral class/library that does this - something like this
Either way, you'll need custom sorting code that calculates the value somewhere, since a prefixing character in a Roman numeral can sometimes mean "subtract this value" as opposed to "add this value". This is fine because, as you've pointed out, what you're really doing is sorting by numeric value, so you'll have to tell the computer how to interpret the value.
#7
1
- Convert the numeral to a decimal using this
- Compare the decimals
function roman2dec($roman) {
    // see link above
}
function compare($a, $b) {
    $x = roman2dec($a);
    $y = roman2dec($b);
    if ($x == $y) return 0;
    return $x < $y ? -1 : 1;
}
#8
0
The simplest solution is probably to first convert each numeral into a regular integer (in a new array), and then sort both arrays based on the integer array. Not sure if PHP contains a function for that, though. Alternatively, you can define a comparison function that converts two Roman numerals to integers and compares them. Writing a function that directly compares two Roman numerals without converting them to integers first will likely be cumbersome.
#9
0
Let's say you make this "alphabet": I, IV, V, IX, X, XL, L, XC, C, CD, D, CM, M. Then you could sort the Roman numbers according to this 'alphabet'.
Maybe this will give someone new inspiration.
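As a rough illustration of the alphabet idea, a numeral can be split into "letters" of that alphabet and two numerals compared letter by letter, like words in a dictionary. This is a sketch under assumptions: roman_letters() and compare_by_alphabet() are hypothetical names, the input is assumed to be a standard numeral, and it is not the implementation given further down in this answer.

```php
<?php
// Splits a numeral into "letters" of the alphabet. Two-symbol letters come
// first in the pattern so that they win over their first symbol.
function roman_letters(string $roman): array
{
    preg_match_all('/CM|CD|XC|XL|IX|IV|[MDCLXVI]/', $roman, $matches);
    return $matches[0];
}

// Compares two numerals letter by letter, like a dictionary comparison.
function compare_by_alphabet(string $a, string $b): int
{
    // The alphabet in ascending order; array_flip() turns it into a rank lookup.
    static $rank = null;
    if ($rank === null) {
        $rank = array_flip(['I', 'IV', 'V', 'IX', 'X', 'XL', 'L', 'XC', 'C', 'CD', 'D', 'CM', 'M']);
    }
    $lettersA = roman_letters($a);
    $lettersB = roman_letters($b);
    $shared = min(count($lettersA), count($lettersB));
    for ($i = 0; $i < $shared; $i++) {
        if ($lettersA[$i] !== $lettersB[$i]) {
            return $rank[$lettersA[$i]] <=> $rank[$lettersB[$i]];
        }
    }
    return count($lettersA) <=> count($lettersB); // a prefix sorts first
}

print_r(roman_letters('MCMXCIV')); // M, CM, XC, IV

$romans = ['III', 'IX', 'I', 'CM', 'LXII', 'IV'];
usort($romans, 'compare_by_alphabet');
print_r($romans); // I, III, IV, IX, LXII, CM
```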
EDIT: got a working example. Not really fast: it sorts 1000 Roman numerals in 1.3 seconds.
EDIT 2: added a check to avoid the 'notices' and optimized the code a little. It now runs a little faster, about twice as fast as converting to integers and then sorting those (I used the PEAR Number_Roman package for that).
function sortromans($a, $b){
    $alphabet = array('M', 'CM', 'D', 'CD', 'C', 'XC', 'L', 'XL', 'X', 'IX', 'V', 'IV', 'I');
    $pos = 0;
    if ($a == $b) {
        return 0;
    }
    //compare the strings, position by position, as long as they are equal
    while(isset($a[$pos]) && isset($b[$pos]) && $a[$pos] === $b[$pos]){
        $pos++;
    }
    //if string is shorter than $pos, return value
    if(!isset($a[$pos])){
        return -1;
    } else if(!isset($b[$pos])){
        return 1;
    } else {
        //check the 'character' at position $pos, and pass the array index to a variable
        foreach($alphabet as $i=>$ch){
            if(isset($a_index) && isset($b_index)){
                break;
            }
            $length = strlen($ch);
            if(!isset($a_index) && substr($a, $pos, $length) === $ch){
                $a_index = $i;
            }
            if(!isset($b_index) && substr($b, $pos, $length) === $ch){
                $b_index = $i;
            }
        }
    }
    return ($a_index > $b_index) ? -1 : 1;
}
$romans = array('III', 'IX', 'I', 'CM', 'LXII','IV');
usort($romans, "sortromans");
echo "<pre>";
print_r($romans);
echo "</pre>";
#10
0
I think the best (see my comment) first solution is to use the standard usort PHP function with the help of a special Roman compare function.
The following roman_compare function is very intuitive and does not use any kind of conversion. To keep it simple, it uses tail recursion.
function roman_start( $a )
{
    static $romans = array(
        'I' => 1,    'V' => 5,
        'X' => 10,   'L' => 50,
        'C' => 100,  'D' => 500,
        'M' => 1000,
    );
    //a smaller numeral in front of a larger one forms a two-character block (e.g. IV)
    if (strlen($a) > 1 && $romans[$a[0]] < $romans[$a[1]])
    {
        return substr($a, 0, 2);
    }
    return $a[0];
}
function roman_compare( $a, $b )
{
    static $romans = array(
        'I' => 1,    'IV' => 4,   'V' => 5,    'IX' => 9,
        'X' => 10,   'XL' => 40,  'L' => 50,   'XC' => 90,
        'C' => 100,  'CD' => 400, 'D' => 500,  'CM' => 900,
        'M' => 1000,
    );
    if (strlen($a) == 0 || strlen($b) == 0) //a string ended: the one that ran out first is smaller
    {
        return strlen($a) - strlen($b);
    }
    $blockA = roman_start($a);
    $blockB = roman_start($b);
    if ($blockA != $blockB)
    {
        return $romans[$blockA] - $romans[$blockB];
    }
    $compared = strlen($blockA);
    return roman_compare(substr($a, $compared), substr($b, $compared));
}
Using the above functions, we can write
function array_equal( $a, $b )
{
    return count(array_diff_assoc($a, $b)) == 0 && count(array_diff_assoc($b, $a)) == 0;
}
$a = array('XIX', 'LII', 'V', 'MCCXCIV', 'III', 'XIII');
$sorted_a = array('III', 'V', 'XIII', 'XIX', 'LII', 'MCCXCIV');
var_dump(array_equal($sorted_a, $a));
usort($a, 'roman_compare');
var_dump(array_equal($sorted_a, $a));
Running all the above code we get
bool(false)
bool(true)