Is there a way to detect circular arrays in pure PHP?

Time: 2021-07-11 16:56:41

I'm trying to implement my own serialization / var_dump style function in PHP. That seems impossible if circular arrays can occur (and they can).

In recent PHP versions, var_dump seems to detect circular arrays:

php > $a = array();
php > $a[] = &$a;
php > var_dump($a);
array(1) {
  [0]=>
  &array(1) {
    [0]=>
    *RECURSION*
  }
}

How would I implement my own serialization-style function in PHP that detects recursion in a similar way? I can't just keep track of which arrays I've visited: strict comparison of arrays in PHP returns true for different arrays that contain the same elements, and comparing circular arrays causes a fatal error anyway.

php > $b = array(1,2);
php > $c = array(1,2);
php > var_dump($b === $c);
bool(true)
php > $a = array();
php > $a[] = &$a;
php > var_dump($a === $a);
PHP Fatal error:  Nesting level too deep - recursive dependency? in php shell code on line 1

I've looked for a way to get a unique id (pointer) for an array, but I can't find one. spl_object_hash only works on objects, not arrays. If I cast multiple different arrays to objects, they all get the same spl_object_hash value (why?).
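
A likely explanation for that last observation, offered as a sketch rather than a definitive answer: the object created by the cast is destroyed as soon as spl_object_hash returns, and PHP may reuse the hash of a destroyed object for the next one. The variable names below are made up for illustration.

```php
<?php
// While both objects stay alive, their hashes are guaranteed to differ.
$first  = (object) array(1, 2);
$second = (object) array(3, 4);
$distinct = spl_object_hash($first) !== spl_object_hash($second);

// These temporaries are destroyed right after each call, so PHP is free
// to reuse the freed handle; the two hashes below may therefore coincide.
$h1 = spl_object_hash((object) array(1, 2));
$h2 = spl_object_hash((object) array(3, 4));

var_dump($distinct);
var_dump($h1 === $h2); // depends on handle reuse, not on the array contents
```

Either way, the cast creates a new object every time, so the hash says nothing about the identity of the array it was made from.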

EDIT:

Calling print_r, var_dump, or serialize on each array and then scanning the output for the recursion markers those functions emit is an algorithmic complexity nightmare; on large nested arrays it would be too slow to be practical.

ACCEPTED ANSWER:

I accepted the answer below that was the first to suggest temporarily altering an array to see whether it is in fact the same array as another one. That answers the question "how do I compare two arrays for identity?", from which recursion detection is trivial.
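
The identity check described above can be sketched roughly as follows. isSameArray is a hypothetical helper name, not a PHP built-in, and both arguments must be variables because they are taken by reference.

```php
<?php
// Hypothetical helper: reports whether two variables refer to the same
// underlying array by mutating one and observing the other.
function isSameArray(array &$a, array &$b): bool
{
    // A fresh object is identical (===) only to itself, so it cannot
    // collide with any value already stored in either array.
    $marker = new stdClass();

    $a[] = $marker;                 // temporarily append the marker to $a
    $same = end($b) === $marker;    // visible through $b only if $b aliases $a
    array_pop($a);                  // remove the marker again

    return $same;
}

$x = array(1, 2);
$y = array(1, 2);   // equal contents, but a different array
$z = &$x;           // a reference to the very same array

var_dump(isSameArray($x, $y));
var_dump(isSameArray($x, $z));
```

Note that end() and array_pop() move the arrays' internal pointers, which matters only if the caller relies on current() or next().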

4 Answers

#1


4  

The isRecursiveArray(array) function below detects circular/recursive arrays. It keeps track of which arrays have been visited by temporarily appending an element that holds a known object reference to the end of each array.

If you want help writing the serialization method, please update your question and include a sample serialization format.

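// Remove the marker again, if it is still the last element of $array.
function removeLastElementIfSame(array &$array, $reference) {
    if (end($array) === $reference) {
        unset($array[key($array)]);
    }
}

// Walk $array by reference, tagging every array on the way down with $reference.
function isRecursiveArrayIteration(array &$array, $reference) {
    $last_element = end($array);
    // Meeting the marker again means this array is already being visited.
    if ($reference === $last_element) {
        return true;
    }
    $array[] = $reference;

    foreach ($array as &$element) {
        if (is_array($element)) {
            if (isRecursiveArrayIteration($element, $reference)) {
                removeLastElementIfSame($array, $reference);
                return true;
            }
        }
    }
    unset($element); // break the reference left behind by foreach

    removeLastElementIfSame($array, $reference);

    return false;
}

function isRecursiveArray(array $array) {
    // A fresh object is identical (===) only to itself, so it is a safe marker.
    $some_reference = new stdClass();
    return isRecursiveArrayIteration($array, $some_reference);
}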



$array      = array('a','b','c');
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = $array;
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = &$array;
var_dump(isRecursiveArray($array));
print_r($array);



$array      = array('a','b','c');
$array[]    = &$array;
$array      = array($array);
var_dump(isRecursiveArray($array));
print_r($array);
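
The same marking trick can drive a var_dump-style traversal directly. The following is a minimal sketch, not the answerer's code: dumpArray and VISITING_KEY are hypothetical names, the marker is stored under a string key instead of being appended, and it assumes that no real element uses that key.

```php
<?php
// Hypothetical marker key; assumed never to occur as a real key.
const VISITING_KEY = "\0visiting\0";

// Returns an indented dump of $array and emits *RECURSION* wherever an
// array that is still being visited is reached again.
function dumpArray(array &$array, int $level = 0): string
{
    $indent = str_repeat('  ', $level);
    if (isset($array[VISITING_KEY])) {
        return $indent . "*RECURSION*\n";
    }
    $array[VISITING_KEY] = true;            // tag this array while it is visited

    $out = '';
    foreach ($array as $key => &$value) {   // by reference: nested arrays are not copied
        if ($key === VISITING_KEY) {
            continue;                       // skip our own marker
        }
        if (is_array($value)) {
            $out .= $indent . $key . " => array\n" . dumpArray($value, $level + 1);
        } elseif (is_scalar($value) || $value === null) {
            $out .= $indent . $key . ' => ' . var_export($value, true) . "\n";
        } else {
            $out .= $indent . $key . ' => (' . gettype($value) . ")\n";
        }
    }
    unset($value);                          // break the foreach reference
    unset($array[VISITING_KEY]);            // restore the array

    return $out;
}

$a = array('x');
$a[] = &$a;
echo dumpArray($a);
```

Using a string key leaves the array's next numeric index untouched, at the price of reserving one key name.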

#2


0  

A funny method (I know it is stupid :)), but you can modify it to track the "path" to the recursive element. This is just an idea :) It relies on a property of the serialized string: the serialized form of a recursive element, up to its trailing R: marker, matches the beginning of the string for the original array. As you can see, I tried it on several variations. Something may still be able to 'fool' it, but it 'detects' all the recursions listed. I did not try recursive arrays that contain objects.

$a = array('b1'=>'a1','b2'=>'a2','b4'=>'a3','b5'=>'R:1;}}}');
$a['a1'] = &$a;
$a['b6'] = &$a;
$a['b6'][] = array(1,2,&$a);
$b = serialize($a); 
print_r($a);
function WalkArrayRecursive(&$array_name, &$temp){
    if (is_array($array_name)){
        foreach ($array_name as $k => &$v){
           if (is_array($v)){
                if (strpos($temp, preg_replace('#R:\d+;\}+$#', '', 
                               serialize($v)))===0) 
                { 
                  echo "\n Recursion detected at " . $k ."\n"; 
                  continue; 
                }
                WalkArrayRecursive($v, $temp);
            }
        }
    }
}
WalkArrayRecursive($a, $b);

The regexp handles the case where the recursive element is at the 'end' of the array. And yes, this only detects recursion back to the whole array. Recursion into a sub-element is also possible, but it is too late at night for me to think that through. Somehow every element of the array would have to be checked for recursion within its own sub-elements, in the same way as above: through the output of print_r, or by looking for the recursion record in the serialized string (something like R:4;}). Tracing would then start from that element, comparing everything below it as my script does. All of that is needed only if you want to find where the recursion starts, not just whether there is any.

PS: I think the best option would be to write your own unserialize function that parses the serialized string produced by PHP itself.

#3


0  

My approach is to keep a temporary array that holds every object that has already been visited (PHP stores object handles here, not copies). Like this:

// We use this to detect recursion.
global $recursion;
$recursion = [];

function dump( $data, $label, $level = 0 ) {
    global $recursion;

    // Some nice output for debugging/testing...
    echo "\n";
    echo str_repeat( "  ", $level );
    echo $label . " (" . gettype( $data ) . ") ";

    // -- start of our recursion detection logic
    if ( is_object( $data ) ) {
        foreach ( $recursion as $done ) {
            if ( $done === $data ) {
                echo "*RECURSION*";
                return;
            }
        }

        // This is the key-line: Remember that we processed this item!
        $recursion[] = $data;
    }
    // -- end of recursion check

    if ( is_array( $data ) || is_object( $data ) ) {
        foreach ( (array) $data as $key => $item ) {
            dump( $item, $key, $level + 1 );
        }
    } else {
        echo "= " . $data;
    }
}

And here is some quick demo code to illustrate how it works:

$obj = new StdClass();
$obj->arr = [];
$obj->arr[] = 'Foo';
$obj->arr[] = $obj;
$obj->arr[] = 'Bar';
$obj->final = 12345;
$obj->a2 = $obj->arr;

dump( $obj, 'obj' );

This script will generate the following output:

obj (object) 
  arr (array) 
    0 (string) = Foo
    1 (object) *RECURSION*
    2 (string) = Bar
  final (integer) = 12345
  a2 (array) 
    0 (string) = Foo
    1 (object) *RECURSION*
    2 (string) = Bar
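
The linear scan over $recursion and the global variable can be avoided by keying the visited set on spl_object_id() (available since PHP 7.2). What follows is a minimal sketch under those assumptions. dumpToString is a hypothetical name, and it returns the text instead of echoing it. Like the answer above, it tracks objects only, so a self-referencing array would still recurse without end. Unlike the answer, it forgets an object once that object's children have been processed, so only true cycles are flagged.

```php
<?php
// Hypothetical variant of dump(): builds a string and tracks the objects on
// the current path in $seen, keyed by spl_object_id().
function dumpToString($data, string $label, int $level = 0, array &$seen = array()): string
{
    $out = "\n" . str_repeat('  ', $level) . $label . ' (' . gettype($data) . ') ';

    $id = is_object($data) ? spl_object_id($data) : null;
    if ($id !== null) {
        if (isset($seen[$id])) {
            return $out . '*RECURSION*';    // this object is one of our ancestors
        }
        $seen[$id] = true;                  // remember it while its children are visited
    }

    if (is_array($data) || is_object($data)) {
        foreach ((array) $data as $key => $item) {
            $out .= dumpToString($item, (string) $key, $level + 1, $seen);
        }
    } else {
        $out .= '= ' . var_export($data, true);
    }

    if ($id !== null) {
        unset($seen[$id]);                  // leaving the object: no longer an ancestor
    }
    return $out;
}

$obj = new stdClass();
$obj->arr = array('Foo', $obj, 'Bar');
$obj->final = 12345;
$obj->a2 = $obj->arr;

echo dumpToString($obj, 'obj'), "\n";
```

The lookup is constant time per object, and the ids stay valid because every object on the current path is kept alive by the traversal itself.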

#4


-1  

It's not elegant, but it solves your problem (at least as long as nobody uses *RECURSION* as a value).

<?php
$a[] = &$a;
if(strpos(print_r($a,1),'*RECURSION*') !== FALSE) echo 1;
