json_encode an object (using yield)

Time: 2021-10-01 23:32:23

I have a very large array in PHP (5.6), generated dynamically, which I want to convert to JSON. The problem is that the array is so large that it doesn't fit in memory: I get a fatal error (memory exhausted) when I try to process it. So I figured that using generators would make the memory problem disappear.

This is the code I've tried so far (this reduced example obviously doesn't produce the memory error):

<?php 
function arrayGenerator()// new way using generators
{
    for ($i = 0; $i < 100; $i++) {
        yield $i;
    }
}

function getArray()// old way, generating and returning the full array
{
    $array = [];
    for ($i = 0; $i < 100; $i++) {
        $array[] = $i;
    }
    return $array;
}

$object = [
    'id' => 'foo',
    'type' => 'blah',
    'data' => getArray(),
    'gen'  => arrayGenerator(),
];

echo json_encode($object);

But PHP doesn't seem to JSON-encode the values from the generator. This is the output I get from the previous script:

{
    "id": "foo",
    "type": "blah",
    "data": [// old way - OK
        0,
        1,
        2,
        3,
        //...
    ],
    "gen": {}// using generator - empty object!
}

Is it even possible to JSON-encode an array produced by a generator without generating the full sequence before I call json_encode?

2 Solutions

#1


4  

Unfortunately, json_encode cannot generate a result from a generator function. It does not iterate the generator: it treats the Generator object like any other object without public properties, which is why it comes out as {}. Using iterator_to_array will still build the whole array, which will still cause memory issues.
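Both behaviours are easy to check on a small input. The following is only a sketch: numbers() is a made-up three-value generator standing in for the real data source.

```php
<?php
// Hypothetical generator used only for this illustration.
function numbers()
{
    for ($i = 0; $i < 3; $i++) {
        yield $i;
    }
}

// json_encode sees an object without public properties, so it emits "{}"
// and never advances the generator.
$asObject = json_encode(numbers());

// iterator_to_array drains the generator into a real array first, so the
// whole sequence is held in memory before it is encoded.
$asArray = json_encode(iterator_to_array(numbers()));

echo $asObject, "\n"; // {}
echo $asArray, "\n";  // [0,1,2]
```

With three values the second call is harmless; with the original data set it would hit the same memory limit as the plain array.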

You will need to write your own function that generates the JSON string from the generator. Here's an example of how that could look:

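// A Generator object is Traversable, not callable, so type-hint accordingly.
function json_encode_generator(Traversable $generator) {
    $result = '[';

    // Encode one value at a time instead of the whole array at once.
    foreach ($generator as $value) {
        $result .= json_encode($value) . ',';
    }

    // Strip the trailing comma before closing the list.
    return rtrim($result, ',') . ']';
}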

Instead of encoding the whole array at once, it encodes only one value at a time and concatenates the results into one string.

The above example only takes care of encoding an array, but it can easily be extended to recursively encode whole objects.
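One way that recursive extension could look is sketched below. The names stream_json() and smallGenerator() are made up for this illustration, and the sketch makes two simplifying assumptions: every Traversable is written as a JSON list (its keys are ignored), and an array counts as a list only when its keys are exactly 0..n-1.

```php
<?php
// Hypothetical helper (not part of PHP): writes $value to the stream $out as
// JSON, descending into arrays and treating any Traversable as a JSON list.
function stream_json($value, $out)
{
    if ($value instanceof Traversable) {
        fwrite($out, '[');
        $first = true;
        foreach ($value as $item) {
            if (!$first) {
                fwrite($out, ',');
            }
            $first = false;
            stream_json($item, $out);
        }
        fwrite($out, ']');
    } elseif (is_array($value)) {
        // Sequential 0..n-1 keys become a JSON list, anything else an object.
        $isList = $value === [] || array_keys($value) === range(0, count($value) - 1);
        fwrite($out, $isList ? '[' : '{');
        $first = true;
        foreach ($value as $key => $item) {
            if (!$first) {
                fwrite($out, ',');
            }
            $first = false;
            if (!$isList) {
                fwrite($out, json_encode((string) $key) . ':');
            }
            stream_json($item, $out);
        }
        fwrite($out, $isList ? ']' : '}');
    } else {
        // Scalars, null and ordinary objects go through json_encode unchanged.
        fwrite($out, json_encode($value));
    }
}

// Hypothetical three-value generator standing in for the real data.
function smallGenerator()
{
    for ($i = 0; $i < 3; $i++) {
        yield $i;
    }
}

// php://memory is used here only so the result can be read back and shown.
$out = fopen('php://memory', 'w+');
stream_json(['id' => 'foo', 'data' => [1, 2], 'gen' => smallGenerator()], $out);
rewind($out);
$json = stream_get_contents($out);
fclose($out);

echo $json, "\n"; // {"id":"foo","data":[1,2],"gen":[0,1,2]}
```

For a genuinely large payload the stream would be php://output or a file, so that nothing but the current value is held in memory.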

If the resulting string is still too big to fit in memory, then your only remaining option is to write directly to an output stream. Here's how that could look:

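// $outputStream is any writable stream resource, e.g. fopen('php://output', 'w').
function json_encode_generator(Traversable $generator, $outputStream) {
    fwrite($outputStream, '[');

    // Track the first element explicitly; generator keys need not start at 0.
    $first = true;
    foreach ($generator as $value) {
        if (!$first) {
            fwrite($outputStream, ',');
        }
        $first = false;

        fwrite($outputStream, json_encode($value));
    }

    fwrite($outputStream, ']');
}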

As you can see, the only difference is that we now use fwrite to write to the passed-in stream instead of concatenating strings, and we handle the separator differently: a comma is written before every element except the first, so there is no trailing comma to strip.
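Applied to the object from the question, the streaming approach might be used roughly as follows. This is a sketch under assumptions: write_json_list() and write_object() are hypothetical names, the generator is cut down to three values, and the fixed keys of the object are written by hand around the streamed list.

```php
<?php
// Stand-in for the question's generator, shortened to three values.
function arrayGenerator()
{
    for ($i = 0; $i < 3; $i++) {
        yield $i;
    }
}

// Streaming list encoder in the spirit of the function above.
function write_json_list(Traversable $items, $stream)
{
    fwrite($stream, '[');
    $first = true;
    foreach ($items as $item) {
        if (!$first) {
            fwrite($stream, ',');
        }
        $first = false;
        fwrite($stream, json_encode($item));
    }
    fwrite($stream, ']');
}

// Hypothetical wrapper: writes the small fixed keys directly and streams "gen".
function write_object($stream)
{
    fwrite($stream, '{"id":' . json_encode('foo') . ',"type":' . json_encode('blah') . ',"gen":');
    write_json_list(arrayGenerator(), $stream);
    fwrite($stream, '}');
}

// php://output sends the JSON straight to the response body.
$stdout = fopen('php://output', 'w');
write_object($stdout);
fclose($stdout);
// prints {"id":"foo","type":"blah","gen":[0,1,2]}
```

Passing a file handle instead of php://output would write the same JSON to disk.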

#2


1  

What is a generator function?

A generator function is effectively a more compact and efficient way to write an Iterator. It allows you to define a function that will calculate and return values while you are looping over it.

Also, as per the documentation at http://php.net/manual/en/language.generators.overview.php:

Generators provide an easy way to implement simple iterators without the overhead or complexity of implementing a class that implements the Iterator interface.


A generator allows you to write code that uses foreach to iterate over a set of data without needing to build an array in memory, which may cause you to exceed a memory limit, or require a considerable amount of processing time to generate. Instead, you can write a generator function, which is the same as a normal function, except that instead of returning once, a generator can yield as many times as it needs to in order to provide the values to be iterated over.


What is yield?

The yield keyword returns data from a generator function:


The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that instead of stopping execution of the function and returning, yield instead provides a value to the code looping over the generator and pauses execution of the generator function.
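The pause-and-resume behaviour described here can be made visible by logging from both sides of the loop. In this sketch counter() and the ArrayObject log are illustrative choices, not anything from the PHP manual.

```php
<?php
// Shared log; an ArrayObject is passed by handle, so both sides can append.
$log = new ArrayObject();

// Hypothetical generator that records when it yields and when it resumes.
function counter(ArrayObject $log)
{
    for ($i = 1; $i <= 2; $i++) {
        $log[] = "generator: about to yield $i";
        yield $i;
        $log[] = "generator: resumed after $i";
    }
}

foreach (counter($log) as $value) {
    $log[] = "loop: received $value";
}

$lines = $log->getArrayCopy();
echo implode("\n", $lines), "\n";
// generator: about to yield 1
// loop: received 1
// generator: resumed after 1
// generator: about to yield 2
// loop: received 2
// generator: resumed after 2
```

The interleaving shows that the generator body only runs while the loop asks for the next value, which is why no full array ever has to exist.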


So in your case, to generate the expected output you need to iterate over the output of arrayGenerator() using a foreach loop or an iterator before converting it to JSON (as suggested by @apokryfos).
