Ruby equivalent of C#'s 'yield' keyword, or, creating sequences without preallocating memory

Date: 2021-05-02 23:30:55

In C#, you could do something like this:

public IEnumerable<int> GetItems()
{
    for (int i=0; i<10000000; i++) {
        yield return i;
    }
}

This returns an enumerable sequence of 10 million integers without ever allocating a collection in memory of that length.

Is there a way of doing an equivalent thing in Ruby? The specific example I am trying to deal with is the flattening of a rectangular array into a sequence of values to be enumerated. The return value does not have to be an Array or Set, but rather some kind of sequence that can only be iterated/enumerated in order, not by index. Consequently, the entire sequence need not be allocated in memory concurrently. In .NET, this is IEnumerable and IEnumerable<T>.

Any clarification on the terminology used here in the Ruby world would be helpful, as I am more familiar with .NET terminology.

EDIT

Perhaps my original question wasn't really clear enough -- I think the fact that yield has very different meanings in C# and Ruby is the cause of confusion here.

I don't want a solution that requires my method to use a block. I want a solution that has an actual return value. A return value allows convenient processing of the sequence (filtering, projection, concatenation, zipping, etc).

Here's a simple example of how I might use get_items:

things = obj.get_items.select { |i| !i.thing.nil? }.map { |i| i.thing }

In C#, any method returning IEnumerable that uses a yield return causes the compiler to generate a finite state machine behind the scenes that caters for this behaviour. I suspect something similar could be achieved using Ruby's continuations, but I haven't seen an example and am not quite clear myself on how this would be done.

It does indeed seem possible that I might use Enumerable to achieve this. A simple solution would be to use an Array (which includes module Enumerable), but I do not want to create an intermediate collection with N items in memory when it's possible to just provide them lazily and avoid any memory spike at all.

If this still doesn't make sense, then consider the above code example. get_items returns an enumeration, upon which select is called. What is passed to select is an instance that knows how to provide the next item in the sequence whenever it is needed. Importantly, the whole collection of items hasn't been calculated yet. Only when select needs an item will it ask for it, and the latent code in get_items will kick into action and provide it. This laziness carries along the chain, such that select only draws the next item from the sequence when map asks for it. As such, a long chain of operations can be performed on one data item at a time. In fact, code structured in this way can even process an infinite sequence of values without any kinds of memory errors.

So, this kind of laziness is easily coded in C#, and I don't know how to do it in Ruby.

I hope that's clearer (I'll try to avoid writing questions at 3AM in future.)

4 solutions

#1


14  

It's supported by Enumerator since Ruby 1.9 (and back-ported to 1.8.7). See Generator: Ruby.

Cliche example:

fib = Enumerator.new do |y|
  y.yield i = 0
  y.yield j = 1
  while true
    k = i + j
    y.yield k
    i = j
    j = k
  end
end

100.times { puts fib.next() }
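
Applied to the rectangular array from the question, the same Enumerator approach might look like the sketch below. The helper name each_cell and the sample grid are made up for illustration; only Enumerator.new and the yielder's << method come from Ruby itself.

```ruby
# Hypothetical helper: lazily walks a rectangular array (an array of rows).
def each_cell(grid)
  Enumerator.new do |y|
    grid.each do |row|
      row.each { |cell| y << cell }  # hand over one cell at a time
    end
  end
end

grid = [[1, 2, 3], [4, 5, 6]]
cells = each_cell(grid)

first_two = cells.take(2)  # stops after the first two cells
all_cells = cells.to_a     # forces the whole sequence into an Array
```

No flattened copy of the grid is built unless the caller asks for one, for example with to_a.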

#2


5  

Your specific example is equivalent to 10000000.times, but let's assume for a moment that the times method didn't exist and you wanted to implement it yourself. It would look like this:

class Integer
  def my_times
    return enum_for(:my_times) unless block_given?
    i=0
    while i<self
      yield i
      i += 1
    end
  end
end

10000.my_times # Returns an Enumerator which will let
               # you iterate over the numbers from 0 to 10000 (exclusive)

Edit: To clarify my answer a bit:

In the above example my_times can be (and is) used without a block, and it will return an Enumerator (an Enumerable object), which will let you iterate over the numbers from 0 up to, but not including, n. So it is exactly equivalent to your example in C#.

This works using the enum_for method. The enum_for method takes as its argument the name of a method, which will yield some items. It then returns an instance of class Enumerator (which includes the module Enumerable), which when iterated over will execute the given method and give you the items which were yielded by the method. Note that if you only iterate over the first x items of the enumerable, the method will only execute until x items have been yielded (i.e. only as much as necessary of the method will be executed) and if you iterate over the enumerable twice, the method will be executed twice.
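
To make the "only as much as necessary" and "executed twice" behaviour concrete, here is a small sketch. The method name numbers and its log argument are invented for the demonstration; the log simply records how far the method body has run.

```ruby
# Hypothetical generator method that records each value it produces.
def numbers(log)
  return enum_for(:numbers, log) unless block_given?
  3.times do |i|
    log << i   # note that the method body actually ran this far
    yield i
  end
end

log = []
enum = numbers(log)  # nothing has run yet, so log is still empty
enum.first(2)        # runs the body only until two items were yielded
partial = log.dup    # the log now holds just the first two values

enum.to_a            # a second iteration restarts the method from the top
```

After the second iteration the log contains the values from both runs, which shows that the enumerator does not cache anything between iterations.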

Since 1.8.7 it has become common to define methods that yield items so that, when called without a block, they return an Enumerator which lets the user iterate over those items lazily. This is done by adding the line return enum_for(:name_of_this_method) unless block_given? to the beginning of the method, as I did in my example.
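
A sketch of how this pattern could serve the get_items chain from the question follows. The Item struct and Inventory class are hypothetical stand-ins, since the question does not show its own classes. One caveat: on a plain Enumerator, select and map each return a full Array, so the sketch adds .lazy (Enumerator::Lazy, available from Ruby 2.0) to keep the whole chain deferred.

```ruby
# Hypothetical stand-ins for the question's objects.
Item = Struct.new(:thing)

class Inventory
  def initialize(items)
    @items = items
  end

  # Yields each item to a block, or returns an Enumerator when none is given.
  def get_items
    return enum_for(:get_items) unless block_given?
    @items.each { |item| yield item }
  end
end

obj = Inventory.new([Item.new("a"), Item.new(nil), Item.new("b")])

# .lazy keeps select and map from building intermediate arrays;
# first(1) pulls only as many items through the chain as it needs.
things = obj.get_items.lazy.select { |i| !i.thing.nil? }.map { |i| i.thing }
things.first(1)
```

Calling to_a (or force) on things would evaluate the rest of the chain when the full result is actually wanted.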

#3


1  

I don't have much Ruby experience, but what C# does with yield return is usually known as lazy evaluation or lazy execution: providing answers only as they are needed. It's not about allocating memory, it's about deferring computation until actually needed, expressed in a way similar to simple linear execution (rather than the underlying iterator-with-state-saving).

A quick google turned up a ruby library in beta. See if it's what you want.

#4


-2  

C# ripped the 'yield' keyword right out of Ruby; see Implementing Iterators here for more.

As for your actual problem, you have presumably an array of arrays and you want to create a one-way iteration over the complete length of the list? Perhaps worth looking at array.flatten as a starting point - if the performance is alright then you probably don't need to go too much further.
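
For comparison, here is a rough sketch of the eager flatten next to a lazy alternative. The helper name lazy_cells is invented for illustration, and it assumes Ruby 2.0 or later for Enumerator::Lazy. Note that flatten flattens every level of nesting, while flat_map here removes only one level, which is enough for a rectangular array.

```ruby
# Hypothetical helper: lazily flattens one level of nesting.
def lazy_cells(grid)
  grid.lazy.flat_map { |row| row }
end

grid = [[1, 2, 3], [4, 5, 6]]

eager = grid.flatten      # allocates a new array holding every cell
cells = lazy_cells(grid)  # an Enumerator::Lazy; nothing is copied yet

cells.first(4)            # walks only as far as the fourth cell
```

If grid.flatten performs well enough for the data at hand, the lazy version buys nothing; it matters mainly when the grid is large or the consumer stops early.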
