是否存在阻止采用D范围的C ++语言障碍？

This is a C++ / D cross-over question. The D programming language has ranges that -in contrast to C++ libraries such as Boost.Range- are not based on iterator pairs. The official C++ Ranges Study Group seems to have been bogged down in nailing a technical specification.

这是一个C ++ / D交叉问题。与C ++库(如Boost.Range-)相比,D编程语言的范围不是基于迭代器对。官方的C ++ Ranges Study Group似乎陷入了制定技术规范的困境。

Question: does the current C++11 or the upcoming C++14 Standard have any obstacles that prevent adopting D ranges -as well as a suitably rangefied version of <algorithm>- wholesale?

问题:当前的C ++ 11或即将推出的C ++ 14标准是否有任何阻碍采用D范围的障碍 - 以及 <算法> 的适当范围版本 - 批发?

I don't know D or its ranges well enough, but they seem lazy and composable as well as capable of providing a superset of the STL's algorithms. Given their claim to success for D, it would seem very nice to have as a library for C++. I wonder how essential D's unique features (e.g. string mixins, uniform function call syntax) were for implementing its ranges, and whether C++ could mimic that without too much effort (e.g. C++14 constexpr seems quite similar to D compile-time function evaluation)

我不太了解D或它的范围,但它们似乎是懒惰和可组合的,并且能够提供STL算法的超集。鉴于他们声称D的成功,作为C ++库可能会非常好。我想知道D的独特功能(例如字符串mixins,统一函数调用语法)是如何实现其范围的,以及C ++是否可以模仿而不需要太多努力(例如C ++ 14 constexpr似乎与D编译时函数评估非常相似) )

Note: I am seeking technical answers, not opinions whether D ranges are the right design to have as a C++ library.

注意:我正在寻求技术答案,而不是D系列作为C ++库的正确设计。

2 个解决方案

#1

I don't think there is any inherent technical limitation in C++ which would make it impossible to define a system of D-style ranges and corresponding algorithms in C++. The biggest language level problem would be that C++ range-based for-loops require that begin() and end() can be used on the ranges but assuming we would go to the length of defining a library using D-style ranges, extending range-based for-loops to deal with them seems a marginal change.

我不认为C ++中存在任何固有的技术限制,这使得无法在C ++中定义D风格范围和相应算法的系统。最大的语言级问题是C ++基于范围的for循环要求begin()和end()可以在范围上使用,但假设我们将使用D样式范围定义库的长度,扩展范围基于for循环处理它们似乎是一个微小的变化。

The main technical problem I have encountered when experimenting with algorithms on D-style ranges in C++ was that I couldn't make the algorithms as fast as my iterator (actually, cursor) based implementations. Of course, this could just be my algorithm implementations but I haven't seen anybody providing a reasonable set of D-style range based algorithms in C++ which I could profile against. Performance is important and the C++ standard library shall provide, at least, weakly efficient implementations of algorithms (a generic implementation of an algorithm is called weakly efficient if it is at least as fast when applied to a data structure as a custom implementation of the same algorithm using the same data structure using the same programming language). I wasn't able to create weakly efficient algorithms based on D-style ranges and my objective are actually strongly efficient algorithms (similar to weakly efficient but allowing any programming language and only assuming the same underlying hardware).

我在C ++中使用D风格范围的算法进行实验时遇到的主要技术问题是我无法使算法与基于迭代器(实际上是游标)的实现一样快。当然,这可能只是我的算法实现,但我没有看到任何人在C ++中提供一套合理的基于D风格的算法,我可以对其进行分析。性能很重要,C ++标准库至少应提供弱效的算法实现(如果算法的通用实现在应用于数据结构时至少与其自定义实现一样快,则称其效率很低算法使用相同的编程语言使用相同的数据结构)。我无法创建基于D风格范围的弱效算法,我的目标实际上是非常高效的算法(类似于弱效但允许任何编程语言并且只假设相同的底层硬件)。

When experimenting with D-style range based algorithms I found the algorithms a lot harder to implement than iterator-based algorithms and found it necessary to deal with kludges to work around some of their limitations. Of course, not everything in the current way algorithms are specified in C++ is perfect either. A rough outline of how I want to change the algorithms and the abstractions they work with is on may STL 2.0 page. This page doesn't really deal much with ranges, however, as this is a related but somewhat different topic. I would rather envision iterator (well, really cursor) based ranges than D-style ranges but the question wasn't about that.

在尝试使用基于D风格的算法时,我发现算法比基于迭代器的算法更难实现,并且发现有必要处理kludges来解决它们的一些局限性。当然,并非C ++中指定的当前方式算法中的所有内容都是完美的。关于我如何更改算法及其使用的抽象的概述在may STL 2.0页面上。然而,这个页面并没有真正涉及范围,因为这是一个相关但有些不同的主题。我宁愿设想基于范围的迭代器(好的,真正的游标)而不是D风格的范围,但问题不在于此。

One technical problem all range abstractions in C++ do face is having to deal with temporary objects in a reasonable way. For example, consider this expression:

C ++中所有范围抽象的一个技术问题是必须以合理的方式处理临时对象。例如,考虑以下表达式:

auto result = ranges::unique(ranges::sort(std::vector<int>{ read_integers() }));

In dependent of whether ranges::sort() or ranges::unique() are lazy or not, the representation of the temporary range needs to be dealt with. Merely providing a view of the source range isn't an option for either of these algorithms because the temporary object will go away at the end of the expression. One possibility could be to move the range if it comes in as r-value, requiring different result for both ranges::sort() and ranges::unique() to distinguish the cases of the actual argument being either a temporary object or an object kept alive independently. D doesn't have this particular problem because it is garbage collected and the source range would, thus, be kept alive in either case.

取决于range :: sort()或range :: unique()是否是惰性的,需要处理临时范围的表示。仅提供源范围的视图不是这些算法中的任何一个的选项,因为临时对象将在表达式的末尾消失。一种可能性是移动范围,如果它作为r值进入,需要两个范围:: sort()和range :: unique()的不同结果,以区分实际参数是临时对象或物体独立存活。 D没有这个特殊的问题,因为它是垃圾收集的,因此,在任何一种情况下,源范围都将保持活跃。

The above example also shows one of the problems with possibly lazy evaluated algorithm: since any type, including types which can't be spelled out otherwise, can be deduced by auto variables or templated functions, there is nothing forcing the lazy evaluation at the end of an expression. Thus, the results from the expression templates can be obtained and the algorithm isn't really executed. That is, if an l-value is passed to an algorithm, it needs to be made sure that the expression is actually evaluated to obtain the actual effect. For example, any sort() algorithm mutating the entire sequence clearly does the mutation in-place (if you want a version doesn't do it in-place just copy the container and apply the in-place version; if you only have a non-in-place version you can't avoid the extra sequence which may be an immediate problem, e.g., for gigantic sequences). Assuming it is lazy in some way the l-value access to the original sequence provides a peak into the current status which is almost certainly a bad thing. This may imply that lazy evaluation of mutating algorithms isn't such a great idea anyway.

上面的例子也显示了可能延迟评估算法的一个问题:因为任何类型,包括不能拼写的类型,都可以通过自动变量或模板化函数推导出来,没有什么可以在最后强制进行惰性求值。一个表达。因此,可以获得表达模板的结果,并且不会真正执行算法。也就是说,如果将l值传递给算法,则需要确保实际评估表达式以获得实际效果。例如,任何使整个序列变异的sort()算法都可以在原位进行突变(如果你想要一个版本不就地复制容器并应用就地版本;如果你只有一个非原位版本,你无法避免可能是一个直接问题的额外序列,例如,对于巨大的序列)。假设它在某种程度上是懒惰的,对原始序列的l值访问提供了当前状态的峰值,这几乎肯定是一件坏事。这可能意味着对变异算法的懒惰评估无论如何都不是一个好主意。

In any case, there are some aspects of C++ which make it impossible to immediately adopt the D-sytle ranges although the same considerations also apply to other range abstractions. I'd think these considerations are, thus, somewhat out of scope for the question, too. Also, the obvious "solution" to the first of the problems (add garbage collection) is unlikely to happen. I don't know if there is a solution to the second problem in D. There may emerge a solution to the second problem (tentatively dubbed operator auto) but I'm not aware of a concrete proposal or how such a feature would actually look like.

在任何情况下,C ++的某些方面使得不可能立即采用D-sytle范围,尽管相同的考虑因素也适用于其他范围的抽象。因此,我认为这些考虑因素在某种程度上也超出了问题的范围。此外,第一个问题(添加垃圾收集)的明显“解决方案”不太可能发生。我不知道D中的第二个问题是否有解决方案。可能会出现第二个问题的解决方案(暂时称为运营商自动)但我不知道具体的提案或这样的功能实际上看起来如何喜欢。

BTW, the Ranges Study Group isn't really bogged down by any technical details. So far, we merely tried to find out what problems we are actually trying to solve and to scope out, to some extend, the solution space. Also, groups generally don't get any work done, at all! The actual work is always done by individuals, often by very few individuals. Since a major part of the work is actually designing a set of abstractions I would expect that the foundations of any results of the Ranges Study Group is done by 1 to 3 individuals who have some vision of what is needed and how it should look like.

顺便说一下,Ranges Study Group并没有因任何技术细节而陷入困境。到目前为止,我们只是试图找出我们实际尝试解决的问题,并在一定程度上确定解决方案空间的范围。此外,小组通常根本没有完成任何工作!实际工作总是由个人完成,通常由极少数人完成。由于工作的主要部分实际上是设计一组抽象,我希望Ranges Study Group的任何结果的基础都是由1到3个人完成的,他们对需要的东西以及它应该是什么样子有一些看法。

#2

My C++11 knowledge is much more limited than I'd like it to be, so there may be newer features which improve things that I'm not aware of yet, but there are three areas that I can think of at the moment which are at least problematic: template constraints, static if, and type introspection.

我的C ++ 11知识比我想要的更有限,所以可能有更新的功能可以改进我还不知道的东西,但目前我可以想到三个方面这至少是有问题的:模板约束,静态if和类型内省。

In D, a range-based function will usually have a template constraint on it indicating which type of ranges it accepts (e.g. forward range vs random-access range). For instance, here's a simplified signature for std.algorithm.sort:

在D中,基于范围的函数通常在其上具有模板约束,指示它接受哪种类型的范围(例如,前向范围与随机访问范围)。例如,这是std.algorithm.sort的简化签名:

auto sort(alias less = "a < b", Range)(Range r)
    if(isRandomAccessRange!Range &&
       hasSlicing!Range &&
       hasLength!Range)
{...}

It checks that the type being passed in is a random-access range, that it can be sliced, and that it has a length property. Any type which does not satisfy those requirements will not compile with sort, and when the template constraint fails, it makes it clear to the programmer why their type won't work with sort (rather than just giving a nasty compiler error from in the middle of the templated function when it fails to compile with the given type).

它检查传入的类型是随机访问范围,可以切片,还是具有length属性。任何不满足这些要求的类型都不会使用sort进行编译,并且当模板约束失败时,它会让程序员明白为什么他们的类型不能用于排序(而不是只是在中间给出一个讨厌的编译器错误)模板化函数无法使用给定类型进行编译时的情况。

Now, while that may just seem like a usability improvement over just giving a compilation error when sort fails to compile because the type doesn't have the right operations, it actually has a large impact on function overloading as well as type introspection. For instance, here are two of std.algorithm.find's overloads:

现在,虽然由于类型没有正确的操作,但是当排序无法编译时,这似乎只是提供编译错误的可用性改进,它实际上对函数重载以及类型内省有很大的影响。例如,以下是std.algorithm.find的两个重载:

R find(alias pred = "a == b", R, E)(R haystack, E needle)
    if(isInputRange!R &&
       is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{...}


R1 find(alias pred = "a == b", R1, R2)(R1 haystack, R2 needle)
    if(isForwardRange!R1 && isForwardRange!R2 &&
       is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
       !isRandomAccessRange!R1)
{...}

The first one accepts a needle which is only a single element, whereas the second accepts a needle which is a forward range. The two are able to have different parameter types based purely on the template constraints and can have drastically different code internally. Without something like template constraints, you can't have templated functions which are overloaded on attributes of their arguments (as opposed to being overloaded on the specific types themselves), which makes it much harder (if not impossible) to have different implementations based on the genre of range being used (e.g. input range vs forward range) or other attributes of the types being used. Some work has been being done in this area in C++ with concepts and similar ideas, but AFAIK, C++ is still seriously lacking in the features necessary to overload templates (be they templated functions or templated types) based on the attributes of their argument types rather than specializing on specific argument types (as occurs with template specialization).

第一个接受仅仅是单个元件的针,而第二个接受针是前向范围的针。这两者可以完全基于模板约束而具有不同的参数类型,并且可以在内部具有完全不同的代码。如果没有类似模板约束的东西,你就不能拥有在参数属性上重载的模板化函数(而不是在特定类型本身上重载),这使得基于的不同实现变得更加困难(如果不是不可能的话)正在使用的范围类型(例如输入范围与前向范围)或所使用类型的其他属性。 C ++中有一些关于概念和类似想法的工作已经在这个领域做了很多工作,但是AFAIK,C ++仍然严重缺乏根据参数类型的属性重载模板(模板函数或模板类型)所需的特性。而不是专门针对特定的参数类型(与模板专门化一样)。

A related feature would be static if. It's the same as if, except that its condition is evaluated at compile time, and whether it's true or false will actually determine which branch is compiled in as opposed to which branch is run. It allows you to branch code based on conditions known at compile time. e.g.

如果相关的功能是静态的。除了在编译时评估其条件之外,它是相同的,并且它是真还是假将实际确定编译哪个分支而不是运行哪个分支。它允许您根据编译时已知的条件分支代码。例如

static if(isDynamicArray!T)
{}
else
{}

static if(isRandomAccessRange!Range)
{}
else static if(isBidirectionalRange!Range)
{}
else static if(isForwardRange!Range)
{}
else static if(isInputRange!Range)
{}
else
    static assert(0, Range.stringof ~ " is not a valid range!");

static if can to some extent obviate the need for template constraints, as you can essentially put the overloads for a templated function within a single function. e.g.

静态,如果可以在某种程度上消除模板约束的需要,因为你可以基本上将模板化函数的重载放在单个函数中。例如

R find(alias pred = "a == b", R, E)(R haystack, E needle)
{
    static if(isInputRange!R &&
       is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
    {...}
    else static if(isForwardRange!R1 && isForwardRange!R2 &&
       is(typeof(binaryFun!pred(haystack.front, needle.front)) : bool) &&
       !isRandomAccessRange!R1)
    {...}
}

but that still results in nastier errors when compilation fails and actually makes it so that you can't overload the template (at least with D's implementation), because overloading is determined before the template is instantiated. So, you can use static if to specialize pieces of a template implementation, but it doesn't quite get you enough of what template constraints get you to not need template constraints (or something similar).

但是当编译失败时仍然会导致更糟糕的错误并且实际上使得你不能重载模板(至少使用D的实现),因为在模板实例化之前确定了重载。因此,您可以使用static if来专门化模板实现的各个部分,但它并不能让您充分了解哪些模板约束使您不需要模板约束(或类似的东西)。

Rather, static if is excellent for doing stuff like specializing only a piece of your function's implementation or for making it so that a range type can properly inherit the attributes of the range type that it's wrapping. For instance, if you call std.algorithm.map on an array of integers, the resultant range can have slicing (because the source range does), whereas if you called map on a range which didn't have slicing (e.g. the ranges returned by std.algorithm.filter can't have slicing), then the resultant ranges won't have slicing. In order to do that, map uses static if to compile in opSlice only when the source range supports it. Currently, map 's code that does this looks like

相反,静态if非常适合做一些事情,比如只专门化一部分函数的实现,或者使它成为一个范围类型可以正确地继承它包装的范围类型的属性。例如,如果在整数数组上调用std.algorithm.map,则结果范围可以有切片(因为源范围有),而如果在没有切片的范围上调用map(例如返回的范围)通过std.algorithm.filter不能进行切片),那么结果范围将不会有切片。为了做到这一点,map仅在源范围支持时使用static if在opSlice中编译。目前,map的代码就是这样的

static if (hasSlicing!R)
{
    static if (is(typeof(_input[ulong.max .. ulong.max])))
        private alias opSlice_t = ulong;
    else
        private alias opSlice_t = uint;

    static if (hasLength!R)
    {
        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return typeof(this)(_input[low .. high]);
        }
    }
    else static if (is(typeof(_input[opSlice_t.max .. $])))
    {
        struct DollarToken{}
        enum opDollar = DollarToken.init;
        auto opSlice(opSlice_t low, DollarToken)
        {
            return typeof(this)(_input[low .. $]);
        }

        auto opSlice(opSlice_t low, opSlice_t high)
        {
            return this[low .. $].take(high - low);
        }
    }
}

This is code in the type definition of map's return type, and whether that code is compiled in or not depends entirely on the results of the static ifs, none of which could be replaced with template specializations based on specific types without having to write a new specialized template for map for every new type that you use with it (which obviously isn't tenable). In order to compile in code based on attributes of types rather than with specific types, you really need something like static if (which C++ does not currently have).

这是map的返回类型的类型定义中的代码,以及该代码是否编译完全取决于静态ifs的结果,其中任何一个都不能替换为基于特定类型的模板特化而不必编写新的与您一起使用的每种新类型的地图的专用模板(显然不是站得住脚的)。为了在代码中基于类型的属性而不是特定类型进行编译,你真的需要像static那样的东西(C ++当前没有)。

The third major item which C++ is lacking (and which I've more or less touched on throughout) is type introspection. The fact that you can do something like is(typeof(binaryFun!pred(haystack.front, needle)) : bool) or isForwardRange!Range is crucial. Without the ability to check whether a particular type has a particular set of attributes or that a particular piece of code compiles, you can't even write the conditions which template constraints and static if use. For instance, std.range.isInputRange looks something like this

C ++缺乏的第三个主要项目(我或多或少都涉及到它)是类型内省。事实上,你可以做类似的事情(typeof(binaryFun!pred(haystack.front,needle)):bool)或isForwardRange!范围是至关重要的。如果没有能力检查特定类型是否具有特定的属性集或者是否编译了特定的代码,您甚至无法编写模板约束条件和静态条件(如果使用)。例如,std.range.isInputRange看起来像这样

template isInputRange(R)
{
    enum bool isInputRange = is(typeof(
    {
        R r = void;       // can define a range object
        if (r.empty) {}   // can test for empty
        r.popFront();     // can invoke popFront()
        auto h = r.front; // can get the front of the range
    }));
}

It checks that a particular piece of code compiles for the given type. If it does, then that type can be used as an input range. If it doesn't, then it can't. AFAIK, it's impossible to do anything even vaguely like this in C++. But to sanely implement ranges, you really need to be able to do stuff like have isInputRange or test whether a particular type compiles with sort - is(typeof(sort(myRange))). Without that, you can't specialize implementations based on what types of operations a particular range supports, you can't properly forward the attributes of a range when wrapping it (and range functions wrap their arguments in new ranges all the time), and you can't even properly protect your function against being compiled with types which won't work with it. And, of course, the results of static if and template constraints also affect the type introspection (as they affect what will and won't compile), so the three features are very much interconnected.

它检查特定的代码片段是否针对给定类型进行编译。如果是,那么该类型可以用作输入范围。如果没有,那就不行。 AFAIK,在C ++中甚至不可能像这样模糊地做任何事情。但是要明智地实现范围,你真的需要能够做像isInputRange这样的东西或测试特定类型是否使用sort编译 - 是(typeof(sort(myRange)))。如果没有这个,你不能根据特定范围支持的操作类型来专门化实现,在包装它时,你无法正确转发范围的属性(并且范围函数一直将它们的参数包装在新的范围中),以及你甚至无法正确保护你的功能,防止使用不能使用它的类型进行编译。当然,静态if和模板约束的结果也会影响类型内省(因为它们会影响将要编译的内容),因此这三个特性是非常相互关联的。

Really, the main reasons that ranges don't work very well in C++ are the some reasons that metaprogramming in C++ is primitive in comparison to metaprogramming in D. AFAIK, there's no reason that these features (or similar ones) couldn't be added to C++ and fix the problem, but until C++ has metaprogramming capabilities similar to those of D, ranges in C++ are going to be seriously impaired.

实际上,范围在C ++中不能很好地工作的主要原因是,与D. AFAIK中的元编程相比,C ++中的元编程是原始的一些原因,没有理由不能添加这些特征(或类似的特征)到C ++并修复问题,但在C ++具有类似于D的元编程功能之前,C ++中的范围将严重受损。

Other features such as mixins and Uniform Function Call Syntax would also help, but they're nowhere near as fundamental. Mixins would help primarily with reducing code duplication, and UFCS helps primarily with making it so that generic code can just call all functions as if they were member functions so that if a type happens to define a particular function (e.g. find) then that would be used instead of the more general, free function version (and the code still works if no such member function is declared, because then the free function is used). UFCS is not fundamentally required, and you could even go the opposite direction and favor free functions for everything (like C++11 did with begin and end), though to do that well, it essentially requires that the free functions be able to test for the existence of the member function and then call the member function internally rather than using their own implementations. So, again you need type introspection along with static if and/or template constraints.

其他功能,如mixins和Uniform Function Call Syntax也会有所帮助,但它们远没有那么基本。 Mixins主要帮助减少代码重复,UFCS主要帮助实现它,使得通用代码可以像调用成员函数一样调用所有函数,这样如果一个类型恰好定义了一个特定的函数(例如find)那么那就是使用而不是更通用的*函数版本(如果没有声明这样的成员函数,代码仍然有效,因为那时使用*函数)。 UFCS并不是从根本上要求的,你甚至可以朝着相反的方向发展并支持所有功能的*功能(比如C ++ 11的开始和结束),尽管这样做很好,它本质上要求免费功能能够测试对于成员函数的存在,然后在内部调用成员函数而不是使用自己的实现。因此,您还需要键入内省以及静态if和/或模板约束。

As much as I love ranges, at this point, I've pretty much given up on attempting to do anything with them in C++, because the features to make them sane just aren't there. But if other folks can figure out how to do it, all the more power to them. Regardless of ranges though, I'd love to see C++ gain features such as template constraints, static if, and type introspection, because without them, metaprogramming is way less pleasant, to the point that while I do it all the time in D, I almost never do it in C++.

尽管我喜欢范围,但在这一点上,我几乎已经放弃尝试用C ++做任何事情,因为让它们理智的功能并不存在。但是,如果其他人能够弄清楚如何去做,那就更有力了。无论范围的,虽然,我很乐意看到C ++增益功能,如模板的限制,静若,然后键入反省,因为没有他们,元编程的方式不太愉快,给点意见,虽然我做这一切的时候在d,我几乎从不在C ++中这样做。

#1