It's clear that any n-tuple can be represented by a bunch of nested 2-tuples. So why are they not the same thing in Haskell? Would this break something?
Making these types equivalent would make writing functions on tuples much easier. For example, instead of defining zip, zip3, zip4, etc., you could define a single zip function that works for all tuples.
Of course, you can work with nested 2-tuples, but it is ugly and there is no canonical way to perform the nesting (i.e. should we nest to the left or right?).
2 answers
#1
35
The type (a,b,c,d) has a different performance profile from (a,(b,(c,(d,())))). In general, indexing into an n-tuple takes O(1), while indexing into an "hlist" of n nested tuples takes O(n).
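As a rough illustration of the difference, here is a minimal sketch comparing access to the fourth element in both encodings. The names fourthFlat and fourthNested are made up for this example.

```haskell
-- Flat 4-tuple: a single constructor, so one pattern match
-- reaches any field directly.
fourthFlat :: (a, b, c, d) -> d
fourthFlat (_, _, _, d) = d

-- Right-nested pairs terminated by (): reaching the fourth element
-- means stepping through one pair constructor per preceding element.
fourthNested :: (a, (b, (c, (d, ())))) -> d
fourthNested = fst . snd . snd . snd
```

The chain of snd calls grows with the position being accessed, which is where the O(n) cost comes from.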
That said, you should check out Oleg's classic work on HLists. Using HLists requires extensive, and somewhat sketchy, use of type-level programming. Many people find this unacceptable, and it was not available in early Haskell. Probably the best way to represent an HList today is with GADTs and DataKinds:
    {-# LANGUAGE DataKinds, GADTs, TypeOperators #-}

    data HList ls where
      Nil  :: HList '[]
      Cons :: x -> HList xs -> HList (x ': xs)
This gives canonical nesting, and lets you write functions that work for all instances of this type. You could implement your multi-way zipWith using the same techniques as used in printf. A more interesting puzzle is to generate the appropriate lenses for this type (hint: use type-level naturals and type families for indexing in).
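To make the hint about type-level naturals and type families a bit more concrete, here is a minimal sketch of a generic length function and a type-safe getter (only the read half of a lens). The HList declaration is repeated so the block stands alone; hLength, Nat, Index, SNat and hIndex are names invented for this example, not part of any library.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)

data HList (ls :: [Type]) where
  Nil  :: HList '[]
  Cons :: x -> HList xs -> HList (x ': xs)

-- Works for every HList, whatever its element types are.
hLength :: HList ls -> Int
hLength Nil         = 0
hLength (Cons _ xs) = 1 + hLength xs

-- Peano naturals; DataKinds promotes Z and S to the type level.
data Nat = Z | S Nat

-- Computes the type of the element at position n.
type family Index (n :: Nat) (ls :: [Type]) :: Type where
  Index 'Z     (x ': xs) = x
  Index ('S n) (x ': xs) = Index n xs

-- Runtime witness for a type-level natural.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- Getter: walks n Cons cells, so it is O(n) like any linked list.
-- No Nil case is given; Index has no equation for '[] either, so
-- this sketch is only meant for in-range positions.
hIndex :: SNat n -> HList ls -> Index n ls
hIndex SZ     (Cons x _)  = x
hIndex (SS n) (Cons _ xs) = hIndex n xs
```

For instance, hIndex (SS SZ) applied to Cons (1 :: Int) (Cons 'a' Nil) has type Char. A full lens would pair this getter with a setter whose result type is computed by a second type family.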
I have considered writing an HList-like library that uses arrays and unsafeCoerce under the hood to get tuple-like performance while sticking to a generic interface. I haven't done it, but it should not be overly difficult.
EDIT: the more I think about this, the more inclined I am to hack something together when I have the time. The repeated copying problem Andreas Rossberg mentions can probably be eliminated using stream fusion or similar techniques.
#2
23
The main problem with this in Haskell would be that a nested tuple allows additional values, due to laziness. For example, the type (a,(b,())) is inhabited by all (x,_|_) or (x,(y,_|_)), which is not the case for flat tuples. The existence of these values is not only semantically inconvenient, it would also make tuples much more difficult to optimise.
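The extra inhabitants can be observed directly. The following is a small sketch, with Nested, toFlat, partialTail, firstOnly and conversionDiverges all being names invented for the illustration; it uses undefined to stand in for _|_.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A pair encoded as right-nested 2-tuples terminated by ().
type Nested a b = (a, (b, ()))

-- Converting to a flat pair must pattern match on every layer.
toFlat :: Nested a b -> (a, b)
toFlat (x, (y, ())) = (x, y)

-- A nested value whose tail is bottom: the shape (x, _|_).
partialTail :: Nested Int Int
partialTail = (1, undefined)

-- The first component is available without touching the tail.
firstOnly :: Int
firstOnly = fst partialTail

-- Evaluating the flat pair to weak head normal form forces the
-- inner tuple and hits the undefined tail.
conversionDiverges :: IO Bool
conversionDiverges = do
  r <- try (evaluate (toFlat partialTail))
         :: IO (Either SomeException (Int, Int))
  pure (either (const True) (const False) r)
```

A flat pair has no value that is "defined up to the first field and then missing its own spine", so partialTail has no faithful flat counterpart.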
In a strict language, though, your suggestion is indeed a possibility. But it still introduces a performance pitfall: implementations would still want to flatten tuples. Consequently, in the cases where you actually construct or deconstruct them inductively, they would have to do a lot of repeated copying. When you use really large tuples, that might be a problem.