How much does it cost for Haskell FFI to go into C and back?

Date: 2022-10-05 17:04:21

If I want to call more than one C function, each one depending on the result of the previous one, is it better to create a wrapper C function that handles the three calls? Will it cost the same as using Haskell FFI without converting types?

Suppose I have the following Haskell code:

foo :: CInt -> IO CInt
foo x = do
  a <- cfA x
  b <- cfB a
  c <- cfC b
  return c

Each function cf* is a C call.

Is it better, in terms of performance, to create a single C function like cfABC and make only one foreign call in Haskell?

int cfABC(int x) {
   int a, b, c;
   a = cfA(x);
   b = cfB(a);
   c = cfC(b);
   return c;
}

Haskell code:

foo :: CInt -> IO CInt
foo x = do
  c <- cfABC x
  return c

How can I measure the performance cost of a C call from Haskell? Not the cost of the C function itself, but the cost of the "context switch" from Haskell to C and back.

2 answers

#1


18  

The answer depends mostly on whether the foreign call is a safe or an unsafe call.

An unsafe C call is basically just a function call. So if there is no (nontrivial) type conversion, three foreign calls cost three function calls, while a wrapper written in C costs between one and four, depending on how many of the component functions the C compiler can inline into the wrapper (GHC itself cannot inline a foreign call into C). Such a function call is generally very cheap (it's just a copy of the arguments and a jump to the code), so the difference is small either way. The wrapper should be slightly slower when no C function can be inlined into it, and slightly faster when all of them can. That was indeed the case in my benchmarking: +1.5ns and -3.5ns respectively, where the three foreign calls took about 12.7ns with every function just returning its argument. If the functions do something nontrivial, the difference is negligible (and if they don't, you would probably do better to write them in Haskell and let GHC inline the code).
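To make the safe/unsafe distinction concrete, here is a minimal sketch of how the two kinds of import are declared. It is not the code I benchmarked: `abs` from the C standard library stands in for `cfA` (so no extra C file is needed), and the names `c_absUnsafe` and `c_absSafe` are made up for illustration.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CInt)

-- Assumption: C's abs() stands in for cfA; both Haskell names are hypothetical.

-- `unsafe` import: compiled to (roughly) a plain C function call.
-- The C code must not call back into Haskell, and it should return
-- quickly, because the calling capability is held until it does.
foreign import ccall unsafe "stdlib.h abs" c_absUnsafe :: CInt -> IO CInt

-- `safe` import (also what you get when neither keyword is written):
-- the runtime does extra bookkeeping around the call so that other
-- Haskell threads can keep running and the C code may call back in.
foreign import ccall safe "stdlib.h abs" c_absSafe :: CInt -> IO CInt

main :: IO ()
main = do
  a <- c_absUnsafe (-16)
  b <- c_absSafe (-16)
  print (a, b)
```

Both imports call the same C function; only the calling discipline differs, which is exactly the cost being discussed here.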

A safe C call involves saving some nontrivial amount of state, locking, and possibly spawning a new OS thread, so it takes much longer. The small overhead of perhaps one extra function call in C is then negligible compared to the cost of the foreign calls [unless passing the arguments requires an unusual amount of copying, e.g. many huge structs]. In my do-nothing benchmark

{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Criterion.Main
import Foreign.C.Types
import Control.Monad

foreign import ccall safe "funcs.h cfA" c_cfA :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfB" c_cfB :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfC" c_cfC :: CInt -> IO CInt
foreign import ccall safe "funcs.h cfABC" c_cfABC :: CInt -> IO CInt

wrap :: (CInt -> IO CInt) -> Int -> IO Int
wrap foo arg = fmap fromIntegral $ foo (fromIntegral arg)

cfabc :: Int -> IO Int
cfabc = wrap c_cfABC

foo :: Int -> IO Int
foo = wrap (c_cfA >=> c_cfB >=> c_cfC)

main :: IO ()
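main = defaultMain
            -- Recent criterion versions do not accept a bare IO action
            -- as a benchmark, so each action is wrapped in whnfIO.
            [ bench "three calls" $ whnfIO (foo 16)
            , bench "single call" $ whnfIO (cfabc 16)
            ]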

where all the C functions just return the argument, the mean for the single wrapped call is a bit above 100ns [105-112], and for the three separate calls around 300ns [290-315].

So a safe C call takes roughly 100ns, and it is usually faster to wrap several of them up into a single call. Still, if the called functions do something sufficiently nontrivial, the difference won't matter.
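If you want a rough number for your own machine without setting up criterion, something like the following sketch may do. It is only an approximation under some assumptions: `abs` from the C standard library stands in for a do-nothing C function, the helper `nsPerCall` and the import names are hypothetical, and the figure it reports includes the loop overhead, so treat it as an upper bound rather than a precise measurement.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Monad (replicateM_)
import Foreign.C.Types (CInt)
import GHC.Clock (getMonotonicTimeNSec)

-- Assumption: C's abs() stands in for a trivial C function.
foreign import ccall unsafe "stdlib.h abs" absUnsafe :: CInt -> IO CInt
foreign import ccall safe   "stdlib.h abs" absSafe   :: CInt -> IO CInt

-- Hypothetical helper: average wall-clock nanoseconds per iteration
-- when running the action n times (loop overhead included).
nsPerCall :: Int -> IO a -> IO Double
nsPerCall n act = do
  start <- getMonotonicTimeNSec
  replicateM_ n act
  end <- getMonotonicTimeNSec
  pure (fromIntegral (end - start) / fromIntegral n)

main :: IO ()
main = do
  let n = 1000000
  u <- nsPerCall n (absUnsafe (-16))
  s <- nsPerCall n (absSafe (-16))
  putStrLn ("unsafe: " ++ show u ++ " ns/call")
  putStrLn ("safe:   " ++ show s ++ " ns/call")
```

Compile with optimisation (`ghc -O2`) and run it a few times; criterion remains the better tool when you need statistically sound numbers.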

#2


-3  

That probably depends very much on your exact Haskell compiler, the C compiler, and the glue binding them together. The only way to find out for sure is to measure it.

On a more philosophical note, each time you mix languages you create a barrier for newcomers: in this case it isn't enough to be fluent in Haskell and C (which already narrows the set of people), you also have to know the calling conventions and whatnot well enough to work with them. And many times there are subtle issues to handle (even calling C from C++, which are very similar languages, isn't at all trivial). Unless there are very compelling reasons, I'd stick with a single language. The only exception I can think of offhand is creating e.g. Haskell bindings to a preexisting complex library, something like NumPy for Python.
