How do I use explicit template instantiation for template member functions defined in the class definition?

Time: 2020-12-05 06:51:11

In an effort to reduce compilation times in a large project that makes liberal use of templates, I've had good results using "extern template" (explicit instantiation declarations) to prevent common template functions from being instantiated in many different compilation units.

However, one annoying thing about it is that it doesn't work for member functions defined within the class definition.

For example, I have the following template class:

template <typename T>
struct Foo
{
    static T doubleIt(T input)
    {
        return input * 2;
    }
};

Now, I know that Foo is most commonly used for numeric types, so I add this to the header:

extern template struct Foo<int>;
extern template struct Foo<float>;
extern template struct Foo<double>;

And in a cpp file, add explicit instantiations:

template struct Foo<int>;
template struct Foo<float>;
template struct Foo<double>;

This does not work, as dumpbin.exe on the obj file tells me:

017 00000000 SECT4  notype ()    External     | ?doubleIt@?$Foo@M@@SAMM@Z (public: static float __cdecl Foo<float>::doubleIt(float))

If I change the class so that the function is defined outside the class body, like so, it works correctly:

template <typename T>
struct Foo
{
    static T doubleIt(T input);
};

template <typename T>
T Foo<T>::doubleIt(T input)
{
    return input * 2;
}

Which we can verify using dumpbin:

017 00000000 UNDEF  notype ()    External     | ?doubleIt@?$Foo@M@@SAMM@Z (public: static float __cdecl Foo<float>::doubleIt(float))
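
For reference, here is the working layout collapsed into a single file as a minimal sketch. The split into Foo.h and Foo.cpp is only indicated by comments, and the file names are assumptions; the code itself is the class from the question. Putting the explicit instantiation declaration and definition in the same translation unit is allowed as long as the definition comes second.

```cpp
// ---- Foo.h (sketch; file name assumed) ----
template <typename T>
struct Foo
{
    static T doubleIt(T input);   // declared only; defined out of class
};

// Out-of-class definition: not implicitly inline, so the
// extern template declarations below apply to it.
template <typename T>
T Foo<T>::doubleIt(T input)
{
    return input * 2;
}

// Explicit instantiation declarations: translation units that
// include the header should not instantiate these themselves.
extern template struct Foo<int>;
extern template struct Foo<float>;
extern template struct Foo<double>;

// ---- Foo.cpp (sketch; file name assumed) ----
// Explicit instantiation definitions: the one place where the
// member functions are actually emitted.
template struct Foo<int>;
template struct Foo<float>;
template struct Foo<double>;
```

Any other source file that includes the header would then call Foo<float>::doubleIt(1.5f) and leave the symbol undefined in its own object file, to be resolved by the linker.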

The problem with that solution is that it is a lot of typing to move all the function definitions outside of the class definition, especially when you get more template parameters.

I've tried using __declspec(noinline), but it still doesn't extern the functions correctly (and preventing the inlining of the function where possible is undesirable anyway).

One thing that works is to enumerate each function individually, like so, but that of course is even more cumbersome:

extern template int Foo<int>::doubleIt(int);
extern template float Foo<float>::doubleIt(float);
extern template double Foo<double>::doubleIt(double);

What I would like is a way to keep the function definition inside of the class definition, while still allowing the function to be inlined where possible, but when it is not inlined, only creating it in the compilation unit where it is explicitly instantiated (in other words, exactly the same behavior as moving the function outside of the class definition).

2 answers

#1


1  

You can't have it both ways. To inline the method, the compiler needs its source code. Because the method is defined inline, the compiler doesn't bother compiling it into an object file unless it is used directly in that object file (and even then, if it is inlined at every call site, it won't be present in the object file as a separate function). The compiler will always have to build your function if it's defined in the header; somehow forcing the compiler to store a copy of that function in the object file won't improve performance.
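
To make the distinction concrete, here is a minimal sketch. The names Scaler, twice and thrice are made up for illustration. As far as I understand the rule in the standard ([temp.explicit]), an explicit instantiation declaration does not suppress implicit instantiation of inline functions, and a member function defined inside the class body is implicitly inline. That would explain why only the out-of-class member ends up as an undefined external in the other object files.

```cpp
template <typename T>
struct Scaler   // hypothetical class, not from the question
{
    // Defined in the class body, so implicitly inline: the
    // extern template declaration below does not stop other
    // translation units from instantiating it when they use it.
    static T twice(T x) { return x * 2; }

    // Declared here, defined out of class without `inline`.
    static T thrice(T x);
};

// Not inline: the extern template declaration suppresses
// implicit instantiation of this member in other files.
template <typename T>
T Scaler<T>::thrice(T x)
{
    return x * 3;
}

// Would live in the header.
extern template struct Scaler<int>;

// Would live in exactly one source file.
template struct Scaler<int>;
```

Both members remain callable everywhere; the difference is only in which object files may contain a compiled copy.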

#2


-1  

As has been pointed out, you cannot have both extern and inlining, but about the extra typing part, I did something like that and tried to minimize it using the preprocessor. I'm not sure if you'd find that useful, but just in case, I'll put an example with a template class that has a template function inside.

File Foo.h:

template<typename T1>
struct Foo
{
    void bar(T1 input)
    {
        // ...
    }

    template<typename T2>
    void baz(T1 input1, T2 input2);
};
#include <Foo.inl>

File Foo.cc:

#include <Foo.h>

template<typename T1>
template<typename T2>
void Foo<T1>::baz(T1 input1, T2 input2)
{
    // ...
}
#define __FOO_IMPL
#include <Foo.inl>
#undef __FOO_IMPL

File Foo.inl:

#ifdef __FOO_IMPL
#define __FOO_EXTERN
#else
#define __FOO_EXTERN extern
#endif

#define __FOO_BAZ_INST(T1, T2) \
    __FOO_EXTERN template void Foo<T1>::baz<T2>(T1, T2);

#define __FOO_INST(T1) \
    __FOO_EXTERN template struct Foo<T1>; \
    __FOO_BAZ_INST(T1, int) \
    __FOO_BAZ_INST(T1, float) \
    __FOO_BAZ_INST(T1, double) \

__FOO_INST(int)
__FOO_INST(float)
__FOO_INST(double)

#undef __FOO_INST
#undef __FOO_BAZ_INST
#undef __FOO_EXTERN

So it is still quite some writing, but at least you don't have to take care to keep different sets of template declarations in sync, and you don't have to explicitly go through every possible combination of types. In my case, I had a class template with two type parameters and a couple of member function templates with an extra type parameter, and each parameter could take one of 12 possible types. 36 lines is better than 12^3 = 1728, although I would have preferred the preprocessor to somehow iterate through the list of types for each parameter, but I couldn't work out how.
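
One way to get closer to iterating over a type list is the X-macro idiom, sketched below under some assumptions. Adder, sum and all DEMO_ macro names are hypothetical and not part of the code above. A macro cannot re-enter itself while it is being expanded, so the sketch keeps one list macro per nesting level; the lists still have to be kept identical by hand.

```cpp
template <typename T1>
struct Adder   // hypothetical class used only for this sketch
{
    template <typename T2>
    static double sum(T1 a, T2 b);
};

template <typename T1>
template <typename T2>
double Adder<T1>::sum(T1 a, T2 b)
{
    return static_cast<double>(a) + static_cast<double>(b);
}

// One list per nesting level, because a macro is not expanded
// again inside its own expansion.
#define DEMO_OUTER_TYPES(X) X(int) X(float) X(double)
#define DEMO_INNER_TYPES(X, T1) X(T1, int) X(T1, float) X(T1, double)

// Instantiate one member template specialization.
#define DEMO_INST_SUM(T1, T2) \
    template double Adder<T1>::sum<T2>(T1, T2);

// Instantiate the class, then sum<T2> for every inner type.
#define DEMO_INST_ADDER(T1) \
    template struct Adder<T1>; \
    DEMO_INNER_TYPES(DEMO_INST_SUM, T1)

// Expands to 3 class instantiations and 9 member instantiations.
DEMO_OUTER_TYPES(DEMO_INST_ADDER)

#undef DEMO_INST_ADDER
#undef DEMO_INST_SUM
#undef DEMO_INNER_TYPES
#undef DEMO_OUTER_TYPES
```

Adding a type then means adding one entry to each list rather than one line per combination. The extern/non-extern switch from Foo.inl could be layered on top by prefixing the instantiation lines with a macro in the same way.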

As a side note, in my case I was compiling a DLL where I needed all the templates to be compiled, so actually the template instantiations/declarations looked more like __FOO_EXTERN template __FOO_API ....
