Can size_type be larger than std::size_t?

Standard containers with a std::allocator have their size_type defined as std::size_t. However, is it possible to have an allocator that allocates objects whose size cannot be represented by a size_t? In other words, can a size_type ever be larger than size_t?

7 Answers

#1

Yes, and this could be useful in some cases.

Suppose you have a program that wishes to access more storage than will fit in virtual memory. By creating an allocator that references memory-mapped storage and maps it in as required when pointer objects are dereferenced, you can access arbitrarily large amounts of memory.

This remains conformant to 18.2:6 because size_t is defined as large enough to contain the size of any object, whereas 17.6.3.5:2 table 28 defines size_type as able to represent the size of the largest object in the allocation model, which need not be an actual object in the C++ memory model.
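
As a rough illustration of where an allocator's size_type comes from, here is a minimal sketch. The name WideSizeAllocator and the choice of std::uint64_t are assumptions made for the example; the memory still comes from operator new, so it does not reach beyond what size_t can address. It only shows that std::allocator_traits picks up whatever size_type the allocator declares.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <memory>
#include <new>
#include <type_traits>

// Hypothetical allocator that declares its own size_type.
template <class T>
struct WideSizeAllocator {
    using value_type = T;
    using size_type = std::uint64_t;        // assumption: stands in for a "wide" size type
    using difference_type = std::int64_t;

    WideSizeAllocator() = default;
    template <class U>
    WideSizeAllocator(const WideSizeAllocator<U>&) noexcept {}

    T* allocate(size_type n) {
        // Refuse requests whose byte count would not fit in std::size_t.
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) {
            throw std::bad_alloc();
        }
        return static_cast<T*>(::operator new(static_cast<std::size_t>(n) * sizeof(T)));
    }
    void deallocate(T* p, size_type) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const WideSizeAllocator<T>&, const WideSizeAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const WideSizeAllocator<T>&, const WideSizeAllocator<U>&) noexcept { return false; }

// allocator_traits reports the nested types declared by the allocator.
using WideTraits = std::allocator_traits<WideSizeAllocator<int>>;
static_assert(std::is_same<WideTraits::size_type, std::uint64_t>::value,
              "size_type is taken from the allocator");
static_assert(std::is_same<WideTraits::difference_type, std::int64_t>::value,
              "difference_type is taken from the allocator");

// Allocates four ints through the traits interface, fills them with 1..4 and sums them.
int allocate_and_sum() {
    WideSizeAllocator<int> alloc;
    const WideTraits::size_type n = 4;
    int* p = WideTraits::allocate(alloc, n);
    assert(p != nullptr);
    int sum = 0;
    for (WideTraits::size_type i = 0; i < n; ++i) {
        p[i] = static_cast<int>(i) + 1;
        sum += p[i];
    }
    WideTraits::deallocate(alloc, p, n);
    return sum;
}
```

Whether a given standard container then exposes this type as its own size_type depends on the library implementation, so the sketch only checks allocator_traits.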

Note that the requirements in 17.6.3.5:2 table 28 do not constitute a requirement that the allocation of multiple objects should result in an array; for allocate(n) the requirement is:

Memory is allocated for n objects of type T

and for deallocate the assertion is:

All n T objects in the area pointed to by p shall be destroyed prior to this call.

Note "area", not "array". Another point is 17.6.3.5:4:

The X::pointer, X::const_pointer, X::void_pointer, and X::const_void_pointer types shall satisfy the requirements of NullablePointer (17.6.3.3). No constructor, comparison operator, copy operation, move operation, or swap operation on these types shall exit via an exception. X::pointer and X::const_pointer shall also satisfy the requirements for a random access iterator (24.2).

There is no requirement here that (&*p) + n should be the same as p + n.

It's perfectly legitimate for a model expressible within another model to contain objects not representable in the outer model; for example, non-standard models in mathematical logic.

#2

size_t is the type of the unsigned integer you get by applying sizeof.

sizeof returns the size of the type (or of the type of the expression) that is its argument. For an array it returns the size of the whole array.

This implies that:

There cannot be any structure or union that is larger than what size_t can represent.

There cannot be any array that is larger than what size_t can represent.

In other words, if something fits in the largest block of contiguous memory that you can access, then its size must fit in size_t. In non-portable but intuitive terms, this means that on most systems size_t is as large as void* and can 'measure' the whole of your virtual address space.
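
The relationship between sizeof and size_t described above can be checked at compile time. This is a small sketch; the alias Samples is a made-up array type used only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// The result type of sizeof is std::size_t.
static_assert(std::is_same<decltype(sizeof(int)), std::size_t>::value,
              "sizeof yields std::size_t");

// Hypothetical array type used only for this example.
using Samples = double[1000];

// For an array type, sizeof reports the size of the whole array in bytes.
constexpr std::size_t array_bytes() { return sizeof(Samples); }

// Dividing by the element size recovers the number of elements.
constexpr std::size_t element_count() { return sizeof(Samples) / sizeof(double); }
```

Because every object's size must be expressible this way, no struct, union or array can be larger than the maximum value of size_t.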

Edit: this next sentence is probably wrong. See below.

Therefore the answer to "is it possible to have an allocator that allocates objects whose size cannot be represented by a size_t?" is no.

Edit (addendum):

I've been thinking about it and the above may in fact be wrong. I've checked the standard and it seems to be possible to design a completely custom allocator with completely custom pointer types, including using different types for pointer, const pointer, void pointer and const void pointer. Therefore an allocator can in fact have a size_type that is larger than size_t.

But to do so you need to actually define completely custom pointer types and the corresponding allocator and allocator traits instances.

The reason I say "may" is that I'm still a bit unclear whether size_type needs to span the size of a single object or also the size of multiple objects (that is, an array) in the allocator model. I will need to investigate this detail (but not now, it's dinner time here :) )

Edit2 (new addendum):

@larsmans I think you may want to decide what to accept anyway. The problem seems to be a little more complicated than one may intuitively realize. I'm editing the answer again as my thoughts are definitely more than a comment (both in content and in size).

ReEdit (as pointed out in the comments, the next two paragraphs are not correct):

First of all, size_type is just a name. You can of course define a container and add a size_type to it with whatever meaning you wish. Your size_type could be a float, a string, whatever.

That said, in standard library containers size_type is defined in the container only to make it easy to access. It is in fact supposed to be identical to the size_type of the allocator for that container (and the size_type of the allocator should be the size_type of the allocator_traits of that allocator).

Therefore we shall henceforth assume that the size_type of the container, even one you define, follows the same logic 'by convention'. @BenVoight begins his answer with "As @AnalogFile explains, no allocated memory can be larger than size_t. So a container which inherits its size_type from an allocator cannot have size_type larger than size_t." In fact we are now stipulating that if a container has a size_type then it comes from the allocator (he says "inherits", but that of course is not in the usual sense of class inheritance).

However he may or may not be 100% right that a size_type (even if it comes from an allocator) is necessarily constrained to size_t. The question really is: can an allocator (and the corresponding traits) define a size_type that is larger than size_t?

Both @BenVoight and @ecatmur suggest a use case where the backing store is a file. However, if the backing store is a file only for the content and you have something in memory that refers to that content (let's call that a 'handle'), then you are in fact building a container that contains handles. A handle will be an instance of some class that stores the actual data in a file and only keeps in memory whatever it needs to retrieve that data, but this is irrelevant to the container: the container will store the handles, those are in memory, and we are still in the 'normal' address space, so my initial response is still valid.

There is another case, however. You are not allocating handles; you are actually storing stuff in the file (or database), and your allocator (and the related traits) define pointer, const pointer, void pointer, const void pointer etc. types that directly manage that backing store. In this case, of course, they also need to define the size_type (replacing size_t) and difference_type (replacing ptrdiff_t) to match.

The direct difficulty in defining size_type (and difference_type) as larger than size_t comes from the fact that they need to be integer types. This only matters when size_t is already as large as the largest primitive integral type the implementation provides; if it is not, there is no difficulty.

Depending on how you interpret the standard, this may be impossible (because according to the standard, integer types are the types defined in the standard plus the extended integer types provided by the implementation) or possible (if you interpret it such that you can provide an extended integer type yourself), as long as you can write a class that behaves exactly like a primitive type. This was impossible in the old days (overloading rules made primitive types always distinguishable from user-defined types), but I'm not 100% up to date with C++11 and this may (or may not) have changed.

However there are also indirect difficulties. You not only need to provide a suitable integer type for size_type. You also need to provide the rest of the allocator interface.

I've been thinking about it a little and one problem I see is in implementing *p according to 17.6.3.5. In that *p syntax, p is a pointer as typed by the allocator traits. Of course we can write a class and define an operator* (the nullary member version, doing pointer dereference). And one may think that this can be easily done by 'paging in' the relevant part of the file (as @ecatmur suggests). However, there's a problem: *p must be a T& for that object. Therefore the object itself must fit in memory and, more importantly, since you may do T &ref = *p and hold that reference indefinitely, once you have paged in the data you will never be allowed to page it out again. This means that effectively there may be no way to properly implement such an allocator unless the whole backing store can also be loaded into memory.
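
The reference-lifetime problem can be made concrete with a deliberately tiny sketch. BackingStore, PagedPtr and the single in-memory slot are all hypothetical, invented for this illustration; a std::vector stands in for the file. Dereferencing a second pointer reuses the slot, so a T& obtained earlier silently starts referring to different data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical backing store: `data` plays the role of a file on disk,
// `slot` is the only piece of it that is ever "paged in".
struct BackingStore {
    std::vector<int> data;
    int slot = 0;

    // Copies the element at `offset` into the slot and returns a reference to the slot.
    int& page_in(std::uint64_t offset) {
        assert(offset < data.size());
        slot = data[static_cast<std::size_t>(offset)];
        return slot;
    }
};

// Hypothetical fancy pointer that stores a 64-bit offset instead of an address.
class PagedPtr {
public:
    PagedPtr(BackingStore& store, std::uint64_t offset)
        : store_(&store), offset_(offset) {}

    // operator* has to produce an int&, so it must page the element in first.
    int& operator*() const { return store_->page_in(offset_); }

private:
    BackingStore* store_;
    std::uint64_t offset_;
};

// Returns what `first` refers to after a second dereference has reused the slot.
int stale_reference_demo() {
    BackingStore store;
    store.data = {10, 20, 30};
    PagedPtr p0(store, 0);
    PagedPtr p1(store, 1);

    int& first = *p0;    // the slot now holds 10
    int& second = *p1;   // the slot is overwritten with 20
    (void)second;
    return first;        // reads 20: the earlier reference no longer sees element 0
}
```

A conforming implementation would have to keep every paged-in object resident for as long as any reference to it might exist, which is the difficulty the paragraph above describes.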

Those are my early observations and seem to actually confirm my first impression that the real answer is no: there is no practical way to do it.

However, as you see, things are much more complicated than mere intuition seems to suggest. It may take quite a time to find a definitive answer (and I may or may not go ahead and research the topic further).

For the moment I'll just say: it seems not to be possible. Statements to the contrary shall only be acceptable if they are not based solely on intuition: post code and let people debate if your code fully conforms to 17.6.3.5 and if your size_type (which shall be larger than size_t even if size_t is as large as the largest primitive integer type) can be considered an integer type.

#3

Yes and no.

As @AnalogFile explains, no allocated memory can be larger than size_t. So a container which inherits its size_type from an allocator cannot have size_type larger than size_t.

However, you can design a container type which represents a collection not entirely stored in addressable memory. For example, the members could be on disk or in a database. They could even be computed on demand, e.g. a Fibonacci sequence, and never stored anywhere at all. In such cases, size_type could easily be larger than size_t.
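
A container whose elements are computed rather than stored might look roughly like the sketch below. It uses the even numbers instead of a Fibonacci sequence to keep the arithmetic trivial. The class name EvenNumbers and the use of std::uint64_t are assumptions for the example; on a target where size_t is 32 bits wide, this size_type would be wider than size_t.

```cpp
#include <cstdint>

// Hypothetical read-only "container" of the even numbers 0, 2, 4, ...
// Nothing is stored, so the element count is not bounded by addressable memory.
class EvenNumbers {
public:
    using value_type = std::uint64_t;
    using size_type = std::uint64_t;   // assumption: chosen independently of std::size_t

    explicit constexpr EvenNumbers(size_type count) : count_(count) {}

    constexpr size_type size() const { return count_; }

    // Computes the i-th element on demand; no bounds check in this sketch.
    constexpr value_type operator[](size_type i) const { return 2 * i; }

private:
    size_type count_;
};
```

For instance, EvenNumbers(1ULL << 40) reports a size of about 10^12 elements without allocating anything.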

#4

I'm sure it's buried in the standard somewhere, but the best description I've seen for size_type is from the SGI-STL documentation. As I said, I'm sure it is in the standard, and if someone can point it out, by all means do.

According to SGI, a container's size_type is:

An unsigned integral type that can represent any nonnegative value of the container's distance type

It makes no claim that it must be anything besides that. In theory you could define a container that uses uint64_t, unsigned char, or anything else in between. That it references the container's distance_type is the part I find interesting, since...

distance_type: A signed integral type used to represent the distance between two of the container's iterators. This type must be the same as the iterator's distance type.

This doesn't really answer the question, but it is interesting to see how size_type and size_t differ (or can differ). Regarding your question, see (and upvote) @AnalogFile's answer, as I believe it to be correct.
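
The SGI wording quoted above can be checked against a standard container. This is only a sketch for std::vector<int>, where the standard's difference_type plays the role of SGI's distance type; the helper name size_type_covers_distances is made up for the example.

```cpp
#include <cstdint>
#include <iterator>
#include <limits>
#include <type_traits>
#include <vector>

using Vec = std::vector<int>;

// size_type is unsigned, difference_type is signed.
static_assert(std::is_unsigned<Vec::size_type>::value, "size_type is unsigned");
static_assert(std::is_signed<Vec::difference_type>::value, "difference_type is signed");

// The container's difference_type matches that of its iterators.
static_assert(std::is_same<Vec::difference_type,
                           std::iterator_traits<Vec::iterator>::difference_type>::value,
              "container and iterator agree on difference_type");

// True when every nonnegative difference_type value fits in size_type.
constexpr bool size_type_covers_distances() {
    return static_cast<std::uintmax_t>(std::numeric_limits<Vec::size_type>::max()) >=
           static_cast<std::uintmax_t>(std::numeric_limits<Vec::difference_type>::max());
}
```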

#5

From §18.2/6

The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object.

So, if it were possible for you to allocate an object whose size cannot be represented by a size_t it would make the implementation non-conforming.

#6

To add to the "standard" answers, also note the stxxl project which is supposed to be able to handle terabytes of data using disk storage (perhaps by extension, network storage). See the header of vector for example, for the definition of size_type (line 731, and line 742) as uint64.

This is a concrete example of using containers with sizes larger than memory can hold, or even larger than the system's native integer can represent.

#7

Not necessarily.

I assume by size_type you mean the typedef inside most STL containers?

If so, then the fact that size_type was added to all the containers, instead of just using size_t, means that the STL is reserving the right to make size_type any type they like. (By default, in all implementations I'm aware of, size_type is a typedef of size_t.)
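
The parenthetical remark can be checked on a given toolchain with a small compile-time probe. The helper uses_size_t is a made-up name, and a true result reflects what that particular library does for its default-allocator containers rather than a guarantee for every container and allocator.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <type_traits>
#include <vector>

// With std::allocator, the allocator's size_type is std::size_t.
static_assert(std::is_same<std::allocator_traits<std::allocator<int>>::size_type,
                           std::size_t>::value,
              "std::allocator uses std::size_t");

// Hypothetical probe: does this container's size_type happen to be std::size_t?
template <class Container>
constexpr bool uses_size_t() {
    return std::is_same<typename Container::size_type, std::size_t>::value;
}
```

Calling uses_size_t<std::vector<int>>() or uses_size_t<std::string>() then reports what the library in use has chosen.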
