C++ shallow and deep copying - reflecting changes in a vector's num_items

Date: 2021-03-10 04:26:23

I'm currently taking a C++ course at university. I understand the general concept of shallow and deep copying using vectors; however, there's an example in my textbook that has me confused.

Please assume that it is a poorly implemented vector with no copy constructor defined so that it only performs a shallow copy of the data.

I understand what's happening in the first part:

In the statement

vector<int> v2(v1);

vector v1 is passed as a const reference argument to the vector copy constructor, so v1 can’t be changed, and the variable v2 is then initialized to a copy of the vector v1. Each data field will be copied, and any changes made later to v2 should not affect v1. When the value in v1.the_data is copied over, both v1.the_data and v2.the_data will point to the same array

Because v1.the_data and v2.the_data point to the same object, the statement

v1[2] = 10;

also changes v2[2]. For this reason, v2 is considered a shallow copy of v1.

However, I'm struggling to understand the next part. I'm not sure why v2.num_items won't also change in a shallow copy.

The statement

v1.push_back(20);

will insert 20 into v1[5] and will change v1.num_items to 6, but will not change v2.num_items.

My current thinking is that v1.the_data and v2.the_data point to the same place in memory, so they 'share' the same array; when 20 is added to the end of it, both vectors should gain an additional integer.

I would greatly appreciate assistance in understanding why the number of items won't change for v2 when v1 is modified.

4 answers

#1


2  

Assuming we are talking about the standard std::vector:

When you copy the vector in this statement:

vector<int> v2(v1);

v2 is built by copying each element of v1. v1 and v2 do not share any of their memory.

This part:

both v1.the_data and v2.the_data will point to the same array

Because v1.the_data and v2.the_data point to the same object,

is wrong.

You can convince yourself by comparing the addresses of the two vectors' underlying arrays, which the data() member function returns.
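
As a minimal sketch of that check (the function name copy_has_own_storage and the sample values are made up for illustration), the following compares the two data() pointers and then repeats the textbook's two modifications on a real std::vector:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns true when a copy-constructed std::vector
// owns its own array and is unaffected by later changes to the original.
bool copy_has_own_storage()
{
    std::vector<int> v1{1, 2, 3, 4, 5};
    std::vector<int> v2(v1);                 // copy constructor: deep copy

    // Different underlying arrays, so the addresses differ.
    bool distinct = v1.data() != v2.data();

    v1[2] = 10;                              // does not touch v2[2]
    v1.push_back(20);                        // only v1 grows

    return distinct
        && v2[2] == 3
        && v1.size() == 6
        && v2.size() == 5;
}
```

Calling copy_has_own_storage() from main should return true with any conforming standard library, since std::vector's copy constructor copies the elements into separate storage.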

EDIT:

Assume you are crazy enough not to use std::vector, and instead use an implementation that "shares" its backing array when copied. (I won't go into the issues with this design: who owns the array? Who calls delete[] on it?)

The issue raised by your teacher is that when v1 is modified (e.g. an element is added), v2 does not know about it, and has an unchanged size.

Any push_back (or the like) made through one vector would have to be observed by every other owner of the array, so that each of them properly reflects the size of the array.

Either:

1) you implement some kind of observer pattern so that each vector is made aware of any modification (which is more difficult than it sounds), or

2) you use tricks to store the length in the backend array itself.
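
One way to picture option 2 is sketched below. It is only an illustration, not a recommended design: SharedVector and Storage are hypothetical names, std::shared_ptr is swapped in to settle the ownership question, and a std::vector stands in for "the array plus its length" so that the element count lives in the shared storage rather than in each handle.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical SharedVector: every copy refers to the same Storage object,
// and the element count is kept inside that object, not in each handle.
template <typename T>
class SharedVector
{
    struct Storage {
        std::vector<T> items;   // stands in for the raw array and its length
    };
    std::shared_ptr<Storage> storage_ = std::make_shared<Storage>();
public:
    // The compiler-generated copy constructor copies only the shared_ptr.
    void push_back(T const& value) { storage_->items.push_back(value); }
    T& operator[](std::size_t index) { return storage_->items[index]; }
    std::size_t size() const { return storage_->items.size(); }
};

// Hypothetical check: a push_back through v1 is visible through v2 as well.
bool shared_size_stays_in_sync()
{
    SharedVector<int> v1;
    for (int i = 0; i < 5; ++i) v1.push_back(i);

    SharedVector<int> v2(v1);   // shares v1's storage
    v1.push_back(20);

    return v1.size() == 6 && v2.size() == 6 && v2[5] == 20;
}
```

With the size stored next to the data, both handles agree on it, but the class now has reference semantics: copying it no longer gives an independent container.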

You would run into similar issues invalidating every iterator when the "shared" array is modified through one of the vectors that refer to it. A nightmare! There are good reasons why the STL containers were all designed to manage their own memory, and hence always provide deep copy semantics.

#2


2  

Is your textbook talking about std::vector from the standard library? If so, it is wrong. vector<int> v2(v1); copy-constructs v2 from v1. This is a deep copy: the two containers don't share storage and are completely separate.

If, instead, this is a badly implemented vector class and the containers share storage, then changing an existing element in one will be reflected in the other. An operation like push_back, which changes one container's num_items but not the other's, would cause them to disagree on their size.

#3


2  

The statement seems to assume a particular implementation of vector (one which does not conform to std::vector). Suppose, for example, we have a very naïve implementation:

template <typename T>
class Vector
{
    T* myData;
    int mySize;
    int myCapacity;
public:
    void push_back( T const& newValue )
    {
        if ( mySize == myCapacity ) {
            //  Extend the capacity...
        }
        myData[mySize] = newValue;
        ++ mySize;
    }
    T& operator[]( int index )
    {
        return myData[index];
    }
};

If you don't define a copy constructor, the compiler-generated one copies each member, so when you copy the vector, all three variables end up the same: both vectors will have a pointer to the same data, the same size and the same capacity. But the members themselves are copies: when you use [], you modify the memory pointed to by myData, which is the same in both vectors; when you do the push_back on v1, you update only v1's own copy of the size, and v2's mySize is left unchanged.
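
The behaviour described above can be reproduced with a small sketch. It uses the textbook's member names (the_data, num_items) in a hypothetical NaiveVector class; the fixed capacity of 10, the size() accessor and the missing destructor are simplifications made for this example, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical NaiveVector: deliberately has no copy constructor, so the
// compiler-generated one copies the pointer and both counters member by member.
template <typename T>
class NaiveVector
{
    T* the_data;
    std::size_t num_items;
    std::size_t capacity;
public:
    explicit NaiveVector(std::size_t cap = 10)
        : the_data(new T[cap]()), num_items(0), capacity(cap) {}
    // No destructor on purpose: with shared storage, a delete[] here would
    // run twice on the same array. This sketch leaks instead.

    void push_back(T const& value)
    {
        assert(num_items < capacity);   // growth omitted to keep this short
        the_data[num_items] = value;
        ++num_items;                    // updates this object's counter only
    }
    T& operator[](std::size_t index) { return the_data[index]; }
    std::size_t size() const { return num_items; }
};

// Hypothetical check mirroring the textbook's statements.
bool shallow_copy_disagrees_on_size()
{
    NaiveVector<int> v1;
    for (int i = 0; i < 5; ++i) v1.push_back(i);   // v1 holds 0, 1, 2, 3, 4

    NaiveVector<int> v2(v1);    // memberwise copy: same pointer, own num_items

    v1[2] = 10;                 // written to the shared array
    v1.push_back(20);           // slot 5 of the shared array; v1.num_items is 6

    return v2[2] == 10          // v2 sees the element change
        && v1.size() == 6
        && v2.size() == 5       // but v2's counter is untouched
        && v2[5] == 20;         // the value is there, just past v2's size
}
```

The last two conditions are the point of the question: the 20 really is in the array that v2 points to, but v2 has its own num_items, which still says 5.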

Of course, this implementation is naïve in a lot of ways. A good implementation of something like std::vector requires a fair amount of thought, not just because it requires deep copy semantics, but also for reasons of exception safety (the constructor of T might throw), and to avoid imposing unnecessary requirements (in particular, a default constructor).

Also, if I were trying to use a poorly implemented vector as an example of shallow copy, I wouldn't call it vector, since that immediately conjures up the image of std::vector, which shouldn't be poorly implemented (and isn't in the library implementations I know).

#4


0  

The difficulty in understanding the statement in this question is whether we have to take vector to mean std::vector or a theoretical implementation. std::vector doesn't allow shallow copying, and the reason is given in the statement: with a shallow copy, the invariants can't be maintained.

Now take the theoretical implementation with "the_data" and "num_items" members. Here copying the vector should give a deep copy, but just copying "the_data" gives a shallow copy because only a pointer is copied. This causes several issues: modifying the actual data through one vector leaves the other in an inconsistent state, and memory can no longer be managed correctly.
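
For comparison, here is a sketch of what a deep copy could look like for such a class. DeepVector is a hypothetical name, and the fixed capacity, the size() accessor and the copy-and-swap assignment operator are choices made for this example rather than anything taken from the textbook.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical DeepVector: same members as the textbook class, but with a
// user-defined copy constructor, assignment operator and destructor.
template <typename T>
class DeepVector
{
    T* the_data;
    std::size_t num_items;
    std::size_t capacity;
public:
    explicit DeepVector(std::size_t cap = 10)
        : the_data(new T[cap]()), num_items(0), capacity(cap) {}

    // Deep copy: allocate a fresh array and copy the elements into it.
    DeepVector(DeepVector const& other)
        : the_data(new T[other.capacity]()),
          num_items(other.num_items),
          capacity(other.capacity)
    {
        std::copy(other.the_data, other.the_data + other.num_items, the_data);
    }

    // Copy-and-swap: the by-value parameter is already a deep copy.
    DeepVector& operator=(DeepVector other)
    {
        std::swap(the_data, other.the_data);
        std::swap(num_items, other.num_items);
        std::swap(capacity, other.capacity);
        return *this;
    }

    // Each object owns its array, so releasing it here is safe.
    ~DeepVector() { delete[] the_data; }

    void push_back(T const& value)
    {
        assert(num_items < capacity);   // growth omitted to keep this short
        the_data[num_items++] = value;
    }
    T& operator[](std::size_t index) { return the_data[index]; }
    std::size_t size() const { return num_items; }
};

// Hypothetical check: changes to v1 after the copy do not reach v2.
bool deep_copy_is_independent()
{
    DeepVector<int> v1;
    for (int i = 0; i < 5; ++i) v1.push_back(i);   // v1 holds 0, 1, 2, 3, 4

    DeepVector<int> v2(v1);     // v2 gets its own array
    v1[2] = 10;
    v1.push_back(20);

    return v2[2] == 2 && v2.size() == 5 && v1.size() == 6;
}
```

Because each object owns exactly one array, the destructor can release it without the double-delete problem that the shared design runs into.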
