How can I access the internal contiguous buffer of a std::vector, and can I use it with memcpy etc.?

Time: 2022-03-07 15:01:05

How can I access the contiguous memory buffer used within a std::vector so I can perform direct memory operations on it (e.g. memcpy)? Also, is it safe to perform operations like memcpy on that buffer?


I've read that the standard guarantees that a vector uses a contiguous memory buffer internally, but that it is not necessarily implemented as a dynamic array. I figure given that it is definitely contiguous, I should be able to use it as such - but I wasn't sure if the vector implementation stored book-keeping data as part of that buffer. If it did, then something like memcpying the vector buffer would destroy its internal state.


7 answers

#1


13  

In practice, virtually all standard library implementations store a vector's elements in a single dynamically allocated array. You can get a pointer to this array with &somevector[0], provided the vector is not empty. If the elements are POD ('plain-old-data') types, using memcpy should be safe; if they are C++ classes with non-trivial initialization or copy logic, you'd be safer using std::copy.
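
As a minimal sketch of that distinction (assuming C++11 or later; the helper names copy_ints and copy_strings are made up for illustration), memcpy is used for a vector of int, while std::copy is used for a vector of std::string, whose elements own heap memory and have to be copied through their assignment operator:

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: a bytewise copy is fine for int, a trivially copyable type.
std::vector<int> copy_ints(const std::vector<int>& src) {
    std::vector<int> dst(src.size());  // the destination elements must exist first
    if (!src.empty()) {                // &v[0] is only valid on a non-empty vector
        std::memcpy(&dst[0], &src[0], src.size() * sizeof(int));
    }
    return dst;
}

// Hypothetical helper: std::string manages its own memory, so copy element-wise.
std::vector<std::string> copy_strings(const std::vector<std::string>& src) {
    std::vector<std::string> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}
```

For a real program, plain vector copy construction (`std::vector<int> dst = src;`) does the same job; the helpers only exist to show the two copy mechanisms side by side.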


#2


9  

Simply do

&vec[0];

// or Goz's suggestion:
&vec.front();

// or
&*vec.begin();
// but I don't know why you'd want to do that

This returns the address of the first element in the vector (assuming vec has more than 0 elements), which is the address of the array it uses. vector storage is guaranteed by the standard to be contiguous, so this is a safe way to use a vector with functions that expect arrays.


Be aware that if you add or remove elements, or modify the vector in any other way that can reallocate its storage (such as calling reserve), this pointer may become invalid and point to a deallocated area of memory.
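
A small sketch of using a vector with a function that expects an array, under the constraints above. Both sum_array and sum_vector are hypothetical names invented for this example; the pointer is taken, used immediately, and never stored:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C-style function that expects a plain array and a length.
long sum_array(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// Passes the vector's storage to the array-based function.
long sum_vector(const std::vector<int>& vec) {
    if (vec.empty()) {
        return 0;  // &vec[0] is undefined for an empty vector
    }
    return sum_array(&vec[0], vec.size());
}
```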


#3


6  

You can simply do:


  &vect[0]

The memory is guaranteed to be contiguous, so it's safe to use it with C library functions such as memcpy. However, you shouldn't hold on to pointers into the contiguous data, because a vector resize may reallocate the storage and copy the elements to a different location. For example, the following would be bad:


  std::vector<char> charVect;
  // insert a bunch of stuff into charVect
  ...
  char* bufferPtr = &charVect[0];
  charVect.push_back('a'); // potential resize
  // Now bufferPtr may not be valid since the resize may have moved
  // the vector's contents
  bufferPtr[0] = 'f'; // undefined behavior if a reallocation happened (may **CRASH**)
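
Two ways to avoid the dangling pointer in the snippet above, sketched under the assumption of C++11 (for data()); the function names are hypothetical. Either take the pointer only after the growth has happened, or reserve enough capacity up front so that later push_back calls cannot reallocate:

```cpp
#include <cstddef>
#include <vector>

// Option 1: fetch the pointer after the operation that may reallocate.
char first_after_push(std::vector<char>& v, char extra) {
    v.push_back(extra);    // may reallocate
    char* buffer = &v[0];  // v is non-empty here, and the pointer is fresh
    buffer[0] = 'f';
    return v[0];
}

// Option 2: reserve first; push_back does not reallocate while
// size() stays within capacity().
bool pointer_stable_after_reserve(std::vector<char>& v, std::size_t extra) {
    v.reserve(v.size() + extra);
    const char* before = v.data();
    for (std::size_t i = 0; i < extra; ++i) {
        v.push_back('a');
    }
    return before == v.data();
}
```

Storing an index instead of a pointer is a third option, since indices stay meaningful across reallocation.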

#4


2  

&myvec[0]

But note that using memcpy is really only applicable if this is a vector of PODs or primitive types. Doing direct memory manipulation of anything else leads to undefined behaviour.
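
One way to enforce this restriction at compile time is a static_assert on std::is_trivially_copyable, the C++11 trait that most closely matches the "POD or primitive" condition as far as memcpy is concerned. This is only a sketch; raw_copy and Point are hypothetical names:

```cpp
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: refuses to compile for element types that
// memcpy cannot copy safely (for example std::string).
template <typename T>
void raw_copy(const std::vector<T>& src, std::vector<T>& dst) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable element type");
    dst.resize(src.size());  // make sure the destination elements exist
    if (!src.empty()) {
        std::memcpy(dst.data(), src.data(), src.size() * sizeof(T));
    }
}

// A plain struct of ints is trivially copyable.
struct Point {
    int x;
    int y;
};
```

Calling raw_copy with a std::vector<std::string> would fail to compile with the message above rather than misbehave at run time.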


#5


2  

The simplest way is to use &v[0], where v is your vector. An example:


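#include <unistd.h>  // write(), ssize_t
#include <vector>

ssize_t write_vector(int fd, const std::vector<char>& v) {
   if (v.empty()) return 0;  // &v[0] is undefined on an empty vector
   return write(fd, &v[0], v.size());  // may write fewer bytes than requested
}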

#6


1  

Yes - since the standard guarantees contiguous placement of the vector's internal data, you can access a pointer to the first element in the vector via:


std::vector<int> my_vector;
// initialize...

int* arr = &my_vector[0];

#7


0  

While you can safely read size() elements from the underlying storage, writing to it may not be a good idea, depending on the design.


    vlad:Code ⧴ cat vectortest.cpp
    #include <algorithm>
    #include <cstring>
    #include <vector>
    #include <iostream>
    int main()
    {
        using namespace std;
        vector<char> v(2);
        v.reserve(10);
        char c[6]={ 'H', 'e', 'l', 'l', 'o', '\0' };
        cout << "Original size is " << v.size();
        memcpy( v.data(), c, 6);
        cout << ", after memcpy it is " << v.size();
        copy(c, c+6, v.begin());
        cout << ", and after copy it is " << v.size() << endl;
        cout << "Arr: " << c << endl;
        cout << "Vec: ";
        for (auto i = v.begin(); i!=v.end(); ++i) cout << *i;
        cout << endl;
    }
    vlad:Code ⧴ make vectortest
    make: `vectortest' is up to date.
    vlad:Code ⧴ ./vectortest
    Original size is 2, after memcpy it is 2, and after copy it is 2
    Arr: Hello
    Vec: He
    vlad:Code ⧴

So if you write past size(), even within the reserved capacity, the vector's size is not updated and the new data is not accessible through its member functions.


You can account for that and ensure the size is enough (e.g. vector<char> v(10)), but do you really want to make software where you are fighting the standard library?
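
A minimal sketch of both approaches, with hypothetical function names: resize the vector before the memcpy so that size() covers the written bytes, or let the vector do the copy itself with assign, which also keeps the size correct:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Resize first so the vector owns n elements, then overwrite them.
void fill_with_memcpy(std::vector<char>& v, const char* src, std::size_t n) {
    v.resize(n);
    if (n != 0) {
        std::memcpy(v.data(), src, n);
    }
}

// Let the vector perform the copy and the size bookkeeping.
void fill_with_assign(std::vector<char>& v, const char* src, std::size_t n) {
    v.assign(src, src + n);
}
```

With either function, a vector filled from the six-byte array in the test program above reports size() == 6, and iterating over it reaches every byte that was copied.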

