Which is more efficient, set or vector

Time: 2022-08-20 12:51:53

I have a bit of an issue. I was recently told that for unordered input, say 1 million random values, using a set would be more efficient than using a vector and then sorting that vector with the standard sort function. But when I tried both and measured them with the time command in the terminal and with valgrind, the vector came out ahead on both running time and memory usage, even with the extra call to sort. The person who advised me to use the set is a lot more experienced in C++ than I am, but I always have to test things myself before taking people's advice. The test code follows.

For Set

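#include <set>

std::set<int> testSet;
// Values arrive in descending order (0, -1, -2, ...); the set keeps them ordered as they go in.
for (int i = 0; i <= 1000000; ++i)
    testSet.insert(-i);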

For Vector

#include <algorithm>
#include <vector>

std::vector<int> testVector;
for (int i = 0; i <= 1000000; ++i)
    testVector.push_back(-i);

// Sort once, after all the values are in.
std::sort(testVector.begin(), testVector.end());

I know these are not random values, but random input wouldn't make for a fair comparison: set does not allow duplicates and vector does, so the two containers would end up different sizes in this basic test. Can anyone clarify why the set should be used, aside from the no-duplicates point?
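One way to keep the comparison fair with genuinely random input is to give the vector the same no-duplicates behaviour by following the sort with `std::unique` and `erase`. The sketch below is only an illustration: the fixed seed, the value range, and the names `makeRandomInput`, `uniqueSortedViaVector` and `uniqueSortedViaSet` are assumptions, not part of the original test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

// Hypothetical helper: `count` pseudo-random ints in [0, maxValue],
// generated with a fixed seed so that runs are repeatable.
std::vector<int> makeRandomInput(std::size_t count, int maxValue) {
    assert(maxValue >= 0);
    std::mt19937 gen(12345);
    std::uniform_int_distribution<int> dist(0, maxValue);
    std::vector<int> values(count);
    for (int& v : values) v = dist(gen);
    return values;
}

// Vector route: sort once, then drop adjacent duplicates.
std::vector<int> uniqueSortedViaVector(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    values.erase(std::unique(values.begin(), values.end()), values.end());
    return values;
}

// Set route: duplicates are rejected on insertion and order is kept throughout.
std::set<int> uniqueSortedViaSet(const std::vector<int>& values) {
    return std::set<int>(values.begin(), values.end());
}
```

With both routes fed the same input, the two containers should end up holding the same elements in the same order, so any timing difference comes from the containers rather than from a size mismatch.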

I did not do any tests with the unordered set either. I'm not too sure of the differences between the two.
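For reference, the practical difference is that `std::set` keeps its elements ordered, while `std::unordered_set` is a hash table with no defined iteration order, so getting sorted output from it still needs a separate sort. A minimal sketch, where the helper name `sortedFromUnordered` is an assumption made for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical helper: remove duplicates through a hash set,
// then sort the surviving values explicitly.
std::vector<int> sortedFromUnordered(const std::vector<int>& input) {
    std::unordered_set<int> seen(input.begin(), input.end());
    // Iteration order of `seen` is unspecified, so copy out and sort.
    std::vector<int> result(seen.begin(), seen.end());
    std::sort(result.begin(), result.end());
    assert(std::is_sorted(result.begin(), result.end()));
    return result;
}
```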

1 Answer

#1


6  

This is too vague and ignores/misses out several crucial factors. If your friend said precisely this, then your friend (regardless of his or her experience) was wrong. More likely you are somewhat misinterpreting their words and reading into them a simplified version of matters.

When you want a sorted final product, the sorting is "amortized" when you insert into a set, in the loose sense that each insertion does a little bit of the sorting work (placing one element costs O(log n)). If you will be inserting periodically and many times, then that spreading-out of the workload may be what you want. The total, when added up, may still be more than for a vector (consider the per-element node allocation, the occasional rebalancing and so forth; your vector just needs to be moved to a larger block of memory once in a while), but you've spread it out so as not to noticeably slow down any individual part of your program.
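To make the spread-out case concrete, here is a small sketch in which values arrive one at a time and the smallest value so far is needed after every arrival. The scenario and the names `minimaViaSet` and `minimaViaResort` are assumptions chosen for illustration, not something taken from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Set version: each insert does a logarithmic amount of ordering work,
// and the smallest element is always available at begin().
std::vector<int> minimaViaSet(const std::vector<int>& arrivals) {
    std::set<int> seen;
    std::vector<int> minima;
    for (int value : arrivals) {
        seen.insert(value);
        assert(!seen.empty());
        minima.push_back(*seen.begin());
    }
    return minima;
}

// Vector version: appending is cheap, but an ordered view after every
// arrival means re-sorting each time, which concentrates the cost.
std::vector<int> minimaViaResort(const std::vector<int>& arrivals) {
    std::vector<int> seen;
    std::vector<int> minima;
    for (int value : arrivals) {
        seen.push_back(value);
        std::sort(seen.begin(), seen.end());
        minima.push_back(seen.front());
    }
    return minima;
}
```

Both functions return the same sequence of minima; the difference is where the work lands. If an ordered view is only needed once at the end, the repeated sort disappears and the vector's position improves accordingly.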

But if you're just dumping all the elements into a vector and sorting straight away, not only is there less work for the container and algorithm to do, but you probably don't mind the sort taking a noticeable amount of time in one go.
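A sketch of that dump-then-sort pattern, assuming the element count is known in advance so that `reserve` can avoid repeated reallocation. The function name `bulkLoadAndSort` is hypothetical, and the fill pattern simply mirrors the descending values from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fill with 0, -1, ..., -(count - 1), then sort once.
std::vector<int> bulkLoadAndSort(std::size_t count) {
    std::vector<int> values;
    values.reserve(count);  // one allocation up front instead of repeated growth
    for (std::size_t i = 0; i < count; ++i)
        values.push_back(-static_cast<int>(i));
    std::sort(values.begin(), values.end());  // all ordering work happens here
    assert(std::is_sorted(values.begin(), values.end()));
    return values;
}
```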

You haven't really stated your use case in any detail so I won't pretend to give specifics here, but the only possible answer to your question as posed is both "it depends" and "the question is fundamentally somewhat meaningless"; you cannot just take two data structures and sorting methodologies and ask "which is more efficient?" without a use case. You have, however, correctly measured the time and space requirements, and if you've done that against your real-world use case then, well, you have your answer, don't you?
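If you want to repeat that kind of measurement from inside the program rather than with the shell's time command, something along these lines may do. It is only a sketch: `elapsedMs`, `fillSet` and `fillAndSortVector` are made-up names, and a single run in one process is a rough indication, not a rigorous benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: run `work` once and return elapsed wall-clock milliseconds.
template <typename Work>
double elapsedMs(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock never goes backwards
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// The set approach from the question, parameterised on element count.
std::size_t fillSet(int count) {
    std::set<int> s;
    for (int i = 0; i < count; ++i) s.insert(-i);
    return s.size();
}

// The vector-plus-sort approach from the question.
std::size_t fillAndSortVector(int count) {
    std::vector<int> v;
    for (int i = 0; i < count; ++i) v.push_back(-i);
    std::sort(v.begin(), v.end());
    return v.size();
}
```

Calling `elapsedMs([] { fillSet(1000000); })` and `elapsedMs([] { fillAndSortVector(1000000); })` from an optimised build gives one timing per approach. The numbers will vary by machine, compiler and input pattern, which is exactly why they should come from your real workload.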
