哪种数据结构支持有效删除和随机访问?

时间:2021-04-09 13:29:05

I am looking for a data structure in which I can efficiently remove items and also supports random access.

我正在寻找一种数据结构,我可以有效地删除项目,并支持随机访问。

I also need efficient insertion but as the ordering of elements is not important, I thought I can preallocate memory for maximum number of elements it may have to store and then always put the new element at the end so that no reallocation or move of other elements is necessary.

我还需要有效的插入,但由于元素的排序并不重要,我认为我可以预先分配内存以获得它可能必须存储的最大数量的元素,然后始终将新元素放在最后,以便不重新分配或移动其他元素是必要的。

To the best of my knowledge, a linked list would be perfect for deleting but accessing its elements can take O(n) time. On the other side, a simple array (e.g vector in C++) has the random access property but deleting an element from such an structure has complexity of O(n).

据我所知,链表很适合删除,但访问其元素可能需要O(n)时间。另一方面,简单数组(例如C ++中的向量)具有随机访问属性,但是从这种结构中删除元素具有O(n)的复杂性。

Actually, the random access requirement is something stronger than what I really need. I only have to be able to pick an element of the structure uniformly at random. Clearly efficient access property implies efficiency of the operation I need but I am not sure if those two are equivalent.

实际上,随机访问要求比我真正需要的要强。我只需要随机均匀地选择结构的元素。显而易见的高效访问属性意味着我需要的操作效率,但我不确定这两者是否相同。

Thanks in advance!

提前致谢!

2 个解决方案

#1


5  

I believe the solution you hint at in your question is actually what you need, except for a small detail.

我相信你在问题中提示的解决方案实际上是你所需要的,除了一个小细节。

You suggested:

I thought I can preallocate memory for maximum number of elements it may have to store and then always put the new element at the end so that no reallocation or move of other elements is necessary.

我认为我可以预先分配内存以获得它可能必须存储的最大数量的元素,然后始终将新元素放在最后,这样就不需要重新分配或移动其他元素。

If indeed you can establish a reasonable maximum number of entries, then I would suggest you pre-allocate an array (e.g. using std::array if the maximum is known at compile time, or a std::vector otherwise) with that number of entries, establish an element count (to count the number of elements currently stored), and proceed as follows:

如果确实你可以建立一个合理的最大条目数,那么我建议你预先分配一个数组(例如,如果在编译时已知最大值,则使用std :: array,否则使用std :: vector)条目,建立元素计数(计算当前存储的元素数),并按如下方式进行:

  1. Initially you set the count to 0
  2. 最初将计数设置为0

  3. When you insert an element, you add it to the end and increment the count
  4. 插入元素时,将其添加到末尾并递增计数

  5. When you delete an element, you swap it with the last element and decrement the count
  6. 删除元素时,将其与最后一个元素交换并减少计数

  7. For random access (in the sense you described it, i.e. literally picking an element randomly) you determine a random number between 0 and count, and pick that element
  8. 对于随机访问(在你描述它的意义上,即字面上随机选取一个元素),你确定一个介于0和count之间的随机数,并选择该元素

The only detail I changed is element deletion, which I suggest you implement as switch positions with the last element.

我改变的唯一细节是元素删除,我建议你用最后一个元素实现切换位置。

Possible implementation:

#include <vector>
#include <utility>
#include <iostream>

template <typename Elem>
class randomaccesstable
{
public:
  randomaccesstable(std::size_t initial_size)
   : data_(initial_size) , count_(0)
  { }

  randomaccesstable &push_back(const Elem &elem)
  {
    if (count_ < data_.size())
      data_[count_++] = elem;
    else {
      data_.push_back(elem);
      ++count_;
    }
    return *this;
  }

  randomaccesstable &remove(const std::size_t index)
  {
    if (index < count_)
    {
      std::swap(data_[index],data_[count_-1]);
      --count_;
    }
    return *this;
  }

  const Elem &operator[](const std::size_t index) const
  { return data_[index]; }

  Elem &operator[](const std::size_t index)
  { return data_[index]; }

  std::size_t size() const
  { return count_; }

private:
  std::vector<Elem>  data_;
  std::size_t        count_;
};

int main()
{
  randomaccesstable<int> table(10);
  table.push_back(3);
  table.push_back(12);
  table.push_back(2);

  for (std::size_t i = 0 ; i < table.size() ; ++i)
    std::cout << table[i] << ' ';
  std::cout << '\n';

  table.remove(1);   // this removes the entry for 12, swapping it for 2

  for (std::size_t i = 0 ; i < table.size() ; ++i)
    std::cout << table[i] << ' ';
  std::cout << '\n';

  return 0;
}

#2


1  

I would suggest using hash table. There you can both delete and lookup element with constant complexity. In C++ you may use std::unordered_map(C++11) or boost::unordered_map(pre-C++11) and in java - HashMap.

我建议使用哈希表。在那里你可以删除和查找具有恒定复杂性的元素。在C ++中,您可以使用std :: unordered_map(C ++ 11)或boost :: unordered_map(pre-C ++ 11)和java - HashMap。

#1


5  

I believe the solution you hint at in your question is actually what you need, except for a small detail.

我相信你在问题中提示的解决方案实际上是你所需要的,除了一个小细节。

You suggested:

I thought I can preallocate memory for maximum number of elements it may have to store and then always put the new element at the end so that no reallocation or move of other elements is necessary.

我认为我可以预先分配内存以获得它可能必须存储的最大数量的元素,然后始终将新元素放在最后,这样就不需要重新分配或移动其他元素。

If indeed you can establish a reasonable maximum number of entries, then I would suggest you pre-allocate an array (e.g. using std::array if the maximum is known at compile time, or a std::vector otherwise) with that number of entries, establish an element count (to count the number of elements currently stored), and proceed as follows:

如果确实你可以建立一个合理的最大条目数,那么我建议你预先分配一个数组(例如,如果在编译时已知最大值,则使用std :: array,否则使用std :: vector)条目,建立元素计数(计算当前存储的元素数),并按如下方式进行:

  1. Initially you set the count to 0
  2. 最初将计数设置为0

  3. When you insert an element, you add it to the end and increment the count
  4. 插入元素时,将其添加到末尾并递增计数

  5. When you delete an element, you swap it with the last element and decrement the count
  6. 删除元素时,将其与最后一个元素交换并减少计数

  7. For random access (in the sense you described it, i.e. literally picking an element randomly) you determine a random number between 0 and count, and pick that element
  8. 对于随机访问(在你描述它的意义上,即字面上随机选取一个元素),你确定一个介于0和count之间的随机数,并选择该元素

The only detail I changed is element deletion, which I suggest you implement as switch positions with the last element.

我改变的唯一细节是元素删除,我建议你用最后一个元素实现切换位置。

Possible implementation:

#include <vector>
#include <utility>
#include <iostream>

template <typename Elem>
class randomaccesstable
{
public:
  randomaccesstable(std::size_t initial_size)
   : data_(initial_size) , count_(0)
  { }

  randomaccesstable &push_back(const Elem &elem)
  {
    if (count_ < data_.size())
      data_[count_++] = elem;
    else {
      data_.push_back(elem);
      ++count_;
    }
    return *this;
  }

  randomaccesstable &remove(const std::size_t index)
  {
    if (index < count_)
    {
      std::swap(data_[index],data_[count_-1]);
      --count_;
    }
    return *this;
  }

  const Elem &operator[](const std::size_t index) const
  { return data_[index]; }

  Elem &operator[](const std::size_t index)
  { return data_[index]; }

  std::size_t size() const
  { return count_; }

private:
  std::vector<Elem>  data_;
  std::size_t        count_;
};

int main()
{
  randomaccesstable<int> table(10);
  table.push_back(3);
  table.push_back(12);
  table.push_back(2);

  for (std::size_t i = 0 ; i < table.size() ; ++i)
    std::cout << table[i] << ' ';
  std::cout << '\n';

  table.remove(1);   // this removes the entry for 12, swapping it for 2

  for (std::size_t i = 0 ; i < table.size() ; ++i)
    std::cout << table[i] << ' ';
  std::cout << '\n';

  return 0;
}

#2


1  

I would suggest using hash table. There you can both delete and lookup element with constant complexity. In C++ you may use std::unordered_map(C++11) or boost::unordered_map(pre-C++11) and in java - HashMap.

我建议使用哈希表。在那里你可以删除和查找具有恒定复杂性的元素。在C ++中,您可以使用std :: unordered_map(C ++ 11)或boost :: unordered_map(pre-C ++ 11)和java - HashMap。