How to efficiently implement lookup and removal of array elements in Java?

I have a sorted array of objects, where the order is based on one object attribute. (Sorting is done on a List using Collections.sort() with a custom Comparator, followed by a call to toArray().)

Duplicate instances of SomeObject are not allowed (whether two instances count as "duplicates" depends on multiple attribute values in SomeObject), but multiple instances of SomeObject may have the same value for attribute1, which is the attribute used for sorting.

public class SomeObject {
  public int attribute1; // sort key (type assumed to be int)
  public int attribute2; // type assumed to be int
}

List<SomeObject> list = ...
// Sort ascending by attribute1 (uses java.util.Collections and java.util.Comparator).
Collections.sort(list, new Comparator<SomeObject>() {
  @Override
  public int compare(SomeObject v1, SomeObject v2) {
    if (v1.attribute1 > v2.attribute1) {
      return 1;
    } else if (v1.attribute1 < v2.attribute1) {
      return -1;
    } else {
      return 0; // equal sort keys; the objects themselves may still differ
    }
  }
});
SomeObject[] array = list.toArray(new SomeObject[0]);

How can I efficiently check whether a certain object, identified by some attribute, is in that array, while also being able to "mark" objects already found in a previous look-up (e.g. simply by removing them from the array; objects that were already found don't need to be accessed again later)?

Without the latter requirement, one could use Arrays.binarySearch() with a custom Comparator. But that alone obviously doesn't work when one wants to remove objects that were already found.
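
For illustration, here is a minimal sketch of one way to keep the binary search and still "mark" found objects: a parallel boolean[] records which slots were already returned. This is only a sketch under assumptions, not something from the question: the class name MarkedBinarySearch, the int field types, and the look-up by attribute1 alone are all hypothetical. Because Arrays.binarySearch() makes no promise about which of several equal keys it returns, the code walks to the start of the run of equal keys and takes the first unmarked slot, so a long run of equal attribute1 values degrades towards linear time.

```java
import java.util.Arrays;
import java.util.Comparator;

class MarkedBinarySearch {
    // Hypothetical stand-in for SomeObject; int fields are an assumption.
    static final class SomeObject {
        final int attribute1;
        final int attribute2;

        SomeObject(int attribute1, int attribute2) {
            this.attribute1 = attribute1;
            this.attribute2 = attribute2;
        }
    }

    static final Comparator<SomeObject> BY_ATTRIBUTE1 =
            Comparator.comparingInt(o -> o.attribute1);

    private final SomeObject[] sorted; // must already be sorted by attribute1
    private final boolean[] found;     // found[i] is true once sorted[i] was returned

    MarkedBinarySearch(SomeObject[] sortedByAttribute1) {
        this.sorted = sortedByAttribute1;
        this.found = new boolean[sortedByAttribute1.length];
    }

    // Returns the index of an unmarked element whose attribute1 equals key
    // and marks it, or -1 if no unmarked element with that key is left.
    int findAndMark(int key) {
        int hit = Arrays.binarySearch(sorted, new SomeObject(key, 0), BY_ATTRIBUTE1);
        if (hit < 0) {
            return -1; // key does not occur at all
        }
        // binarySearch may land anywhere inside a run of equal keys,
        // so move to the first element of that run.
        int i = hit;
        while (i > 0 && sorted[i - 1].attribute1 == key) {
            i--;
        }
        // Take the first slot in the run that has not been handed out yet.
        for (; i < sorted.length && sorted[i].attribute1 == key; i++) {
            if (!found[i]) {
                found[i] = true;
                return i;
            }
        }
        return -1; // every element with this key was already found
    }
}
```

Nothing is physically removed here, so the array stays sorted and no elements have to be shifted after a hit.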

3 solutions

#1

Use a TreeSet (or TreeMultiset).

You can initialize it with your comparator; it sorts itself; look-up and removal are in logarithmic time.

You can also check for existence and remove in one step, because remove returns a boolean.
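
As a rough sketch of this suggestion: the class name TreeSetLookup, the int field types, and the tie-breaking on attribute2 are assumptions made for illustration. One point to be careful about is that a TreeSet treats two elements as the same whenever its comparator returns 0, so a comparator that looks only at attribute1 would silently drop objects sharing that value. The sketch therefore compares attribute1 first and attribute2 second.

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetLookup {
    // Hypothetical stand-in for SomeObject; int fields are an assumption.
    static final class SomeObject {
        final int attribute1;
        final int attribute2;

        SomeObject(int attribute1, int attribute2) {
            this.attribute1 = attribute1;
            this.attribute2 = attribute2;
        }
    }

    // Order by attribute1, then attribute2, so two objects compare as equal
    // only when both attributes match.
    static final Comparator<SomeObject> ORDER =
            Comparator.comparingInt((SomeObject o) -> o.attribute1)
                      .thenComparingInt(o -> o.attribute2);

    // Builds a self-sorting set from the given objects.
    static TreeSet<SomeObject> build(SomeObject... objects) {
        TreeSet<SomeObject> set = new TreeSet<>(ORDER);
        for (SomeObject o : objects) {
            set.add(o);
        }
        return set;
    }

    // Checks for existence and removes in one step:
    // remove() returns true only if a matching element was present.
    static boolean findAndRemove(TreeSet<SomeObject> set, SomeObject probe) {
        return set.remove(probe);
    }

    public static void main(String[] args) {
        TreeSet<SomeObject> set = build(
                new SomeObject(1, 10), new SomeObject(1, 20), new SomeObject(2, 30));
        System.out.println(findAndRemove(set, new SomeObject(1, 20))); // true
        System.out.println(findAndRemove(set, new SomeObject(1, 20))); // false
        System.out.println(set.size());                                // 2
    }
}
```

The probe does not have to be the same instance that was inserted; it only has to compare as equal under the set's comparator.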

#2

Building on Arian's answer, you can also use TreeBag from the Apache Commons Collections library. It is backed by a TreeMap and maintains a count for repeated elements.
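
To show the counting idea without pulling in the library, here is a small sketch that uses only java.util.TreeMap. It is not TreeBag itself: the class CountingLookup and its methods are hypothetical, and it assumes look-ups are by an int sort key only, with each successful look-up consuming one occurrence.

```java
import java.util.TreeMap;

class CountingLookup {
    // Maps each key to the number of occurrences still available.
    private final TreeMap<Integer, Integer> counts = new TreeMap<>();

    // Registers one more occurrence of key.
    void add(int key) {
        counts.merge(key, 1, Integer::sum);
    }

    // Consumes one occurrence of key; returns false if none is left.
    boolean consume(int key) {
        Integer n = counts.get(key);
        if (n == null) {
            return false;
        }
        if (n == 1) {
            counts.remove(key);     // last occurrence: drop the entry
        } else {
            counts.put(key, n - 1); // otherwise just decrement the count
        }
        return true;
    }

    // Number of occurrences of key that have not been consumed yet.
    int remaining(int key) {
        return counts.getOrDefault(key, 0);
    }
}
```

Both add and consume take logarithmic time in the number of distinct keys, which is the same bound the TreeMap-backed bag would give.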

#3

If you want, you can put all the elements into some sort of linked list whose nodes are also connected in heap form when you sort them. That way, finding an element would take O(log n) time, and you can still delete nodes in place.
