There seem to be many different implementations and ways to generate thread-safe Sets in Java. Some examples include:
1) CopyOnWriteArraySet
2) Collections.synchronizedSet(Set set)
3) ConcurrentSkipListSet
4) Collections.newSetFromMap(new ConcurrentHashMap())
5) Other Sets generated in a way similar to (4)
These examples come from Concurrency Pattern: Concurrent Set implementations in Java 6
Could someone please simply explain the differences, advantages, and disadvantages of these examples and others? I'm having trouble understanding and keeping straight everything from the Java Std Docs.
3 Answers
#1
168
1) The CopyOnWriteArraySet is a quite simple implementation - it basically keeps a list of elements in an array, and when changing the set, it copies the array. Iterations and other accesses which are running at this time continue with the old array, avoiding the need for synchronization between readers and writers (though writing itself needs to be synchronized). The normally fast set operations (especially contains()) are quite slow here, as the array is searched in linear time.
Use this only for really small sets which will be read (iterated) often and changed seldom. (Swing's listener sets would be an example, but these are not really sets, and should only be used from the EDT anyway.)
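As an illustration (not part of the original answer), here is a minimal sketch of the read-often, write-rarely usage this set is meant for; the class and element names are made up:

import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

public class CopyOnWriteArraySetDemo {
    public static void main(String[] args) {
        // Every add/remove copies the backing array, so keep the set small
        // and the write rate low; reads and iteration never block.
        Set<String> listeners = new CopyOnWriteArraySet<String>();
        listeners.add("loggingListener");
        listeners.add("metricsListener");

        // Iteration works on a snapshot of the array: concurrent modifications
        // are not seen by this loop and never cause ConcurrentModificationException.
        for (String listener : listeners) {
            System.out.println("notify " + listener);
        }

        // contains() is a linear scan of the array, i.e. O(n).
        System.out.println(listeners.contains("metricsListener"));
    }
}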
2) Collections.synchronizedSet will simply wrap a synchronized-block around each method of the original set. You should not access the original set directly. This means that no two methods of the set can be executed concurrently (one will block until the other finishes) - this is thread-safe, but you will not have concurrency if multiple threads are really using the set. If you use the iterator, you usually still need to synchronize externally to avoid ConcurrentModificationExceptions when modifying the set between iterator calls. The performance will be like the performance of the original set (but with some synchronization overhead, and blocking if used concurrently).
Use this if you only have low concurrency, and want to be sure all changes are immediately visible to the other threads.
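A small sketch (the class name is illustrative) of the external locking the wrapper still requires around iteration:

import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SynchronizedSetDemo {
    public static void main(String[] args) {
        Set<String> set = Collections.synchronizedSet(new HashSet<String>());
        set.add("a");
        set.add("b");

        // Individual calls (add, remove, contains) are synchronized by the wrapper,
        // but iteration spans many calls and must be guarded manually on the wrapper:
        synchronized (set) {
            for (String s : set) {
                System.out.println(s);
            }
        }
    }
}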
3) ConcurrentSkipListSet is the concurrent SortedSet implementation, with most basic operations in O(log n). It allows concurrent adding/removing and reading/iteration, where iteration may or may not reflect changes made since the iterator was created. The bulk operations are simply multiple single calls, and are not atomic - other threads may observe only some of them.
Obviously you can use this only if you have some total order on your elements. This looks like an ideal candidate for high-concurrency situations, for not-too-large sets (because of the O(log n)).
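As an illustration (not from the original answer), a minimal sketch with Comparable elements; the values are made up:

import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class SkipListSetDemo {
    public static void main(String[] args) {
        // Elements must be Comparable (or a Comparator must be supplied),
        // because the skip list keeps them in sorted order.
        NavigableSet<Integer> set = new ConcurrentSkipListSet<Integer>();
        set.add(42);
        set.add(7);
        set.add(19);

        // add/remove/contains are O(log n) and safe to call from many threads.
        System.out.println(set.first());     // 7
        System.out.println(set.headSet(20)); // [7, 19]

        // Iteration is weakly consistent: it never throws
        // ConcurrentModificationException, but may or may not show concurrent updates.
        for (Integer i : set) {
            System.out.println(i);
        }
    }
}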
4) For the ConcurrentHashMap (and the Set derived from it): Here most basic operations are (on average, if you have a good and fast hashCode()) in O(1) (but might degenerate to O(n)), like for HashMap/HashSet. There is a limited concurrency for writing (the table is partitioned, and write access will be synchronized on the needed partition), while read access is fully concurrent to itself and the writing threads (but might not yet see the results of the changes currently being written). The iterator may or may not see changes since it was created, and bulk operations are not atomic. Resizing is slow (as for HashMap/HashSet), thus try to avoid this by estimating the needed size on creation (and using about 1/3 more than that, as it resizes when 3/4 full).
Use this when you have large sets, a good (and fast) hash function and can estimate the set size and needed concurrency before creating the map.
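A possible sketch (the size estimate and names are illustrative) of creating such a set with the capacity sized up front:

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentHashSetDemo {
    public static void main(String[] args) {
        // Expecting roughly 10,000 elements: presize with some headroom so the
        // table does not have to resize while filling up (it resizes at 3/4 full).
        int expectedSize = 10000;
        Set<String> set = Collections.newSetFromMap(
                new ConcurrentHashMap<String, Boolean>(expectedSize * 4 / 3 + 1));

        set.add("alpha");
        set.add("beta");

        // Reads are fully concurrent; writes only lock part of the table.
        System.out.println(set.contains("alpha"));
    }
}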
5) Are there other concurrent map implementations one could use here?
#2
17
It is possible to combine the contains() performance of HashSet with the concurrency-related properties of CopyOnWriteArraySet by using the AtomicReference<Set> and replacing the whole set on each modification.
The implementation sketch:
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Abstract sketch: the remaining Set methods (size(), iterator(), ...) would
// delegate to ref.get() in the same read-only way as contains().
public abstract class CopyOnWriteSet<E> implements Set<E> {

    private final AtomicReference<Set<E>> ref;

    protected CopyOnWriteSet( Collection<? extends E> c ) {
        ref = new AtomicReference<Set<E>>( new HashSet<E>( c ) );
    }

    @Override
    public boolean contains( Object o ) {
        // Reads are lock-free: just look into the current snapshot.
        return ref.get().contains( o );
    }

    @Override
    public boolean add( E e ) {
        while ( true ) {
            Set<E> current = ref.get();
            if ( current.contains( e ) ) {
                return false;
            }
            // Copy, modify the copy, then try to swap it in atomically.
            Set<E> modified = new HashSet<E>( current );
            modified.add( e );
            if ( ref.compareAndSet( current, modified ) ) {
                return true;
            }
            // CAS failed: another thread changed the set; retry with the new snapshot.
        }
    }

    @Override
    public boolean remove( Object o ) {
        while ( true ) {
            Set<E> current = ref.get();
            if ( !current.contains( o ) ) {
                return false;
            }
            Set<E> modified = new HashSet<E>( current );
            modified.remove( o );
            if ( ref.compareAndSet( current, modified ) ) {
                return true;
            }
        }
    }
}
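Note the compareAndSet retry loop: if another thread swapped in a new snapshot between ref.get() and compareAndSet(), the CAS fails and the copy-modify-swap is repeated. Writes are therefore lock-free but copy the whole set each time, so this approach (like CopyOnWriteArraySet) only pays off for small sets that are read far more often than they are modified.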
#3
10
If the Javadocs don't help, you probably should just find a book or article to read about data structures. At a glance:
- CopyOnWriteArraySet makes a new copy of the underlying array every time you mutate the collection, so writes are slow and Iterators are fast and consistent.
- Collections.synchronizedSet() uses old-school synchronized method calls to make a Set threadsafe. This would be a low-performing version.
- ConcurrentSkipListSet offers performant writes with inconsistent batch operations (addAll, removeAll, etc.) and Iterators.
- Collections.newSetFromMap(new ConcurrentHashMap()) has the semantics of ConcurrentHashMap, which I believe isn't necessarily optimized for reads or writes, but like ConcurrentSkipListSet, has inconsistent batch operations.