Testing initialization safety of final fields

I am trying to test the initialization safety of final fields as guaranteed by the JLS, for a paper I'm writing. However, I am unable to get it to 'fail' with my current code. Can someone tell me what I'm doing wrong, or is this just something I have to run over and over until I see a failure with some unlucky timing?

Here is my code:

public class TestClass {

    final int x;
    int y;
    static TestClass f;

    public TestClass() {
        x = 3;
        y = 4;
    }

    static void writer() {
        TestClass.f = new TestClass();
    }

    static void reader() {
        if (TestClass.f != null) {
            int i = TestClass.f.x; // guaranteed to see 3
            int j = TestClass.f.y; // could see 0

            System.out.println("i = " + i);
            System.out.println("j = " + j);
        }
    }
}

My threads call it like this:

public class TestClient {

    public static void main(String[] args) {

        for (int i = 0; i < 10000; i++) {
            Thread writer = new Thread(new Runnable() {
                @Override
                public void run() {
                    TestClass.writer();
                }
            });

            writer.start();
        }

        for (int i = 0; i < 10000; i++) {
            Thread reader = new Thread(new Runnable() {
                @Override
                public void run() {
                    TestClass.reader();
                }
            });

            reader.start();
        }
    }
}

I have run this scenario many, many times. My current loops spawn 10,000 threads, but I've also tried 1,000, 100,000, and even a million. Still no failure: I always see 3 and 4 for the two values. How can I get this to fail?

8 Answers

#1

From Java 5.0 onward, you are guaranteed that all threads will see the final state set by the constructor.

If you want to see this fail, you could try an older JVM like 1.3.

I wouldn't print out every test; I would print only the failures. You could get one failure in a million and miss it, but if you print only failures, they should be easy to spot.

A simpler way to see this fail is to add the following to the writer:

f.y = 5;

and test for

int y = TestClass.f.y; // could see 0, 4 or 5
if (y != 5)
    System.out.println("y = " + y);
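
The suggestion above could be wired up roughly as follows. This is only a sketch: the class name FailureOnlyCheck, the helper readUnexpectedY, and the thread count are invented for illustration, and on many JVMs it may still print no failures at all.

```java
import java.util.concurrent.atomic.AtomicInteger;

class FailureOnlyCheck {
    final int x;
    int y;
    static FailureOnlyCheck f; // racy publication, as in the question

    FailureOnlyCheck() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new FailureOnlyCheck();
        f.y = 5; // the extra write suggested above
    }

    // Hypothetical helper: returns the value of y if it is not 5,
    // or -1 if f was still null or y was already 5.
    static int readUnexpectedY() {
        FailureOnlyCheck local = f; // read the shared reference once
        if (local == null) {
            return -1;
        }
        int y = local.y; // could see 0, 4 or 5
        return (y != 5) ? y : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger failures = new AtomicInteger();
        Thread[] threads = new Thread[2000];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = (i % 2 == 0)
                    ? new Thread(FailureOnlyCheck::writer)
                    : new Thread(() -> {
                        int y = readUnexpectedY();
                        if (y != -1) { // report failures only
                            failures.incrementAndGet();
                            System.out.println("y = " + y);
                        }
                    });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("failures: " + failures.get());
    }
}
```

Counting failures in an AtomicInteger and printing a single summary line at the end keeps the output short enough that a one-in-a-million event is not buried.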

#2

I wrote the spec. The TL;DR version of this answer is that just because it may see 0 for y, that doesn't mean it is guaranteed to see 0 for y.

In this case, the final field spec guarantees that you will see 3 for x, as you point out. Think of the writer thread as having 4 instructions:

r1 = <create a new TestClass instance>
r1.x = 3;
r1.y = 4;
f = r1;

You could fail to see 3 for x only if the compiler reordered this code as follows:

r1 = <create a new TestClass instance>
f = r1;
r1.x = 3;
r1.y = 4;

In practice, the guarantee for final fields is usually implemented by ensuring that the constructor finishes before any subsequent program actions take place. Imagine someone erected a big barrier between r1.y = 4 and f = r1. So, in practice, if an object has any final fields, you are likely to get visibility for all of its fields.
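
To picture that barrier, the writer's four instructions can be written out by hand with an explicit fence. This is an illustration only, assuming Java 9 or later, where VarHandle.storeStoreFence() is available; the class name BarrierSketch is made up, the fields are non-final because they are assigned outside a constructor, and nothing here claims to be how any particular JVM actually implements the final-field guarantee.

```java
import java.lang.invoke.VarHandle;

class BarrierSketch {
    int x;
    int y;
    static BarrierSketch f;

    // The writer's four instructions, spelled out by hand.
    static void writer() {
        BarrierSketch r1 = new BarrierSketch(); // r1 = <create a new instance>
        r1.x = 3;
        r1.y = 4;
        // The "big barrier": stores before the fence are not
        // reordered with stores after it.
        VarHandle.storeStoreFence();
        f = r1;
    }

    public static void main(String[] args) {
        writer();
        System.out.println("x = " + f.x + ", y = " + f.y);
    }
}
```

The sketch only shows the ordering on the writer side. It is not offered as a publication idiom; the point is that a barrier placed after both stores covers y as a side effect, even though only x would need it.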

Now, in theory, someone could write a compiler that isn't implemented that way. In fact, people have often talked about testing code by writing the most malicious compiler possible. This is particularly common among the C++ people, who have lots and lots of undefined corners in their language that can lead to terrible bugs.

#3

I'd like to see a test which fails or an explanation why it's not possible with current JVMs.

Multithreading and Testing

You can't prove that a multithreaded application is broken (or not) by testing, for several reasons:

the problem might only appear once every x hours of running, with x so high that you are unlikely to see it in a short test

the problem might only appear with some combinations of JVM and processor architecture

In your case, making the test break (i.e. observing y == 0) would require the program to see a partially constructed object in which some fields have been properly constructed and some have not. This typically does not happen on x86 / HotSpot.

How to determine whether multithreaded code is broken?

The only way to prove that the code is valid or broken is to apply the JLS rules to it and see what the outcome is. With publication through a data race (no synchronization around the publication of the object or of y), the JLS provides no guarantee that y will be seen as 4 (it could be seen with its default value of 0).
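
For contrast, here is a sketch of the same class with the data race removed by declaring the static reference volatile. The class name VolatilePublication and the helper readerSeesInitialized are invented for this example. Under the JLS happens-before rules, a thread that reads a non-null f through the volatile field also sees the constructor's write to y.

```java
class VolatilePublication {
    final int x;
    int y;
    static volatile VolatilePublication f; // volatile write/read replaces the data race

    VolatilePublication() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new VolatilePublication();
    }

    // Hypothetical helper: true if no object is published yet, or if the
    // published object shows both constructor writes.
    static boolean readerSeesInitialized() {
        VolatilePublication local = f; // single volatile read
        return local == null || (local.x == 3 && local.y == 4);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread w = new Thread(VolatilePublication::writer);
        Thread r = new Thread(() -> {
            if (!readerSeesInitialized()) {
                System.out.println("saw a partially initialized object");
            }
        });
        w.start();
        r.start();
        w.join();
        r.join();
        System.out.println("after join: y = " + f.y);
    }
}
```

This version is of no use for provoking the failure the question asks about; it only marks the line between what the JLS guarantees and what it leaves open.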

Can that code really break?

In practice, some JVMs will be better at making the test fail. For example some compilers (cf "A test case showing that it doesn't work" in this article) could transform TestClass.f = new TestClass(); into something like the following (because it is published via a data race):

(1) allocate memory
(2) write fields default values (x = 0; y = 0) //always first
(3) write final fields final values (x = 3)    //must happen before publication
(4) publish object                             //TestClass.f = new TestClass();
(5) write non final fields (y = 4)             //has been reordered after (4)

The JLS mandates that (2) and (3) happen before the object publication (4). However, due to the data race, no guarantee is given for (5) - it would actually be a legal execution if a thread never observed that write operation. With the proper thread interleaving, it is therefore conceivable that if reader runs between (4) and (5), you will get the desired output.

I don't have a Symantec JIT at hand so can't prove it experimentally :-)
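
Without such a JIT, the reordered sequence can at least be emulated by hand in a single thread. The sketch below deliberately leaks this from the constructor so that publication (4) comes before the write to y (5); the class name ReorderedPublication and the field yAtPublication are made up for this illustration. It shows what a reader arriving between (4) and (5) would observe, not what any real compiler does.

```java
class ReorderedPublication {
    final int x;
    int y;
    static ReorderedPublication f;
    static int yAtPublication = -1; // stand-in for a reader running between (4) and (5)

    ReorderedPublication() {
        // (1) and (2): allocation and default values are done by the JVM
        x = 3;                 // (3) final field written before publication
        f = this;              // (4) publication, moved ahead of the write to y on purpose
        yAtPublication = f.y;  // what a reader would observe at this point
        y = 4;                 // (5) non-final field written after publication
    }

    public static void main(String[] args) {
        new ReorderedPublication();
        System.out.println("y seen between (4) and (5): " + yAtPublication);
        System.out.println("y seen afterwards: " + f.y);
    }
}
```

Because everything runs in one thread the result is deterministic: the read between (4) and (5) returns the default value of y, and the later read returns the value assigned by the constructor.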

#4

Here is an example of default values of non-final fields being observed even though the constructor sets them and doesn't leak this. It is based on my other question, which is a bit more complicated. I keep seeing people say it can't happen on x86, but my example happens on x64 Linux with OpenJDK 6...

#5

-1

What if you modified the constructor to do this:

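public TestClass() {
    try {
        Thread.sleep(300); // delay before the fields are assigned
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // sleep() throws a checked exception
    }
    x = 3;
    y = 4;
}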

I am not an expert on the JLS rules for finals and initializers, but common sense tells me this should delay setting x long enough for writers to register another value?

#6

-2

What if one changes the scenario to the following?

public class TestClass {

    final int x;
    static TestClass f;

    public TestClass() {
        x = 3;
    }

    int y = 4;

    // etc...

}

#7

-2

A better understanding of why this test does not fail comes from understanding what actually happens when a constructor is invoked. JVM bytecode is stack-based. TestClass.f = new TestClass(); consists of four actions. First the new instruction is executed; like malloc in C/C++, it allocates memory and places a reference to it on top of the stack. Then the reference is duplicated for invoking the constructor. The constructor is in fact like any other instance method, and it is invoked with the duplicated reference. Only after that is the reference stored in the method frame or in a field, where it becomes accessible from elsewhere.

Before that last step, the reference to the object exists only on top of the creating thread's stack and nobody else can see it. So it makes no difference what kind of field you are working with: both will be initialized if TestClass.f != null. You can read the x and y fields from different objects, but this will not result in y = 0. For more information, see the JVM Specification and articles on stack-oriented programming languages.

UPD: One important thing I forgot to mention: under the Java memory model there is no way to see a partially initialized object, provided you do not publish this from inside the constructor.

JLS:

An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.

JLS:

There is a happens-before edge from the end of a constructor of an object to the start of a finalizer for that object.

Broader explanation of this point of view:

It turns out that the end of an object's constructor happens-before the execution of its finalize method. In practice, what this means is that any writes that occur in the constructor must be finished and visible to any reads of the same variable in the finalizer, just as if those variables were volatile.

UPD: That was the theory; let's turn to practice.

Consider the following code, with simple non-final variables:

public class Test {

    int myVariable1;
    int myVariable2;

    Test() {
        myVariable1 = 32;
        myVariable2 = 64;
    }

    public static void main(String args[]) throws Exception {
        Test t = new Test();
        System.out.println(t.myVariable1 + t.myVariable2);
    }
}

The following command displays the machine instructions generated by the JVM (how to use it is described in a wiki):

java.exe -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -Xcomp -XX:PrintAssemblyOptions=hsdis-print-bytes -XX:CompileCommand=print,*Test.main Test

Its output:

...
0x0263885d: movl   $0x20,0x8(%eax)    ;...c7400820 000000
                                    ;*putfield myVariable1
                                    ; - Test::<init>@7 (line 12)
                                    ; - Test::main@4 (line 17)
0x02638864: movl   $0x40,0xc(%eax)    ;...c7400c40 000000
                                    ;*putfield myVariable2
                                    ; - Test::<init>@13 (line 13)
                                    ; - Test::main@4 (line 17)
0x0263886b: nopl   0x0(%eax,%eax,1)   ;...0f1f4400 00
...

The field assignments are followed by a NOPL instruction; one of its purposes is to prevent instruction reordering.

Why does this happen? According to the specification, finalization happens after the constructor returns, so the GC thread can't see a partially initialized object. At the CPU level, the GC thread is not distinguished from any other thread. If such guarantees are provided to the GC, then they are provided to any other thread. This is the most obvious solution to such a restriction.

Results:

1) The constructor is not synchronized; synchronization is done by other instructions.

2) Assignment to the object's reference can't happen before the constructor returns.

#8

-3

What's going on in this thread? Why should that code fail in the first place?

You launch thousands of threads that will each do the following:

TestClass.f = new TestClass();

What that does, in order:

evaluate TestClass.f to find out its memory location

evaluate new TestClass(): this creates a new instance of TestClass, whose constructor will initialize both x and y

assign the right-hand value to the left-hand memory location

An assignment is an atomic operation which is always performed after the right-hand value has been generated. Here is a citation from the Java language spec (see the first bulleted point), but it really applies to any sane language.

This means that while the TestClass() constructor is taking its time to do its job, and x and y could conceivably still be zero, the reference to the partially initialized TestClass object lives only in that thread's stack or CPU registers, and has not been written to TestClass.f.

Therefore TestClass.f will always contain:

either null, at the start of your program, before anything else is assigned to it,

or a fully initialized TestClass instance.
