Common JVM Concepts: Lock Elision

Published: 2025-03-17

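Question

Does synchronized (a heavyweight synchronization lock) disable all compiler optimizations?

Background

Under the current Java Memory Model, a lock that is never observed is not guaranteed to produce any memory effects. Among other things, this means that synchronizing on a non-shared object is futile, so the runtime does not have to do anything there. This gives the compiler an opportunity to optimize.

So, if escape analysis finds that the object does not escape, the compiler is free to eliminate the synchronization.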
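To make the distinction concrete, here is a minimal sketch that places a lock object which never leaves its method next to one that is shared through a field. The class and method names (EscapeSketch, incrementWithLocalLock, incrementWithSharedLock) are hypothetical and are not part of the benchmark in this article. Whether the JIT actually elides the first lock depends on the JVM, its inlining decisions, and the flags in effect.

```java
public class EscapeSketch {
    // Shared lock: reachable through a field, so it escapes the method
    // and other threads could synchronize on it.
    private final Object sharedLock = new Object();
    private int counter;

    // The lock object is created here and never leaves the method.
    // Escape analysis may prove it non-escaping, which makes this
    // synchronized block a candidate for elision.
    public int incrementWithLocalLock() {
        Object localLock = new Object();
        synchronized (localLock) {
            return ++counter;
        }
    }

    // The lock lives in a field, so the JIT cannot assume that no other
    // thread will contend on it. This synchronization has to stay.
    public int incrementWithSharedLock() {
        synchronized (sharedLock) {
            return ++counter;
        }
    }

    public static void main(String[] args) {
        EscapeSketch s = new EscapeSketch();
        s.incrementWithLocalLock();
        s.incrementWithSharedLock();
        System.out.println(s.counter); // prints 2
    }
}
```

Both methods return the same values when called from a single thread; the only expected difference is the cost of the lock that cannot be removed.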

Experiment

Test case

Increment a value with and without synchronization on a new object.

Source

import org.openjdk.jmh.annotations.*;
import java.util.concurrent.TimeUnit;

@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(3)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class LockElision {

    int x;

    @Benchmark
    public void baseline() {
        x++;
    }

    @Benchmark
    public void locked() {
        synchronized (new Object()) {
            x++;
        }
    }
}

Running the test case above with -prof perfnorm gives the following results:

Benchmark                                   Mode  Cnt   Score    Error  Units

LockElision.baseline                        avgt   15   0.268 ±  0.001  ns/op
LockElision.baseline:CPI                    avgt    3   0.200 ±  0.009   #/op
LockElision.baseline:L1-dcache-loads        avgt    3   2.035 ±  0.101   #/op
LockElision.baseline:L1-dcache-stores       avgt    3  ≈ 10⁻³            #/op
LockElision.baseline:branches               avgt    3   1.016 ±  0.046   #/op
LockElision.baseline:cycles                 avgt    3   1.017 ±  0.024   #/op
LockElision.baseline:instructions           avgt    3   5.076 ±  0.346   #/op

LockElision.locked                          avgt   15   0.268 ±  0.001  ns/op
LockElision.locked:CPI                      avgt    3   0.200 ±  0.005   #/op
LockElision.locked:L1-dcache-loads          avgt    3   2.024 ±  0.237   #/op
LockElision.locked:L1-dcache-stores         avgt    3  ≈ 10⁻³            #/op
LockElision.locked:branches                 avgt    3   1.014 ±  0.047   #/op
LockElision.locked:cycles                   avgt    3   1.015 ±  0.012   #/op
LockElision.locked:instructions             avgt    3   5.062 ±  0.154   #/op

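The results are identical: the time is the same, and so are the numbers of loads, stores, cycles, and instructions. In all likelihood, this means the generated code is the same. The assembly looks like this: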

14.50%   16.97%  ↗  incl   0xc(%r8)              ; increment field
76.82%   76.05%  │  movzbl 0x94(%r9),%r10d       ; JMH infra: do another @Benchmark
 0.83%    0.10%  │  add    $0x1,%rbp
 0.47%    0.78%  │  test   %eax,0x15ec6bba(%rip)
 0.47%    0.36%  │  test   %r10d,%r10d
                 ╰  je     BACK

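The lock is elided completely: both the allocation and the synchronization are gone. If we run with the JVM flag -XX:-EliminateLocks, or disable escape analysis (EA) with -XX:-DoEscapeAnalysis (which breaks every optimization that depends on EA, including lock elision), the counters for locked balloon and show the cost of the allocation and of simple synchronization. The results are as follows: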

Benchmark                                   Mode  Cnt   Score    Error  Units

LockElision.baseline                        avgt   15   0.268 ±  0.001  ns/op
LockElision.baseline:CPI                    avgt    3   0.200 ±  0.001   #/op
LockElision.baseline:L1-dcache-loads        avgt    3   2.029 ±  0.082   #/op
LockElision.baseline:L1-dcache-stores       avgt    3   0.001 ±  0.001   #/op
LockElision.baseline:branches               avgt    3   1.016 ±  0.028   #/op
LockElision.baseline:cycles                 avgt    3   1.015 ±  0.014   #/op
LockElision.baseline:instructions           avgt    3   5.078 ±  0.097   #/op

LockElision.locked                          avgt   15  11.590 ±  0.009  ns/op
LockElision.locked:CPI                      avgt    3   0.998 ±  0.208   #/op
LockElision.locked:L1-dcache-loads          avgt    3  11.872 ±  0.686   #/op
LockElision.locked:L1-dcache-stores         avgt    3   5.024 ±  1.019   #/op
LockElision.locked:branches                 avgt    3   9.027 ±  1.840   #/op
LockElision.locked:cycles                   avgt    3  44.236 ±  3.364   #/op
LockElision.locked:instructions             avgt    3  44.307 ±  9.954   #/op
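When comparing runs like these, it can help to confirm what the JVM itself reports for the two flags. The following is a minimal sketch, assuming a HotSpot JVM that exposes com.sun.management.HotSpotDiagnosticMXBean. The class name ElisionFlags and the "unavailable" fallback are illustrative choices, and both flags belong to the C2 compiler, so a JVM built without C2 may not know them.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class ElisionFlags {
    // Returns the current value of a HotSpot flag as a string, or
    // "unavailable" if this JVM does not expose a flag with that name.
    public static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            // getVMOption throws IllegalArgumentException for unknown flags.
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("DoEscapeAnalysis = " + flagValue("DoEscapeAnalysis"));
        System.out.println("EliminateLocks   = " + flagValue("EliminateLocks"));
    }
}
```

Starting this class with -XX:-EliminateLocks or -XX:-DoEscapeAnalysis on the command line is one way to check that the option reached the JVM before attributing a change in the numbers to it.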

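Summary

Lock elision is another optimization enabled by escape analysis, and it removes some superfluous synchronization. It is especially beneficial when an internally synchronized implementation does not escape into the wild: in that case, the synchronization can be dropped entirely!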
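A familiar case of an internally synchronized implementation is java.lang.StringBuffer, whose methods are declared synchronized. The sketch below is an illustration rather than something measured in this article, and the class name LocalBufferSketch is hypothetical. It keeps the buffer confined to one method, which is the shape escape analysis looks for; whether the locks are actually removed depends on the JIT inlining the append calls.

```java
public class LocalBufferSketch {
    // The StringBuffer is created, used, and discarded inside this method.
    // Only the resulting String leaves, so the buffer itself does not escape
    // and its internal locking is a candidate for elision.
    public static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();
        sb.append(a); // synchronized method on a thread-confined object
        sb.append(b);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("lock", "elision")); // prints lockelision
    }
}
```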