Empty methods noticeably slower in Java 11 than Java 8

Question

You are measuring empty benchmarks, not empty methods. In other words, you are measuring the minimal infrastructure code that handles the benchmark itself. This is easy to dissect, because you’d expect only a few instructions on the hot path. JMH’s -prof perfasm or -prof xperfasm would show you those hottest instructions in seconds.
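To make "infrastructure code" concrete, here is a rough plain-Java sketch of the kind of loop a harness wraps around an empty benchmark method. It is a simplified model, not JMH's actual generated code: the class name HarnessLoopSketch, the isDone flag, and the measure helper are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

// Simplified, hypothetical model of a harness measurement loop.
// This is NOT JMH's generated code; all names are illustrative.
final class HarnessLoopSketch {
    // Set by a timer thread when the measurement window ends.
    // The loop re-reads it on every iteration because it is volatile.
    static volatile boolean isDone;

    // The "empty benchmark" body; the JIT is expected to inline it away.
    static void emptyBenchmark() { }

    // Spins on the empty benchmark until the flag is set and
    // returns the number of completed iterations.
    static long measure(long millis) throws InterruptedException {
        isDone = false;
        Thread timer = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(millis);
            } catch (InterruptedException ignored) {
                // fall through and stop the loop anyway
            }
            isDone = true;
        });
        timer.setDaemon(true);
        timer.start();

        long ops = 0;
        do {
            emptyBenchmark();   // nothing left after inlining
            ops++;              // operation counter increment
        } while (!isDone);      // flag load, test, and back-edge branch
        timer.join();
        return ops;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("iterations: " + measure(200));
    }
}
```

With the benchmark body inlined to nothing, what is left per iteration is a flag load, a counter increment, a compare, and a branch, plus whatever the JIT adds on the loop back-edge (such as a safepoint poll). That is why any extra instruction on the back-edge shows up so clearly in the per-operation time.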

I think the effect is due to Thread-Local Handshakes (JEP 312), see:

8u191: 0.389 ± 0.029 ns/op [so far so good]

  3.60%  ↗  ...a2: movzbl 0x94(%r8),%r10d
  0.63%  │  ...aa: add    $0x1,%rbp
 32.82%  │  ...ae: test   %eax,0x1765654c(%rip) ; global safepoint poll
 58.14%  │  ...b4: test   %r10d,%r10d
         ╰  ...b7: je     ...a2

11.0.2: 0.585 ± 0.014 ns/op [oops, regression]

  0.31%  ↗  ...70: movzbl 0x94(%r9),%r10d    
  0.19%  │  ...78: mov    0x108(%r15),%r11  ; reading the thread-local poll addr
 25.62%  │  ...7f: add    $0x1,%rbp          
 35.10%  │  ...83: test   %eax,(%r11)       ; thread-local safepoint poll
 34.91%  │  ...86: test   %r10d,%r10d
         ╰  ...89: je     ...70

11.0.2, -XX:-ThreadLocalHandshakes: 0.399 ± 0.048 ns/op [back to 8u perf]

  5.64%  ↗  ...62: movzbl 0x94(%r8),%r10d    
  0.91%  │  ...6a: add    $0x1,%rbp          
 34.36%  │  ...6e: test   %eax,0x179be88c(%rip) ; global safepoint poll
 54.79%  │  ...74: test   %r10d,%r10d
         ╰  ...77: je     ...62

I think this is visible mostly in tight loops like this one, where the extra load of the thread-local poll address is a noticeable share of the work per iteration.

UPD: Hopefully, more details here.
