I agree to the point that programming with streams is nice and easier for some scenarios but when we’re losing out on performance, why do we need to use them?
Performance is rarely an issue. Typically only around 10% of your streams would need to be rewritten as loops to get the performance you need.
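To give a rough idea of what such a rewrite involves, here is a minimal sketch. The class and method names and the sum-of-even-squares task are made up for illustration; they are not from your code. Both methods compute the same result, so you can start with the stream and fall back to the loop only where profiling shows it matters.

```java
import java.util.List;

class StreamVsLoop {
    // Stream version: concise, and usually fast enough.
    static long sumOfEvenSquaresStream(List<Integer> values) {
        return values.stream()
                .filter(v -> v % 2 == 0)
                .mapToLong(v -> v.longValue() * v)
                .sum();
    }

    // Loop version: the kind of rewrite you would reserve for a measured hot spot.
    static long sumOfEvenSquaresLoop(List<Integer> values) {
        long sum = 0;
        for (int v : values) {
            if (v % 2 == 0) {
                sum += (long) v * v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4, 5, 6);
        System.out.println(sumOfEvenSquaresStream(values)); // prints 56
        System.out.println(sumOfEvenSquaresLoop(values));   // prints 56
    }
}
```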
Is there something I’m missing out on?
Going parallel is much easier with streams, since you only need to call parallelStream(), and the result is possibly more efficient too, as it's hard to write efficient concurrent code by hand.
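As a hedged illustration, the sketch below switches between stream() and parallelStream() with a single flag. The prime-counting task and the names `ParallelCount`, `isPrime` and `countPrimes` are hypothetical, chosen only because the work per element is CPU-bound. Whether the parallel version is actually faster depends on the data size and the cost per element, so measure before relying on it.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelCount {
    // Simple trial-division primality check (illustrative, not optimised).
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int i = 2; (long) i * i <= n; i++) {
            if (n % i == 0) {
                return false;
            }
        }
        return true;
    }

    // The only difference between the two modes is which stream factory is called.
    static long countPrimes(List<Integer> candidates, boolean parallel) {
        Stream<Integer> stream = parallel ? candidates.parallelStream() : candidates.stream();
        return stream.filter(ParallelCount::isPrime).count();
    }

    public static void main(String[] args) {
        List<Integer> candidates = IntStream.rangeClosed(1, 100)
                .boxed()
                .collect(Collectors.toList());
        System.out.println(countPrimes(candidates, false)); // prints 25
        System.out.println(countPrimes(candidates, true));  // prints 25
    }
}
```

Writing the same thing with an explicit thread pool would mean splitting the list, submitting tasks and combining partial counts yourself.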
Which is the scenario in which streams perform equal to loops? Is it only in the case where your function defined takes a lot of time, resulting in a negligible loop performance?
Your benchmark is flawed in the sense that the code hasn't been JIT-compiled when the measurement starts, so you are largely timing interpreted code and compilation. I would run the whole test repeatedly in a loop, as JMH does, or I would just use JMH.
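If you don't want to pull in JMH, a hand-rolled version of "do the whole test in a loop" could look roughly like the sketch below. It is an assumption-laden illustration, not a replacement for JMH: the workload, the round count and the names `WarmupBenchmark`, `sumLoop`, `sumStream` and `timeRounds` are all invented here. The point is only that early rounds include warm-up, so you should compare the later rounds.

```java
import java.util.function.LongSupplier;
import java.util.stream.LongStream;

class WarmupBenchmark {
    // Loop version of the workload: sum of 2*x for x in [0, n).
    static long sumLoop(int n) {
        long sum = 0;
        for (long x = 0; x < n; x++) {
            sum += x * 2;
        }
        return sum;
    }

    // Stream version of the same workload.
    static long sumStream(int n) {
        return LongStream.range(0, n).map(x -> x * 2).sum();
    }

    // Runs the task `rounds` times and records the elapsed nanoseconds of each round.
    static long[] timeRounds(LongSupplier task, int rounds) {
        long[] nanos = new long[rounds];
        long sink = 0;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            nanos[i] = System.nanoTime() - start;
        }
        // Use the accumulated result so the JIT cannot discard the work entirely.
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink);
        }
        return nanos;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        int rounds = 20;
        long[] loop = timeRounds(() -> sumLoop(n), rounds);
        long[] stream = timeRounds(() -> sumStream(n), rounds);
        System.out.println("loop   first/last round (ns): " + loop[0] + " / " + loop[rounds - 1]);
        System.out.println("stream first/last round (ns): " + stream[0] + " / " + stream[rounds - 1]);
    }
}
```

JMH handles warm-up iterations, forking and dead-code elimination for you, which is why it is the safer choice for anything you intend to draw conclusions from.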
In none of the scenarios could I see streams taking advantage of branch prediction
Branch prediction is a CPU feature, not a JVM or streams feature.