Come on, of course this is not a thorough benchmark. It's just a random forum thread where someone wants to get a feeling for the performance of a new technology.
They could have used @benchmark instead of the @btime macro, though. The first gives you the statistics you asked for, whereas the second is a thin wrapper around @benchmark that just prints the minimum time across all runs.
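To make the difference concrete, here is a rough sketch, assuming BenchmarkTools is installed. The array `x` and the `sum` call are made-up stand-ins, not the code that was benchmarked in this thread.

```julia
using BenchmarkTools
using Statistics  # mean, median, std

x = rand(Float32, 10_000)  # hypothetical test data, not from the thread

# @btime prints the minimum time (and allocations) and returns the value of the expression.
s = @btime sum($x)

# @benchmark returns a BenchmarkTools.Trial that keeps every sample.
trial = @benchmark sum($x)

# Summary statistics over all samples; each call returns a TrialEstimate.
println(minimum(trial))
println(median(trial))
println(mean(trial))
println(std(trial))
```

The `$x` interpolation keeps the global variable lookup out of the measurement. Showing `trial` at the REPL also prints a histogram of the sample times.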
Nevertheless, the takeaway of this thread is pretty clear even without @benchmark: the performance difference mainly stems from SIMD instructions.