
Apple M1 Macs: Not a Mature Solution for General Usage, e.g., Java Virtual Machine

re: Java for Apple M1 Macs
re: SHA-512 Hashing Speed in Java: 2019 Mac Pro, 2019 iMac 5K, 2020 iMac 5K, 2021 MacBook Pro M1 Max

For those using email, a web browser, and a few apps, all is good, so it seems.

But Apple M1 Macs are not a mature solution for some uses.

Take Java, used by IntegrityChecker Java and Zerene Stacker and certain other specialty software. While there is a native Java virtual machine for Apple ARM (M1, M1 Max, M1 Pro) machines, that does not mean it has robust support for optimal performance.

It’s not terrible; IntegrityChecker Java can still do 2.6GB/sec on an 8+2 core Apple MacBook Pro M1 Max. But that is roughly 21% slower than the 3.3GB/sec on my 2019 iMac 5K 8-core.
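For readers who want a rough number from their own machine, here is a minimal sketch of a SHA-512 throughput test using the standard `java.security.MessageDigest` API. It is not how IntegrityChecker Java measures anything: the class name, buffer sizes, and zero-filled input are assumptions made for illustration, and it runs on a single thread, so its result is not comparable to the multi-core figures above.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha512Throughput {

    /** Hashes at least totalBytes of zero-filled data in chunkSize pieces; returns MB/s (1 MB = 1e6 bytes). */
    static double measureMBPerSec(long totalBytes, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 not available in this JRE", e);
        }
        byte[] chunk = new byte[chunkSize];
        long hashed = 0;
        long start = System.nanoTime();
        while (hashed < totalBytes) {
            md.update(chunk);
            hashed += chunkSize;
        }
        byte[] digest = md.digest(); // finish the hash so the work cannot be skipped
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        if (digest.length != 64) {
            throw new IllegalStateException("unexpected SHA-512 digest length: " + digest.length);
        }
        return (hashed / 1e6) / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        int chunk = 1 << 20;                 // 1 MiB per update() call (illustrative choice)
        measureMBPerSec(256L << 20, chunk);  // warm-up pass so the JIT has compiled the hot loop
        double mbPerSec = measureMBPerSec(1L << 30, chunk);
        System.out.printf("os.arch=%s java.version=%s SHA-512: %.0f MB/s (single thread)%n",
                System.getProperty("os.arch"), System.getProperty("java.version"), mbPerSec);
    }
}
```

Running the same class on an Intel Mac and an M1 Mac with the same JDK version gives a like-for-like comparison of the JVM's SHA-512 path, with disk I/O taken out of the picture.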

Reader Adam S did some nice research on just how much native-code support there is in the ARM Java for M1 Macs.

Poor Java native-code support for math libraries for M1 Mac

openjdk % find . -name macroAssembler_arm\*.cpp
./jdk/src/hotspot/cpu/arm/macroAssembler_arm.cpp <=== single source file vs the 13 listed below for Intel

openjdk % find . -name macroAssembler_x86\*.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_log10.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_arrayCopy_avx3.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_cos.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sin.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_aes.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_adler.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_md5.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_log.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_exp.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_tan.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_pow.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86.cpp
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp
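One caveat worth checking when repeating this comparison: HotSpot keeps separate `cpu/arm` and `cpu/aarch64` directories, and the SHA512H hits further down are all under `aarch64`, so a per-directory count may be fairer than a single file-name pattern. The sketch below counts `macroAssembler_*.cpp` files in every subdirectory of the `cpu` folder. The class name and the default path (taken from the listing above) are assumptions for illustration, and file count is only a crude proxy for how much hand-tuned code exists.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

public class MacroAssemblerCount {

    /** Counts macroAssembler_*.cpp entries in each immediate subdirectory of cpuRoot. */
    static Map<String, Integer> countPerCpu(File cpuRoot) {
        Map<String, Integer> counts = new TreeMap<>();
        File[] cpuDirs = cpuRoot.listFiles(File::isDirectory);
        if (cpuDirs == null) {
            return counts; // cpuRoot is missing, unreadable, or not a directory
        }
        for (File dir : cpuDirs) {
            String[] hits = dir.list((d, name) ->
                    name.startsWith("macroAssembler_") && name.endsWith(".cpp"));
            counts.put(dir.getName(), hits == null ? 0 : hits.length);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Default path assumes the clone layout shown in the find output above.
        File cpuRoot = new File(args.length > 0 ? args[0] : "jdk/src/hotspot/cpu");
        countPerCpu(cpuRoot).forEach((cpu, n) -> System.out.println(cpu + ": " + n));
    }
}
```

Run it from the directory that contains the `jdk` clone, or pass the path to `src/hotspot/cpu` as the first argument.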

Poor Java native-code support for SHA-512 for M1 Mac

Intel CPUs have strong support for the SHA-512 hash; see Fast SHA512 Implementations on Intel® Architecture Processors.

Feels like the ARM JDK is just immature.

I grepped the OpenJDK source for the ARM SHA-512 update instruction. There are some hits, but I’m not sure what they mean yet.

openjdk % git clone https://git.openjdk.java.net/jdk/
openjdk % find . -type f -exec grep -iH SHA512H "{}" \;
./jdk/test/hotspot/gtest/aarch64/asmtest.out.h:    __ sha512h(v14, __ T2D, v3, v25);                  //       sha512h         q14, q3, v25.2D
./jdk/test/hotspot/gtest/aarch64/asmtest.out.h:    __ sha512h2(v8, __ T2D, v27, v21);                 //       sha512h2                q8, q27, v21.2D
./jdk/test/hotspot/gtest/aarch64/aarch64-asmtest.py:generate(SHA512SIMDOp, ["sha512h", "sha512h2", "sha512su0", "sha512su1"])
./jdk/src/hotspot/cpu/aarch64/assembler_aarch64.hpp:  INSN(sha512h,   0b100000);
./jdk/src/hotspot/cpu/aarch64/assembler_aarch64.hpp:  INSN(sha512h2,  0b100001);
./jdk/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp:      __ sha512h(v##i3, __ T2D, v6, v7);                                             \
./jdk/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp:      __ sha512h2(v##i3, __ T2D, v##i1, v##i0);                                      \

A wild guess is that they’re testing that you can assemble SHA512H (from C?) and that an SHA512H actually comes out. That’s different from calling the instruction in JDK SHA code, obviously. Intel SHA instructions get a lot more love:

adam@Adams-MacBook-Pro openjdk % find . -type f -exec grep -iH sha512_sse4 "{}" \;
adam@Adams-MacBook-Pro openjdk % find . -type f -exec grep -iH sha512_avx "{}" \;
./jdk/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp:    __ sha512_AVX2(msg, state0, state1, msgtmp0, msgtmp1, msgtmp2, msgtmp3, msgtmp4,
./jdk/src/hotspot/cpu/x86/stubRoutines_x86.cpp:// used in MacroAssembler::sha512_AVX2
./jdk/src/hotspot/cpu/x86/macroAssembler_x86.hpp:  void sha512_AVX2_one_round_compute(Register old_h, Register a, Register b, Register c, Register d,
./jdk/src/hotspot/cpu/x86/macroAssembler_x86.hpp:  void sha512_AVX2_one_round_and_schedule(XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
./jdk/src/hotspot/cpu/x86/macroAssembler_x86.hpp:  void sha512_AVX2(XMMRegister msg, XMMRegister state0, XMMRegister state1, XMMRegister msgtmp0,
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:void MacroAssembler::sha512_AVX2_one_round_compute(Register  old_h, Register a, Register b, Register c,
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:void MacroAssembler::sha512_AVX2_one_round_and_schedule(
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:void MacroAssembler::sha512_AVX2(XMMRegister msg, XMMRegister state0, XMMRegister state1, XMMRegister msgtmp0,
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    //Schedule 64 input dwords, by calling sha512_AVX2_one_round_and_schedule
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm4, xmm5, xmm6, xmm7, a, b, c, d, e, f, g, h, 0);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm4, xmm5, xmm6, xmm7, h, a, b, c, d, e, f, g, 1);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm4, xmm5, xmm6, xmm7, g, h, a, b, c, d, e, f, 2);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm4, xmm5, xmm6, xmm7, f, g, h, a, b, c, d, e, 3);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm5, xmm6, xmm7, xmm4, e, f, g, h, a, b, c, d, 0);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm5, xmm6, xmm7, xmm4, d, e, f, g, h, a, b, c, 1);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm5, xmm6, xmm7, xmm4, c, d, e, f, g, h, a, b, 2);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm5, xmm6, xmm7, xmm4, b, c, d, e, f, g, h, a, 3);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm6, xmm7, xmm4, xmm5, a, b, c, d, e, f, g, h, 0);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm6, xmm7, xmm4, xmm5, h, a, b, c, d, e, f, g, 1);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm6, xmm7, xmm4, xmm5, g, h, a, b, c, d, e, f, 2);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm6, xmm7, xmm4, xmm5, f, g, h, a, b, c, d, e, 3);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm7, xmm4, xmm5, xmm6, e, f, g, h, a, b, c, d, 0);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm7, xmm4, xmm5, xmm6, d, e, f, g, h, a, b, c, 1);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm7, xmm4, xmm5, xmm6, c, d, e, f, g, h, a, b, 2);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_and_schedule(xmm7, xmm4, xmm5, xmm6, b, c, d, e, f, g, h, a, 3);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(a, a, b, c, d, e, f, g, h, 0);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(h, h, a, b, c, d, e, f, g, 1);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(g, g, h, a, b, c, d, e, f, 2);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(f, f, g, h, a, b, c, d, e, 3);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(e, e, f, g, h, a, b, c, d, 0);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(d, d, e, f, g, h, a, b, c, 1);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(c, c, d, e, f, g, h, a, b, 2);
./jdk/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp:    sha512_AVX2_one_round_compute(b, b, c, d, e, f, g, h, a, 3);
adam@Adams-MacBook-Pro openjdk % find . -type f -exec grep -iH sha512_avx2_rorx "{}" \;
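Grepping the source shows what code exists, not what a given JVM build actually turns on. As a complementary check, the sketch below asks the running JVM for its SHA-related flags through `com.sun.management.HotSpotDiagnosticMXBean`. It assumes a HotSpot-based JDK; the flag names are the standard HotSpot `-XX` options, the class name and the `"unavailable"` fallback are illustrative choices, and whether a given build exposes each flag this way is something to verify rather than take for granted.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class ShaIntrinsicFlags {

    /** Returns the flag's current value, or "unavailable" if this JVM does not expose it. */
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // getVMOption throws IllegalArgumentException when the flag does not exist
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("os.arch = " + System.getProperty("os.arch"));
        System.out.println("java.vm.version = " + System.getProperty("java.vm.version"));
        String[] flags = {
            "UseSHA", "UseSHA1Intrinsics", "UseSHA256Intrinsics", "UseSHA512Intrinsics"
        };
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

If `UseSHA512Intrinsics` reports `false` or `unavailable` on an M1 Mac while the same JDK version reports `true` on an Intel Mac, that would be consistent with the throughput gap described above; it does not by itself prove the cause.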

diglloyd.com | Terms of Use | PRIVACY POLICY
Contact | About Lloyd Chambers | Consulting | Photo Tours
Mailing Lists | RSS Feeds | X.com/diglloyd
Copyright © 2020 diglloyd Inc, all rights reserved.