Good to see this bug squelched. And a wonder that it existed (in macOS) for 4-5 years.
Apple Silicon Mac users using SoftRAID should upgrade to macOS Ventura 13.4*.
* I am loath to move from macOS Monterey to macOS Ventura on my 2021 MacBook Pro M1 Max given capricious user interface changes and broken features in Terminal.app. Happily, I have no need for SoftRAID on that machine.
2023-05-18. By Tim Standing, lead SoftRAID developer.
Today, Apple released macOS 13.4 and along with it a long-awaited fix for an issue that some SoftRAID users have been experiencing with RAID 5 volumes. Here are the details.
Apple has changed the CPU used in Mac computers many times in the past 30 years: first from Motorola to PowerPC, then from PowerPC to Intel, and most recently from Intel to Apple silicon.
Each time, the macOS development team has done an incredible job of hiding the fact that macOS software, your applications, and all of your files are running on a completely different type of processor. With each of these changes, not only are the instructions used by the CPUs totally different, but the mechanism used to access main memory has changed as well.
Sometimes, the change in CPU type also introduces bugs. This was the case with the transition to Macs with Apple silicon.
This transition introduced a bug which very occasionally causes a kernel panic when reading or writing files to SoftRAID RAID 5 volumes on Macs with Apple silicon. Reading or writing the same file to the same volume on an Intel Mac never fails.
This bug happens so infrequently that it took our testers more than a year to figure out a way of reproducing it. Most SoftRAID users on Macs with Apple silicon never see this problem.
We determined that this bug was in the code which transfers data to and from disks when accessing files on SoftRAID volumes. We then notified Apple engineers of the bug and helped them investigate it. After many, many months of working with Apple, there is finally a fix which has been introduced with the release of macOS 13.4.
With the release of macOS 13.4, Macs with Apple silicon are as reliable as Intel Macs when accessing SoftRAID RAID 5 volumes. So, if you're a SoftRAID user who depends on a RAID 5 configuration, go ahead and update to macOS 13.4 for the best experience.
SoftRAID 7.5 was released earlier this year with a host of improvements, including much easier setup. SoftRAID gives you increased power and control over your RAID drives and disks that hardware RAID can’t deliver, and it doesn’t lock you into technology that you can’t upgrade or expand. If you want more information on the latest release of SoftRAID, click here.
We want to thank Apple for their cooperation in getting this bug sorted and, as always, thank you to all our awesome SoftRAID users for your patience.
MPG: imagine the consternation and frustration of the SoftRAID developers trying to reproduce, let alone diagnose and fix, such an elusive problem, particularly when the issue lies in the most remote areas of the OS. Like trying to detect neutrinos in the neighborhood swimming pool.
Certain types of software bugs can be devilishly difficult to track down. Hence the “works for me” non-sequitur of the uninformed. Now if only a trend would develop... my 2019 Mac Pro tends to kernel panic twice a week, sometimes more (which it never used to do), all part of the “upgrades” of recent years. The odds do not seem good for Apple being on the ball enough to fix that issue on an architecture now abandoned.
My guess, given that it is a generic problem and not confined to SoftRAID, is that this problem is/was in macOS as part of the DART system, since the macOS update fixes it.
...The DART is really a memory management unit specifically for I/O (i.e. peripherals) - and it has to cooperate with the main system memory management unit in order to, well, manage memory. The purpose of an I/O memory management unit is to facilitate access to main memory for peripherals...
On ARM systems, the stream ID is basically a number that makes it possible for the IOMMU (DART) to make lookups in the stream table to find the mappings and configuration relevant for a specific peripheral device.
In simpler terms this means that somehow a read from memory initiated by a peripheral device related to the display controller triggered an unexpected error. That could be a bug in the operating system, or more likely it would be a driver bug where some page table configuration made with the IOMMU is incorrect. It could also be a problem with a hub or display that does "something" out of spec that wasn't anticipated by Apple - although I would classify it as a macOS bug that the system crashes like this.
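To make the stream-ID lookup and the resulting fault a bit more concrete, here is a toy model in Python. It is only a sketch of the general IOMMU idea described above, not Apple's DART implementation: the class `ToyIOMMU`, the exception `IOMMUFault`, the method names, and the 16 KiB page size are all assumptions made for illustration.

```python
# Toy model of an I/O memory management unit (IOMMU).
# All names and the page size are hypothetical; real hardware such as
# Apple's DART uses multi-level tables and hardware-defined formats.

PAGE_SIZE = 16 * 1024  # assumed page size for this sketch


class IOMMUFault(Exception):
    """Raised when a device access has no valid translation."""


class ToyIOMMU:
    def __init__(self):
        # Stream table: stream ID -> page table for that peripheral.
        # Each page table maps a device page number to a physical page number.
        self.stream_table = {}

    def map_page(self, stream_id, device_page, physical_page):
        """Install one mapping (the job a driver or the OS would do)."""
        self.stream_table.setdefault(stream_id, {})[device_page] = physical_page

    def translate(self, stream_id, device_addr):
        """Translate a device-visible address into a physical address."""
        page_table = self.stream_table.get(stream_id)
        if page_table is None:
            raise IOMMUFault(f"unknown stream ID {stream_id}")

        device_page, offset = divmod(device_addr, PAGE_SIZE)
        physical_page = page_table.get(device_page)
        if physical_page is None:
            # A missing or wrong mapping is the kind of misconfiguration
            # that shows up as an unexpected error on a device read.
            raise IOMMUFault(
                f"stream {stream_id}: no mapping for device page {device_page}"
            )
        return physical_page * PAGE_SIZE + offset


iommu = ToyIOMMU()
iommu.map_page(stream_id=7, device_page=2, physical_page=40)

# A read inside the mapped page translates cleanly.
good = iommu.translate(7, 2 * PAGE_SIZE + 0x10)

# A read one page further has no mapping and faults.
try:
    iommu.translate(7, 3 * PAGE_SIZE)
    faulted = False
except IOMMUFault:
    faulted = True
```

In this sketch the fault is an ordinary exception that the caller can handle. How macOS actually reacts to such a fault is not shown here; the excerpt above only says that it ends in a system crash.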