diglloydTools IntegrityChecker: Major Release Coming Soon
See data integrity and bit rot, including the recent post "Data Loss Prevented: IntegrityChecker Saves my Bacon by Detecting Corrupted Files after a Clone".
IntegrityChecker java runs on any platform with a JVM. I’m looking for Windows, Linux, and NAS users to test IntegrityChecker java. Contact Lloyd.
A major update to diglloydTools IntegrityChecker is coming soon. It includes many nice changes, but the key things are noted below.
New feature: folder hash hierarchies
Folder hash hierarchies record all hash values for an entire folder hierarchy or volume in a single ".icjh" file. These can be applied to any folder, can be “nested”, and are intelligently updated. All top level folders specified automatically get their own hierarchy file.
Hierarchy files can also be used in combination with traditional every-folder “.icj” files. This may be useful when rearranging lots of folders, in that the ".icj" file travels with every folder. But for most backup and data verification purposes, the ".icj" files are superfluous.
For folders like my Apple Mail folder, with 113784 folders and growing, that’s one (1) ".icjh" file instead of 113784 ".icj" files. This eliminates the need to read/write 113784 ".icj" files, which is also a bonus for minimizing backup activity. And there is absolutely no value in having those files in each folder, since the mail folder is never rearranged and is always backed up in its entirety.
Hash data for 113784 folders containing 387296 files loaded in 12109 ms.
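To make the single-file idea concrete, here is a minimal sketch in Python (not the tool's own Java). It walks a folder tree and records every file hash in one file at the top of the tree. The JSON layout, the "hashes.json" name, and all function names are assumptions for illustration; this is not the actual ".icjh" format.

```python
import hashlib
import json
import os
import tempfile


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_hierarchy(root, skip_name="hashes.json"):
    """Walk root and return {relative_path: digest} for every file.

    Files named skip_name are ignored so the record never hashes itself.
    """
    records = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name == skip_name:
                continue
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            records[rel] = hash_file(full)
    return records


def write_hierarchy(root, out_name="hashes.json"):
    """Store all hashes for the tree in ONE file at the top of root.

    The JSON layout is hypothetical, not the real ".icjh" format.
    """
    records = build_hierarchy(root, skip_name=out_name)
    out_path = os.path.join(root, out_name)
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=1, sort_keys=True)
    return out_path


def demo():
    """Build a small nested tree, write one hierarchy file, read it back."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "mail", "inbox"))
        with open(os.path.join(root, "mail", "inbox", "1.eml"), "wb") as f:
            f.write(b"hello")
        with open(os.path.join(root, "top.txt"), "wb") as f:
            f.write(b"world")
        out_path = write_hierarchy(root)
        with open(out_path, encoding="utf-8") as f:
            loaded = json.load(f)
        # A second run must not pick up the hierarchy file itself.
        write_hierarchy(root)
        return loaded, sorted(os.listdir(root))
```

The point of the design is visible in the demo: the nested "mail/inbox" folder gets no hash file of its own, and the only extra file anywhere in the tree is the one at the top.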
Hierarchy files also mean that one can use icj on ".git" hierarchies without git complaining about unwanted files in its 'objects' and 'pack' folders. Ditto for any program which dislikes unknown files in its folder hierarchy, library, etc.
A folder hash hierarchy also makes it possible to hash read-only media and store that hash file elsewhere. This can be implemented if there is enough interest.
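As a rough sketch of how verifying read-only media against an externally stored hash record could work (Python, with hypothetical names throughout; the feature is not implemented in the tool and this is not its design): the record is just a mapping from relative path to hash, so it can live on any other volume.

```python
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=1 << 20):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(root, records):
    """Compare files under root with records ({relative_path: sha512 hex}).

    Nothing is written under root, so root may be read-only media.
    Returns (ok, changed, missing) as sorted lists of relative paths.
    """
    ok, changed, missing = [], [], []
    for rel in sorted(records):
        full = os.path.join(root, *rel.split("/"))
        if not os.path.isfile(full):
            missing.append(rel)
        elif hash_file(full) == records[rel]:
            ok.append(rel)
        else:
            changed.append(rel)
    return ok, changed, missing


def demo():
    """Simulate media whose record is held elsewhere (here: in memory)."""
    with tempfile.TemporaryDirectory() as media:
        os.makedirs(os.path.join(media, "sub"))
        files = {"a.raw": b"AAAA", "sub/b.raw": b"BBBB", "sub/c.raw": b"CCCC"}
        for rel, data in files.items():
            with open(os.path.join(media, *rel.split("/")), "wb") as f:
                f.write(data)
        # The record stands in for a hierarchy file kept on another volume.
        records = {rel: hashlib.sha512(d).hexdigest() for rel, d in files.items()}
        # Simulate damage: one corrupted file, one lost file.
        with open(os.path.join(media, "sub", "b.raw"), "wb") as f:
            f.write(b"BBBX")
        os.remove(os.path.join(media, "sub", "c.raw"))
        return verify_tree(media, records)
```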
New feature: in-place update but with full compatibility
IntegrityChecker java maintains full compatibility with prior hash formats, including the native version ".ic" format (SHA1). Backward 'verify' compatibility is essential, since many pros burn their work to read-only media, which cannot be updated or changed.
A revised hash format (chained SHA512) is introduced. The update command automatically verifies the existing hash and computes the new and faster SHA512 hash, so updating is seamless and results in the best possible speed once updated.
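The verify-then-upgrade step can be sketched as a single pass over the file that feeds both hash functions at once. This is a minimal Python illustration, not the tool's code: the function name is hypothetical, and it uses a plain SHA-512 digest because the details of the "chained" format are not described here.

```python
import hashlib
import os
import tempfile


def verify_and_upgrade(path, stored_sha1_hex, chunk_size=1 << 20):
    """Read the file once, updating SHA-1 and SHA-512 together.

    Returns the new SHA-512 hex digest if the stored SHA-1 still matches.
    Raises ValueError if it does not, so a damaged file never receives a
    fresh hash that would hide the damage.
    """
    old = hashlib.sha1()
    new = hashlib.sha512()  # plain SHA-512; "chained" details are not modeled
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            old.update(chunk)
            new.update(chunk)
    if old.hexdigest() != stored_sha1_hex.lower():
        raise ValueError("stored SHA-1 does not match: " + path)
    return new.hexdigest()


def demo():
    """Upgrade a good file, then show that a damaged file is refused."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "photo.raw")
        data = b"raw image bytes" * 1000
        with open(path, "wb") as f:
            f.write(data)
        stored_sha1 = hashlib.sha1(data).hexdigest()
        upgraded = verify_and_upgrade(path, stored_sha1)
        # Simulate bit rot by overwriting one byte in place.
        with open(path, "r+b") as f:
            f.seek(10)
            f.write(b"\x00")
        try:
            verify_and_upgrade(path, stored_sha1)
            detected = False
        except ValueError:
            detected = True
        return upgraded, detected
```

Reading the data only once matters for large files: the cost of the upgrade is roughly one verify pass, not a verify pass plus a separate rehash.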
Performance
Performance is terrific: shown below is icj on a 2.5 GHz 28-core Mac Pro hashing at 12.5 GB/sec (12500 MB/sec). That’s about 446 MB/sec per real core, with each core running at a relatively slow 2.5 GHz.
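For anyone wanting to reproduce this kind of measurement, here is a small Python sketch that checks the per-core arithmetic and times multi-threaded SHA-512 hashing of in-memory buffers. The function names are made up for illustration and this is not how icj measures itself. It relies on CPython's hashlib releasing the GIL while hashing large buffers, so threads can use several cores.

```python
import hashlib
import os
import time
from concurrent.futures import ThreadPoolExecutor


def per_core_rate(total_mb_per_sec, cores):
    """Aggregate throughput divided evenly across cores."""
    return total_mb_per_sec / cores


def sha512_hex(data):
    """SHA-512 hex digest of one in-memory buffer."""
    return hashlib.sha512(data).hexdigest()


def measure(buffers, workers):
    """Hash every buffer on a pool of worker threads.

    Returns (digests in input order, throughput in MB/sec, 1 MB = 10**6 bytes).
    """
    total_mb = sum(len(b) for b in buffers) / 1e6
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(sha512_hex, buffers))
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return digests, total_mb / elapsed


def demo():
    """Time 8 random 4 MiB buffers using all available cores."""
    buffers = [os.urandom(4 << 20) for _ in range(8)]
    digests, rate = measure(buffers, workers=os.cpu_count() or 1)
    return buffers, digests, rate
```

With the figures quoted above, `per_core_rate(12500, 28)` comes to roughly 446 MB/sec. Numbers from `measure` depend entirely on the machine and exclude disk I/O, so they are not comparable to the icj figure.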
File globbing
Not yet implemented but planned soon: file globbing, which will allow precise control of which files are hashed and which are not, folders to be included or excluded, etc.
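Since globbing is not implemented yet, the following is only a guess at what include/exclude rules might look like, sketched in Python with the standard fnmatch module. The function name, parameters, and the rule that excludes win over includes are all assumptions, not the planned design.

```python
import fnmatch


def is_selected(rel_path, include=("*",), exclude=()):
    """Decide whether a file should be hashed.

    rel_path uses "/" separators. A path is selected when it matches at
    least one include pattern and no exclude pattern (excludes win).
    Note that in fnmatch, "*" also matches "/" characters, so "*.jpg"
    matches files at any depth.
    """
    if any(fnmatch.fnmatchcase(rel_path, pattern) for pattern in exclude):
        return False
    return any(fnmatch.fnmatchcase(rel_path, pattern) for pattern in include)


def select(paths, include=("*",), exclude=()):
    """Filter a list of relative paths, preserving order."""
    return [p for p in paths if is_selected(p, include, exclude)]
```

For example, `select(paths, include=("*.jpg", "*.raw"), exclude=("*/cache/*",))` would keep image files at any depth while skipping anything under a "cache" folder.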
