Released today, diglloydTools 2.1.14 has very minor changes. See the diglloydTools release notes page and download page.
Update August 9: icj version 1.0 beta 2 is now posted.
Until now, IntegrityChecker has been available only for macOS. Now there is a (beta) cross-platform, Java-based command line version of IntegrityChecker that can run on macOS, Windows, Linux, etc. (anywhere that Java can run). Key goals:
- Enable cross-platform operation: macOS, Windows, Linux, etc.
- Enhance long-term viability across OS updates (Apple regularly breaks/deprecates APIs and introduces new file system bugs).
- Eliminate file system dependencies; enhance viability across widely differing file systems having different metadata support.
- Maintain outstanding performance: achieves as high as 1200 MB/sec on a 4-core iMac 5K with all CPU cores maxed out.
- Command line (Terminal) only. GUI support is not planned at this time.
The command line name is 'icj'. In the initial release, icj has been tested only on macOS 10.11.6 using Java SE runtime 1.8. No testing has yet been done on Windows or Linux. However, some care has been taken to avoid common issues:
- icj makes no assumptions about file system capabilities.
- icj makes no assumptions about forward/backward slashes in path names.
- icj makes no assumptions about case sensitivity; file names are used as-is.
- icj uses GMT-0 in plain text form to store the file modification date.
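The points above about path separators and GMT timestamps can be illustrated with standard Java APIs. This is a minimal sketch, not code from icj: the class name, the method names, and the timestamp layout are all assumptions made for illustration, and the plain-text format icj actually stores is not documented here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not taken from the icj source.
class PortableMetadata {
    // Assumed plain-text layout, always rendered in UTC (GMT+0) so the stored
    // text does not depend on the time zone of the machine that wrote it.
    private static final DateTimeFormatter UTC_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Formats a modification time given as milliseconds since the epoch.
    public static String formatModificationTime(long epochMillis) {
        return UTC_FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }

    // Builds a child path without hard-coding '/' or '\'; the platform's
    // file system provider supplies the separator. The name is used as-is.
    public static Path childOf(Path folder, String fileName) {
        return folder.resolve(fileName);
    }

    public static void main(String[] args) {
        System.out.println(formatModificationTime(0L));
        System.out.println(childOf(Paths.get("Archive"), "2016-0304-TripPhotos"));
    }
}
```

Storing the date as UTC text means the same record can be compared on any platform without knowing where it was written.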
Existing workflows with the native version of ic are unaffected. Thus icj can be used as an adjunct or a replacement to ic. See Cross-platform Integrity Checker (Java version) — icj and the IntegrityChecker user manual.
diglloydMP:diglloydUtil lloyd$ icj verify /Volumes/Archive/2016-*
# icj version 1.0 beta 2 @ 2016-08-04 17:00
# Copyright 2016 DIGLLOYD INC. All Rights Reserved
# Use of this software requires a license. http://macperformanceguide.com/Software-License.html
# Thu Aug 04 16:26:01 PDT 2016
/Volumes/Archive/2016-0222-VideosForZeissFocusingArticle, /Volumes/Archive/2016-0304-TripPhotos, /Volumes/Archive/2016-0320-TripPhotos-CarrizoPlain,
/Volumes/Archive/2016-0628-PentaxK1-Yosemite, /Volumes/Archive/2016-0628-iPhonePanos-postProcessed, /Volumes/Archive/2016-0802-SigmaQuattro-backyard}
561 ms to scan/read 453 folders
57 ms to count files and sizes
Hashing 7689 files totaling 398 GiB in 453 folders.
0%: 60 files @ 1193 KiB/sec, 2349 MiB
1%: 167 files @ 1525 KiB/sec, 5992 MiB
2%: 236 files @ 1646 KiB/sec, 9741 MiB
97%: 7644 files @ 704 KiB/sec, 388 GiB
97%: 7654 files @ 703 KiB/sec, 389 GiB
98%: 7664 files @ 703 KiB/sec, 390 GiB
98%: 7670 files @ 703 KiB/sec, 392 GiB
99%: 7675 files @ 704 KiB/sec, 394 GiB
99%: 7681 files @ 704 KiB/sec, 396 GiB
99%: 7686 files @ 704 KiB/sec, 397 GiB
100%: 7687 files @ 704 KiB/sec, 398 GiB
FILE STATUS SUMMARY for 453 folders 2016-08-04 16:35:53
# With hash: 7689
# Without hash: 0
# Missing : 0
# Hashed: 7687
# Changed size: 0
# Changed date: 0
# Changed content + date, size unchanged: 0
# Total changed content: 0
# SUSPICIOUS: 0
icj done at Thu Aug 04 16:35:53 PDT 2016
A few examples of the capabilities in diglloydTools
Aside from testing hard drive, SSD, or RAID performance and reliability with DiskTester, verifying data integrity with IntegrityChecker is a must-have workflow step for anyone with important data.
David C writes:
A few curiosity questions:
Were you expert in Java before or was this your initiation?
How difficult did you find writing the Java version given your background writing the c++ version? yes, I know this is subjective, but you had to have solved a lot of thorny issues in the original code, hence what I’m really asking is something vague like “how difficult did you find coding in java compared to c++?”.
If you used libraries in the c++ version, e.g. the STL or IPP, did you find it difficult to find/write equivalent java functionality?
Leaving the possible portability advantages aside did you enjoy writing the Java version (i.e. it is not the end result I’m asking about but the process of creating it)?
If you wouldn’t mind divulging it, approximately how many kloc in the icj source (your code, not the java libraries etc etc)? yes I know that kloc is a crude measure, but there certainly is a vast difference between 10kloc and 100kloc when it comes to maintaining it unless one is in the habit of leaving thousands of empty lines strewn about; the size gives me some idea what the difficulty comparison means.
What editor do you use when writing code (idle curiosity)?
DIGLLOYD: I wrote C++ professionally for many years, then multithreaded Java server code professionally for about 12 years. I’m a bit rusty with C++ now, and the STL arrived well into my professional career, but I do use it. I use TextWrangler for editing my code. Simple and fast.
I vastly prefer working with Java. It’s just less hassle and complexity than C++. Also, C++ is a nightmare for portability compared to Java once all the dev environments, libraries, and inconsistencies in file systems are dealt with. The same API for all platforms is a huge win for Java.
Java multithreading support and garbage collection make a lot of stuff easier. With C++, I use a lot of stack-based wrapper classes to guarantee things like deleting pointers or closing files; once done it’s done, but it is extra code and complexity. The Java code still requires some smarts to avoid excessive memory allocation (which causes downstream hits from garbage collection), but C++ IntegrityChecker requires sophisticated memory tracking and buffer management that Java makes a lot easier in some places. The C++ version could probably utilize a 32-core machine with excellent scalability, but I have no such beast to test it on. The Java version is algorithmically mostly the same, but has some room for improvement. It looks, however, like there are bottlenecks that may be harder to work around (the I/O system in particular).
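To make the buffer re-use point concrete, here is a minimal sketch, assuming a hasher object that allocates one read buffer up front and reuses it for every file. It is not the icj implementation: the class name is hypothetical, and SHA-256 is chosen only for illustration, since the hash algorithm is not stated here. The try-with-resources statement plays roughly the role of the stack-based C++ wrappers mentioned above, guaranteeing that the file is closed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class for illustration; not taken from the icj source.
// Not thread safe: use one instance per hashing thread.
class ReusableBufferHasher {
    // Allocated once and reused for every file, so hashing thousands of
    // files does not create thousands of short-lived buffers for the GC.
    private final byte[] buffer = new byte[1 << 20];
    private final MessageDigest digest;

    public ReusableBufferHasher() {
        try {
            // SHA-256 is an assumption made for this sketch.
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hashes a file; try-with-resources closes the stream even on failure.
    public String hash(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            return hash(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hashes everything remaining in the stream and returns lowercase hex.
    public String hash(InputStream in) {
        digest.reset();
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

Because the buffer and digest are fields, an instance must not be shared between threads; a multithreaded design would give each worker its own hasher.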
What I like about Java: a rich set of stuff with many flavors of Set, Map, List, arrays, hash tables, etc. (and with thread-safe variants too). While C++ and the STL have many equivalents, Java is less error-prone. Also, multithreading ease of use in Java is high: things like Executor thread pools and a variety of synchronization classes and facilities, plus garbage collection. Garbage collection often simplifies things a great deal, though it’s a bad habit to substitute that for intelligent re-use of things like buffers.
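As a rough illustration of the Executor thread pool facility mentioned above, the following sketch hashes several items in parallel on a fixed-size pool. It is not how icj is written: the class and method names are hypothetical, SHA-256 is an assumption, and in-memory byte arrays stand in for file contents to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class for illustration; not taken from the icj source.
class ParallelHasher {
    // Hashes one item; SHA-256 is an assumption made for this sketch.
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Submits one task per item to a fixed-size pool, then collects the
    // results in the same order as the input map.
    public static Map<String, String> hashAll(Map<String, byte[]> items, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<String>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, byte[]> e : items.entrySet()) {
                byte[] data = e.getValue();
                Callable<String> task = () -> sha256Hex(data);
                pending.put(e.getKey(), pool.submit(task));
            }
            Map<String, String> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> e : pending.entrySet()) {
                results.put(e.getKey(), e.getValue().get()); // blocks until done
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, byte[]> items = new LinkedHashMap<>();
        items.put("a.txt", "abc".getBytes(StandardCharsets.US_ASCII));
        items.put("empty.txt", new byte[0]);
        int cores = Runtime.getRuntime().availableProcessors();
        hashAll(items, cores).forEach((name, hash) -> System.out.println(name + "  " + hash));
    }
}
```

Each task creates its own MessageDigest, so no synchronization is needed between workers; the pool size is typically matched to the number of CPU cores.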
The Java version is tiny compared to the C++ version, as can be seen from the binary size. I'm not sure about lines of code, but it is probably 1/8 the lines in Java because I had to build a lot more utility routines in C++ and because threading is more difficult. Also, the C++ version is more sophisticated in its asynchronous and threaded implementation when reading files (uses dedicated open/close thread plus dedicated read thread with dual I/Os in progress at all times).
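The idea of a dedicated read thread that keeps I/O in flight while hashing proceeds can be sketched in Java with two buffers passed between threads. This is a simplified double-buffering illustration under stated assumptions, not a port of the C++ design: it has no separate open/close thread, the class name is hypothetical, and SHA-256 is assumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class for illustration; not taken from the icj or ic source.
class DoubleBufferedHasher {
    private static final class Chunk {
        final byte[] data;
        int length; // -1 marks end of stream (or a read failure)
        Chunk(int size) { data = new byte[size]; }
    }

    // Hashes the stream; a reader thread fills one buffer while the calling
    // thread hashes the other. bufferSize must be greater than zero.
    public static String hash(InputStream in, int bufferSize) {
        BlockingQueue<Chunk> free = new ArrayBlockingQueue<>(2);
        BlockingQueue<Chunk> filled = new ArrayBlockingQueue<>(2);
        free.add(new Chunk(bufferSize));
        free.add(new Chunk(bufferSize));
        AtomicReference<IOException> failure = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            try {
                while (true) {
                    Chunk c = free.take();
                    try {
                        c.length = in.read(c.data);
                    } catch (IOException e) {
                        failure.set(e);
                        c.length = -1; // wake the hashing thread so it can report the error
                    }
                    filled.put(c);
                    if (c.length < 0) {
                        return;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");
        reader.setDaemon(true);

        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256"); // assumed algorithm
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        reader.start();
        try {
            while (true) {
                Chunk c = filled.take();
                if (c.length < 0) {
                    break;
                }
                md.update(c.data, 0, c.length);
                free.put(c); // hand the buffer back for the next read
            }
        } catch (InterruptedException e) {
            reader.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (failure.get() != null) {
            throw new UncheckedIOException(failure.get());
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

With only two buffers in circulation, memory use stays fixed regardless of file size, and the blocking queues provide the hand-off and memory visibility between the two threads.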