Shell script to verify tar archives
I've got thousands of tar archives being transferred from various machines to one repository. Some of the archives turned out to be bad, either because the network transfer didn't complete or because the archive process was interrupted on the machine that created the archive (insufficient disk space, etc.). Whatever the cause, I needed a method to verify archive integrity.
Tar provides options to verify an archive by comparing it with the file system. In this case, I needed to verify the archive without access to the original file system.
The Bash one-liner below finds all archives under the current directory and loops through them, attempting to list each one's contents. The content listing is discarded; tar's exit status is captured and appended, along with the archive name, to a log file. An exit status of 0 means the archive is good, 2 means it's bad (haven't seen a 1). Redirecting output to /dev/null while capturing the exit status turned out trickier than I thought, but this seems to work well:
find . -name "*tar.bz2" -print0 | while IFS= read -r -d '' f; do tar tfj "$f" &> /dev/null; err="$?"; echo "$err $f" >> tar-check.list; done
cat tar-check.list
2 ./755806.tar.bz2
2 ./708955.tar.bz2
0 ./313854.tar.bz2
0 ./313857.tar.bz2
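With thousands of archives, the log gets long, so a small filter to pull out just the failures may help. This is only a sketch: the function name list_bad_archives is made up for illustration, and it assumes every log line has the "status path" layout shown above.

```shell
#!/usr/bin/env bash
# list_bad_archives [LOG]  (hypothetical helper, not part of the one-liner)
# Prints the path of every archive whose recorded exit status is non-zero.
# Assumes each log line is "<status> <path>"; LOG defaults to tar-check.list.
list_bad_archives() {
    # Drop the leading status field and print the rest of the line,
    # so paths that contain spaces come through intact.
    awk '$1 != 0 { sub(/^[^ ]+ /, ""); print }' "${1:-tar-check.list}"
}
```

Run with no arguments in the directory that holds tar-check.list, it prints one bad archive per line, which is handy for feeding a re-transfer job.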
BTW, the specific error I've gotten, which the above method will identify, is:
bzip2: Compressed file ends unexpectedly;
        perhaps it is corrupted?  *Possible* reason follows.
bzip2: Inappropriate ioctl for device
        Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
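The one-liner throws that message away along with the content listing. If you'd rather keep it next to the exit status, something like the sketch below might do. It is not what I originally ran: check_archives and its DIR, LOG, and PATTERN arguments are names invented for this example, and it assumes a tar that detects compression on its own when given plain -tf (GNU tar and bsdtar do; otherwise swap in -tjf for bzip2 archives).

```shell
#!/usr/bin/env bash
# check_archives DIR LOG [PATTERN]  (hypothetical helper)
# Lists every archive matching PATTERN under DIR and appends one line per
# archive to LOG: "<exit status> <path> <tar's error text, flattened>".
# Returns 1 if any archive failed to list, 0 otherwise.
check_archives() {
    local dir=$1 log=$2 pattern=${3:-*.tar.bz2}
    local f msg status failed=0
    while IFS= read -r -d '' f; do
        # Send the content listing (stdout) to /dev/null but capture stderr.
        # The order of the two redirections matters here.
        msg=$(tar -tf "$f" 2>&1 >/dev/null)
        status=$?
        # Flatten the message so each archive stays on a single log line.
        msg=${msg//$'\n'/ }
        printf '%s %s %s\n' "$status" "$f" "$msg" >> "$log"
        [ "$status" -eq 0 ] || failed=1
    done < <(find "$dir" -type f -name "$pattern" -print0)
    return "$failed"
}
```

For example, check_archives . tar-check.list would cover the same archives as the one-liner. Feeding the loop from process substitution rather than a pipe keeps the failed flag in the current shell, so the function's return value can be used in a cron job or a larger script.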