Scrubbing your HDD/SSD data to mitigate bitrot (without 3rd party software)

Bitrot is a real threat regardless of the storage medium in use. Some file systems (e.g. ReFS, ZFS) implement automatic measures to protect against it, but they might not be readily available, or setting them up could demand more technical knowledge or work than one can afford.

Besides, self-healing or resilient file systems aren’t practical for certain types of storage media, such as USB drives (containing flash banks or HDDs).

For unprotected file systems, one can only cross one’s fingers and hope that the data keeps its integrity. However, there are applications that do a raw data read to help the drive detect and correct marginal read errors before they become permanent. The working principle is pretty simple: the program just asks for the contents of each file, and the drive’s own error-correction mechanisms take over whenever a data block has correctable errors.

I personally prefer to do this kind of check on my portable drives without using any exotic tools. On Windows there is a quick and easy way, although the performance is not great: I open a command prompt with administrator privileges and issue the following commands (here for scrubbing drive D:). They make the system read all files in the current directory and its sub-directories recursively and send the output of each read to the NUL special device, so nothing is shown on the screen other than the file names:

CD /D D:\
FOR /r %i IN (*) DO TYPE "%i" > NUL
[Image: drive read activity]

On Linux I use the dd command to read the whole drive or partition.
Assuming the partition is /dev/sdb2 (you can use lsblk to list all block devices and their partitions), this is the command, which normally has to be run as root:

dd if=/dev/sdb2 of=/dev/null status=progress bs=16K conv=noerror

There must be a better way on Linux to check only the files rather than the whole partition, but I haven’t needed it often enough to look into it, so I’m leaving it here.
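One possible way to read only the files, on Linux or Windows alike, is a short script that walks a directory tree and reads every file to the end while throwing the data away. The following is a minimal sketch in Python, not a tested scrubbing tool: the function name scrub_tree and the 1 MiB chunk size are assumptions made for illustration. Keep in mind that files read recently may be served from the operating system’s cache instead of the drive itself.

```python
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def scrub_tree(root):
    """Read every regular file under root and discard the data.

    Returns (files_read, bytes_read, failures), where failures is a list
    of (path, error message) pairs for files that could not be read.
    """
    files_read = 0
    bytes_read = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links and special files such as pipes or sockets.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(CHUNK_SIZE)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
                files_read += 1
            except OSError as exc:
                # Record the failure and carry on with the remaining files.
                failures.append((path, str(exc)))
    return files_read, bytes_read, failures
```

Calling scrub_tree on the mount point of the drive (for example a hypothetical /mnt/usb) returns the counters and the list of unreadable files, which can then be printed or logged.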



An article appeared recently explaining how an attack called “typosquatting” works. This is not a new concept, but the implementation is interesting.

The way this works is that an attacker crafts a piece of malware, and the malware itself (or code that downloads and activates it) is then implanted in an innocent-looking code library.
The trick is to make a copy of a commonly used component or code library and give the copy a name that differs from the original only by a slight spelling variation, the kind that results from a common typo. The malware is included in the fake copy either in plain form or obfuscated to make it harder to detect.

A well-intentioned developer writing code that uses the library writes an include statement that mistakenly references the malware-infused copy, and that’s it. The bad library is meant to function just like the original one, with some added (undesired) functionality.

I could imagine this happening in a different way, given the current broad use of open source components and the complexity of their dependencies.
E.g. in the context of an open source project, the maintainers rely on other open source projects, and they trust the maintainers of those projects to provide well-maintained code. The second-level maintainers also depend on other projects, and the chain goes on. Many, if not all, of these open source projects rely on community support and code contributions. Eventually someone somewhere will go rogue and intentionally push an update that references IibraryO instead of library0. The project maintainers might overlook it because the two names look similar, but the change will cascade over tens of dependent projects, causing them all to become vulnerable on the next compilation.
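A look-alike name such as IibraryO can be caught mechanically before a human has to spot it. The sketch below is only an illustration under stated assumptions: the helper names skeleton and find_lookalikes, the tiny table of confusable characters and the 0.85 similarity threshold are all made up for this example, and a real check would need a far more complete table. It folds easily confused characters into one form and then compares names with Python’s standard difflib module.

```python
import difflib

# Characters that are easy to confuse on screen, folded to one canonical form.
# This table is a small illustrative assumption, not a complete list.
CONFUSABLES = str.maketrans({"0": "o", "O": "o", "1": "l", "I": "l"})


def skeleton(name):
    """Fold confusable characters first, then lowercase the result."""
    return name.translate(CONFUSABLES).lower()


def find_lookalikes(candidate, trusted_names, threshold=0.85):
    """Return the trusted names that candidate resembles without matching.

    An exact match with a trusted name is not suspicious, so it yields [].
    """
    if candidate in trusted_names:
        return []
    hits = []
    for trusted in trusted_names:
        matcher = difflib.SequenceMatcher(
            None, skeleton(candidate), skeleton(trusted)
        )
        if matcher.ratio() >= threshold:
            hits.append(trusted)
    return hits


trusted = ["library0", "requests"]
print(find_lookalikes("IibraryO", trusted))  # prints ['library0']
print(find_lookalikes("library0", trusted))  # prints []
```

A check like this could run in code review or in a build step over every newly added dependency name, flagging any name that is close to, but not identical to, one the project already trusts.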

I think the solution to this problem is to implement a black-listing system maintained by the same open-source communities; this system would include a way to submit suspicious projects, and a scoring mechanism to indicate the reputation of a specific component.