As I see it, the hash computed while imaging covers the "flux" of data read from the source and written to the target disk.
As such, when you verify the written image by comparing its hash to the one you obtained during imaging, you are ONLY saying that there were no "write errors" on your target, not necessarily that there were no "read errors" on the source.
If you prefer, the hashing is a way to know for sure that the image you examined, and of which you provided a copy to the other party in the trial, has not been tampered with, and that it represents an exact "snapshot" of what was read from the device on a given date.
With perfectly working disks and perfectly working equipment/software, and in theory, what you read is actually what is on the source.
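To make the point concrete, here is a minimal sketch (not any specific imaging tool) of hashing the data stream while it is copied; the device path, image name and chunk size are all made up for illustration:

```python
import hashlib

CHUNK = 1024 * 1024  # read/write in 1 MiB chunks (arbitrary choice)

def image_and_hash(source_path, target_path):
    """Copy source to target while hashing the data 'flux' in between."""
    h = hashlib.sha256()
    with open(source_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            h.update(chunk)   # the hash "sees" exactly what was read...
            dst.write(chunk)  # ...and that same data is written to the target
    return h.hexdigest()

# image_hash = image_and_hash("/dev/sdX", "evidence.dd")
```

Re-hashing the written image and getting the same value only tells you the target holds what came out of that stream; it cannot tell you whether the stream itself already suffered from "read errors" on the source.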
But now let us leave "pure forensics" aside for a moment and get to "data recovery".
Hard disks do develop "bad sectors" and do have "malfunctions".
Still, in theory any modern hard disk is intelligent enough to re-map a "weak sector" to a spare one "transparently", and normally the disk's internal OS works more or less like this (see the sketch after the list):
let me read (because the OS told me to do so) sector 123456
hmmm, the ECC sector checksum (or whatever) does not match at first attempt
let me try to correct the data read through my internal (and not documented) recovery algorithm
hmmm, nope, it still does not work
let me try to implement the parity check algorithm (another undocumented feature)
pheeew, now it matches, good
to be on the safe side, let me remap sector 123456 to spare sector 999001 (without telling the OS, nor the filesystem) and let me jot down this new translation in my G-list (or P-list, or *whatever*)
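Purely as an illustration (this has nothing to do with any real, undocumented firmware), here is a toy Python model of that read / retry / remap flow; every name and number in it (MEDIA, G_LIST, the success rates, the spare sector range) is invented:

```python
import random

MEDIA = {123456: b"sector 123456 contents"}   # pretend platter surface
G_LIST = {}                                   # remaps: logical LBA -> spare
SPARES = iter(range(999001, 1000000))         # hypothetical spare-sector pool

def read_sector(lba):
    physical = G_LIST.get(lba, lba)           # follow an existing remap, if any
    data = MEDIA[physical]
    # attempt 1: plain read + ECC check; attempt 2: internal recovery
    # algorithm; attempt 3: parity-check algorithm (success rates invented)
    for attempt, success_rate in enumerate((0.7, 0.8, 0.9)):
        if random.random() < success_rate:    # this attempt "worked"
            if attempt > 0:                    # the sector proved "weak":
                spare = next(SPARES)
                MEDIA[spare] = data            # copy the data to a spare sector
                G_LIST[lba] = spare            # and jot the remap down in the
                                               # G-list, without telling the
                                               # host OS or the filesystem
            return data
    raise IOError(f"unrecoverable sector {lba}")  # the disk gives up

print(read_sector(123456), G_LIST)
```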
It is perfectly possible (in theory and in practice) that at the exact moment you are reading a sector, it "becomes" bad.
What happens then?
The sector was "weak" but was *somehow* read correctly; it became "bad" a fraction of a nanosecond after having been read, and the disk managed the issue fine.
But what if a given sector passes from "good" to "bad" immediately after you have read it?
The disk, at the next occasion, finds it bad, attempts to recover it and fails (or succeeds but, for *whatever* reason, fails to copy it to the spare sector or fails to update the list).
When you then try to re-hash the source drive, you will get either read errors or a different hash.
On the other hand, I believe it is not "common practice" to write the image from the source to several targets at the same time.
So for a given period of time you have only a "source" and a "target". The same malfunction may happen to the "target" instead of the source (and you find out only because a new hashing of the target, or of a copy of it, comes out different), in which case I think what is done is to re-image from the original.
In other words, the hashing process is an important part of the procedures but it is not the "only" solution.
A better approach could be a more granular form of hashing.
The smallest "atomic" component would be a sector or "block".
So you could hash each sector by itself and create a list of hashes, one for each sector, or decide to group 10/100/1000/10000/100000 sectors into a "blocklist" and hash these blocklists.
This would bring IMHO two advantages:
you know for sure that ONLY a given "blocklist" is affected (and ALL the other ones are fine)
if more than one blocklist (or many, or all of them) fails to hash correctly, then something (be it OS instability, hardware issues or *whatever*) is causing the problem in a "generalized" way before the "whole" image is completed
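As a rough illustration of the idea (not an existing tool), the sketch below keeps one hash per blocklist of an assumed 10000 sectors, so that a later mismatch can be pinned down to specific blocklists; the sector size, blocklist size and file name are arbitrary:

```python
import hashlib

SECTOR = 512
SECTORS_PER_BLOCKLIST = 10000                  # could be 10/100/1000/10000/100000

def blocklist_hashes(image_path):
    """Return one SHA-256 hash per blocklist of the given image/device."""
    hashes = []
    chunk_size = SECTOR * SECTORS_PER_BLOCKLIST
    with open(image_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return hashes

def differing_blocklists(old, new):
    """Indexes of blocklists whose hashes no longer match."""
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]

# original = blocklist_hashes("evidence.dd")
# later    = blocklist_hashes("evidence.dd")
# print(differing_blocklists(original, later))
```

A single differing index points to one affected blocklist (all the others are fine); many or all indexes differing suggests the "generalized" kind of problem described above.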
jaclaz
P.S.: it seems that not only is this idea nothing new, but it has also been taken to the "next" level:
Distinct Sector Hashes for Target File Detection
Joel Young, Kristina Foster, and Simson Garfinkel, Naval Postgraduate School
Kevin Fairbanks, Johns Hopkins University
http://www.computer.org/csdl/mags/co/2012/12/mco2012120028.pdf