joachimm wrote:
I assume you mean photorec here instead of testdisk.
Yep, my bad, I meant Photorec and not Testdisk, of course :)
My personal opinion is, as said, that the right way to check is for the three bytes FFD8FF (while being more "flexible" about the fourth byte). Considering the additional checks that Photorec has (as you pointed out), i.e. block alignment and, I may add, the "footer" check, this should be enough to avoid most "false positives".
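To make the idea concrete, here is a minimal Python sketch of that kind of check. It is not Photorec's actual code: the names `find_candidates`, `JPEG_SOI`, `JPEG_EOI` and the 512-byte block size are my own assumptions, and the "footer" check is reduced to looking for the first FFD9 after the header.

```python
JPEG_SOI = b"\xff\xd8\xff"  # the three header bytes; the fourth byte is left unconstrained
JPEG_EOI = b"\xff\xd9"      # end-of-image marker, used here as a simplified "footer"


def find_candidates(image: bytes, block_size: int = 512):
    """Yield (start, end) offsets of tentative JPEGs that begin on a block boundary.

    block_size is an assumed value; a real tool would take it from the filesystem.
    """
    for start in range(0, len(image), block_size):
        # Header check only at the beginning of a block (block alignment).
        if not image.startswith(JPEG_SOI, start):
            continue
        # Naive footer check: first FFD9 after the header. An embedded
        # thumbnail may end earlier, so this is only a tentative end offset.
        end = image.find(JPEG_EOI, start + len(JPEG_SOI))
        if end != -1:
            yield start, end + len(JPEG_EOI)
```

A header sitting in the middle of a block (for example a thumbnail inside EXIF data) is skipped by this loop, simply because only block-aligned offsets are ever tested.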
We have to take into account, however, that different tools may have a different use, even when they rely on the same "function".
Photorec is essentially a Photo Recovery tool and not properly a "forensic" (or however "pure") carver, so it makes a lot of sense that it has a "beginning of block" check (which, independently of the three- or four-byte header patterns, will exclude a number of images embedded in other files, including most "preview images" or "thumbnails" inserted in the EXIF data).
TriD, being a "file identifier", has the "advantage" that it does not need such a check (since what you feed it is an actual file and not a "random address on a disk image") and, more than that, its output is a "probable" file type.
For the record, the "file" *nix utility seemingly has the much more "generic" two-byte pattern recognition of FFD8:
http://darwinsys.com/file/
https://github.com/file/file/blob/master/magic/Magdir/jpeg
About the semantics, to be picky, as I am ;) , the "proper" description should probably have been something *like*:
Quote::
As an example, for JPEG images, Photorec first checks if the four bytes at the beginning of a block are any of FFD8FFE0, FFD8FFE1, FFD8FFEC or FFD8FFEF, and IF any of these conditions is met, it tentatively identifies the block as the beginning of a JPEG image and then makes a number of further checks to make sure that the block belongs to a valid JPEG image, the size of the image, etc. in order to actually recover the file.
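Just to show how the patterns discussed in this thread differ in strictness, here is a small Python sketch. The four-byte values are the ones listed in the description above, the two-byte one is the FFD8 mentioned for "file"; the function name `match_level` and the returned labels are made up for illustration and do not come from any of the tools.

```python
# Four-byte signatures as listed in the description above.
STRICT_HEADERS = {
    bytes.fromhex(h) for h in ("FFD8FFE0", "FFD8FFE1", "FFD8FFEC", "FFD8FFEF")
}


def match_level(header: bytes) -> str:
    """Return the strictest of the discussed patterns that `header` satisfies."""
    if header[:4] in STRICT_HEADERS:
        return "four-byte"
    if header[:3] == bytes.fromhex("FFD8FF"):
        return "three-byte"
    if header[:2] == bytes.fromhex("FFD8"):
        return "two-byte"
    return "none"
```

A JPEG that starts with FFD8FFDB (i.e. straight into a quantization table, with no APPn segment) comes out as "three-byte" here: it passes the looser checks but would not be tentatively identified by the four-byte list.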
And we have to note how here:
http://www.cgsecurity.org/wiki/PhotoRec#How_PhotoRec_works
the text is:
Quote::
For example, PhotoRec identifies a JPEG file when a block begins with:
While here:
http://www.cgsecurity.org/wiki/Developers
it is:
Quote::
If the file format specifications aren't available, compare several samples to identify constant fields. In example, PhotoRec identifies a JPEG file when a block begins with:
Possibly the distinction/misunderstanding is between "identify" as in "tentatively identify" and "identify" as in "identify and recover without further checks".
But yes, we are both on the same side when it comes to "assumptions" and how frequent they are :)
jaclaz