Channel: Forensic Focus Forums - Recent Topics
Viewing all articles
Browse latest Browse all 20112

Forensic Software: Email deduplication

Deduplication of e-mail is a touchy subject. What are you going to deduplicate on? In my experience, deduplication across multiple mailboxes using to, from, subject, date & time, and sometimes a unique ID works, but it is still fraught with issues. For example, date & time: which one? What if the client software makes automagic timezone adjustments? And "to": is it the verified source, or the SMTP "to" field? What about aliases, or "sent in name of"? I have experimented with using a percentage of the content as part of the deduplication, but a simple version change or an automatic conversion from HTML to rich text to plain text would mess the whole thing up. The process requires normalizing all messages to a single format, deduplicating the normalized copies, and then marking the matching originals. All deduplication methods should be agreed at the meet & confer, and you had better be there, or you will end up with a mess on your hands, like an agreement to deduplicate a single mailbox . . .
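
To make the normalize-then-deduplicate idea concrete, here is a rough Python sketch, not a description of what any particular forensic tool does. It assumes each message has already been exported to a plain dict with hypothetical keys "id", "from", "to", "subject" and "date" holding raw header strings. It lowercases and sorts addresses, collapses whitespace in the subject, converts the date to UTC, and hashes the result. It does not resolve aliases or "sent in name of", it treats a date without a timezone as UTC (an assumption), and it ignores the body entirely, so whether these fields are the right ones is still a matter for the meet & confer.

```python
import hashlib
from collections import defaultdict
from datetime import timezone
from email.utils import getaddresses, parsedate_to_datetime


def normalize_addresses(header_value):
    # Keep only the address part, lowercased and sorted, so display names
    # and recipient order do not change the key. Aliases are NOT resolved.
    pairs = getaddresses([header_value])
    return sorted(addr.strip().lower() for _name, addr in pairs if addr)


def normalize_date(header_value):
    # Parse an RFC 5322 style date and express it in UTC, to the second.
    dt = parsedate_to_datetime(header_value)
    if dt.tzinfo is None:
        # Assumption for this sketch: a date with no usable offset is UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).replace(microsecond=0).isoformat()


def dedup_key(msg):
    # msg is a hypothetical dict with "from", "to", "subject", "date".
    parts = [
        ",".join(normalize_addresses(msg["from"])),
        ",".join(normalize_addresses(msg["to"])),
        " ".join(msg["subject"].split()).lower(),
        normalize_date(msg["date"]),
    ]
    # The unit separator keeps adjacent fields from running together.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def group_duplicates(messages):
    # Map each key to the ids of the ORIGINAL messages that share it,
    # keeping only keys that occur more than once.
    groups = defaultdict(list)
    for msg in messages:
        groups[dedup_key(msg)].append(msg["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Two copies of the same message that differ only in display name, recipient order, subject casing, or timezone offset end up with the same key, and the returned groups tell you which originals to mark. Anything the key does not cover (aliases, different date fields, body conversions) will still slip through or collide.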
