
Email exception reports may include information about encrypted emails which cannot be opened. Two common types of encrypted email are Secure/Multipurpose Internet Mail Extensions (S/MIME) emails and restricted permission messages (RPMSG).


S/MIME emails include a digital signature added by the sender to verify his or her identity. The recipient has a public key, which the sender uses to encrypt the message, and a private key, which is used to decrypt it. S/MIME thus serves a dual purpose: authenticating the author and protecting the contents of the message. Microsoft Exchange supports S/MIME.


RPMSG messages are Outlook encrypted emails that may be set to prevent a recipient from printing, forwarding or copying them. A message with restricted permissions will have the extension '.rpmsg', and may appear as an attachment to another email. It will not be possible to open the message unless you are using the account of the recipient, and it may be necessary to use Outlook's specific rights management system. It may still be possible to review these messages by having the original custodian collect his or her email data.
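As a rough illustration of how such messages might be flagged for an exceptions report, here is a minimal Python sketch using only the standard library `email` package. It does not decrypt anything; it only labels messages by MIME content type and attachment name. The function name `classify_protection` is made up for this example, and the content types listed are the ones commonly associated with S/MIME and RPMSG, so treat them as assumptions to check against your own data rather than a complete list.

```python
from email.message import EmailMessage, Message

# Content types commonly seen on S/MIME messages (assumed list, not exhaustive).
SMIME_TYPES = {
    "application/pkcs7-mime",
    "application/x-pkcs7-mime",
    "application/pkcs7-signature",
    "application/x-pkcs7-signature",
}
# Content type commonly seen on restricted permission attachments (assumption).
RPMSG_TYPE = "application/x-microsoft-rpmsg-message"


def classify_protection(msg: Message) -> set:
    """Return labels ('smime', 'rpmsg') for protected parts found in msg."""
    labels = set()
    for part in msg.walk():
        ctype = part.get_content_type()
        filename = (part.get_filename() or "").lower()
        if ctype in SMIME_TYPES:
            labels.add("smime")
        if ctype == RPMSG_TYPE or filename.endswith(".rpmsg"):
            labels.add("rpmsg")
    return labels


if __name__ == "__main__":
    # Build a stand-in message with a placeholder .rpmsg attachment.
    demo = EmailMessage()
    demo.set_content("This message is protected.")
    demo.add_attachment(
        b"placeholder bytes",
        maintype="application",
        subtype="x-microsoft-rpmsg-message",
        filename="message.rpmsg",
    )
    print(sorted(classify_protection(demo)))
```

A message that comes back with either label would still need to be opened with the recipient's account or key; the sketch only helps decide which items to route back to the custodian.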






More than once I've come across email attachments in document productions for which the file name was 'winmail.dat' and the attachment had not been processed. This stems from a well-known issue that Microsoft Support has addressed here. Some email clients cannot process emails sent from MS Outlook in rich text format. In that case the message is delivered in plain text, and the .dat file contains the rich text formatting, embedded images and file attachments. This method is known as TNEF, Transport Neutral Encapsulation Format.

This error creates a minor security exposure, since the sender's login user name and .pst folder paths can be found if the file is opened in a text editor.
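To make the point concrete, here is a small Python sketch, using only the standard library, that finds likely TNEF attachments in a parsed message and pulls out runs of printable ASCII, which is roughly what you would see in a text editor. The function names are invented for this example. The four-byte check relies on the TNEF signature 0x223E9F78 being stored little-endian at the start of the file. Strings stored as UTF-16 would not be caught by this simple scan, so it is an approximation, not a full TNEF parser.

```python
import re
from email.message import EmailMessage, Message

# TNEF signature 0x223E9F78 as it appears on disk (little-endian).
TNEF_SIGNATURE = bytes.fromhex("789f3e22")


def looks_like_tnef(data: bytes) -> bool:
    """True if the payload starts with the TNEF signature bytes."""
    return data[:4] == TNEF_SIGNATURE


def find_tnef_parts(msg: Message) -> list:
    """Return message parts that look like unprocessed TNEF attachments."""
    hits = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True) or b""
        name = (part.get_filename() or "").lower()
        if (
            part.get_content_type() == "application/ms-tnef"
            or name == "winmail.dat"
            or looks_like_tnef(payload)
        ):
            hits.append(part)
    return hits


def printable_strings(data: bytes, min_len: int = 6) -> list:
    """Extract runs of printable ASCII at least min_len characters long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


if __name__ == "__main__":
    # Fabricated payload: signature plus an invented .pst path, for illustration only.
    fake = TNEF_SIGNATURE + b"\x00\x01" + b"C:\\Users\\jdoe\\mail\\archive.pst" + b"\x00"
    demo = EmailMessage()
    demo.set_content("Plain text body")
    demo.add_attachment(fake, maintype="application", subtype="ms-tnef", filename="winmail.dat")
    for part in find_tnef_parts(demo):
        print(printable_strings(part.get_payload(decode=True)))
```

Any file paths or user names that turn up in the extracted strings are candidates for the kind of leak described above.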

See this example from the Enron Email data set.

It's hard to say if the EDRM's own processing stripped out some of the original information, but we can see a file path and what may be a login ID.



Andre Ross gives a very good description of the shingling process in this blog post: http://digfor.blogspot.com/2013/03/fruity-shingles.html . As discussed in the tip of the night for January 16, 2015, document shingling involves comparing n-grams of overlapping word sequences in two different text files. Ross notes that shingling involves the calculation of Jaccard Similarity, "the number of items in the intersection of A and B divided by the number of items in the union of A and B" or

$$\mathrm{Sim}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

. . . so we get a figure based on the number of n-grams the two have in common divided by the total number of unique n-grams used in both.

Here's an example.

1. In Fig. 1 we see three text files, which are edited over a period of several weeks. The August version is almost the same as the July version, but one phrase has been moved around. In the September version, while the original first sentence is still present in parts, an entirely new phrase has been added and more changes have been made.

2. In Fig. 2, we run the n-gram generator discussed in the tip of the night for January 16, 2015, and copy out the three-word overlapping n-grams for each of the three text files to an Excel spreadsheet.

3. In Excel the n-grams from each text file are pasted into columns A, C, and E, and then we run VLOOKUP formulas in column B to check which of the n-grams from the July version in column A are the same as those in the August version in column C [18], and which of the n-grams from the August version in column C match those in column E for the September version [8].

4. On a second worksheet, we combine the n-grams from the July and August versions into one de-duped set, and those from the August and September versions into another, to get totals of 36 and 49 unique n-grams respectively.

5. So while the July and August versions have a Jaccard similarity of 0.5 (18 / 36), the August and September versions have a Jaccard similarity of only 0.16 (8 / 49).
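The same steps can be sketched in a few lines of Python instead of Excel. This is only an illustration under simple assumptions: words are split on whitespace and lower-cased, punctuation is left alone, and the two sample sentences are made up rather than taken from the files in Fig. 1. The function names `shingles` and `jaccard` are also just for this example.

```python
def shingles(text: str, n: int = 3) -> set:
    """Return the set of overlapping n-word shingles (n-grams) in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Invented sample texts; the real comparison would read the text files.
    first = "the quick brown fox jumps"
    second = "the quick brown fox sleeps"
    a, b = shingles(first), shingles(second)
    print(len(a & b), len(a | b), jaccard(a, b))
```

Because the shingles are held in sets, duplicates are removed automatically, which corresponds to the de-duped combined lists in step 4. The set intersection plays the role of the VLOOKUP matches in step 3.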

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea.
