
This past October, Craig Ball, an electronic discovery special master and a professor at the University of Texas at Austin, published a succinct guide to electronic discovery from mobile devices, Mobile to the Mainstream. Here's an even more succinct summary of the guide.

The average smartphone user spends four hours on the device each day, and two-thirds of all emails are sent from phones. Ball criticizes lawyers for treating mobile phones as if they were only used to make calls. Smartphones do not necessarily need a full forensic examination; only active data needs to be collected, not latent artifacts. An inexpensive tool can be used to collect documents, spreadsheets, and presentations from mobile phones, and photos and videos can be collected with ease. Note that most mobile devices now store photos in the High Efficiency Image File Format. These files will have an .heic extension, which many e-discovery tools can't support.
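As an aside, if a review tool in your workflow can't open .heic files, a short script can convert them to JPEG first. This is only a rough sketch using the third-party pillow-heif and Pillow packages, not a method from Ball's guide, and the folder names are placeholders; keep the originals, since conversion changes hash values.

```python
# Convert .heic photos to .jpg so they can be loaded by review tools
# that lack HEIF support. Requires: pip install pillow pillow-heif
from pathlib import Path

from PIL import Image
from pillow_heif import register_heif_opener

register_heif_opener()  # lets Pillow open .heic files directly


def convert_heic_folder(src_dir: str, dst_dir: str) -> None:
    """Convert every .heic file in src_dir to a .jpg in dst_dir."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for heic in Path(src_dir).glob("*.heic"):
        img = Image.open(heic)
        # JPEG has no alpha channel, so flatten to RGB before saving
        img.convert("RGB").save(out / (heic.stem + ".jpg"), "JPEG", quality=95)


if __name__ == "__main__":
    convert_heic_folder("collected_photos", "converted_photos")
```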

Text messages are now used for digital communications more frequently than email. Unless the default setting is changed, an iPhone will keep its text messages indefinitely. Exporting messages and their attachments is not burdensome; the export encoding should be set to Unicode UTF-8 so that emojis are captured. An iPhone's call history and voicemail metadata can also be exported easily. Mobile calendar data is easier to export and redact than the same type of data found in .pst archives.
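To see why the UTF-8 point matters, here is a minimal sketch, not taken from Ball's guide, that reads messages out of a copy of an iPhone's sms.db SQLite database (assumed to have already been extracted from a backup; the schema varies somewhat across iOS versions) and writes them to a UTF-8 encoded CSV so emojis survive the export.

```python
# Export text messages from a copy of an iPhone sms.db SQLite database
# to a UTF-8 CSV so emojis and other non-ASCII characters are preserved.
import csv
import sqlite3


def export_messages(db_path: str, csv_path: str) -> None:
    conn = sqlite3.connect(db_path)
    # The message table stores the body in the `text` column; `date` is an
    # Apple epoch timestamp and is left raw here for simplicity.
    rows = conn.execute(
        "SELECT rowid, date, is_from_me, text FROM message ORDER BY date"
    )
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["rowid", "date", "is_from_me", "text"])
        writer.writerows(rows)
    conn.close()


if __name__ == "__main__":
    export_messages("sms.db", "messages_utf8.csv")
```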

Data from apps can be stored as JSON, plist, and SQLite files. Geolocation data may be difficult to collect and review even though a user can access it easily. Federal law requires phones to broadcast their location in order to facilitate responses to emergency calls. Apple protects geolocation data and won't allow a bulk export of it; this data is also not backed up or stored in iCloud.
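For reference, Python's standard library can read all three of those app-data formats; the file names below are placeholders for illustration only, since what you actually collect depends on the app that produced the data.

```python
# Read app data stored in the three formats Ball mentions: JSON, plist,
# and SQLite. File names are placeholders for illustration only.
import json
import plistlib
import sqlite3

# JSON: plain text, parsed into Python dicts and lists
with open("app_settings.json", "r", encoding="utf-8") as f:
    settings = json.load(f)

# Property list: Apple's plist format, binary or XML
with open("com.example.app.plist", "rb") as f:
    prefs = plistlib.load(f)

# SQLite: list the table names first, since every app uses its own schema
conn = sqlite3.connect("app_cache.sqlite")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]

print(settings, prefs, tables)
```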

Ball has not been able to find a forensic tool that can collect data from multiple smartphone images simultaneously. He has used a $50 tool called iMazing effectively in mobile e-discovery. His mobile discovery scorecard tracks how difficult each type of evidence is to collect and review, and its potential relevance.



In 2010 the American Society for Information Science and Technology published a study of the efficacy of machine categorization. See Herbert L. Roitblat, Anne Kershaw, and Patrick Oot, Document Categorization in Legal Electronic Discovery: Computer Classification vs. Manual Review, 61(1) J. Assoc. Inf. Sci. Technol. 70–80 (2010). The study compared the results of a document review performed by 225 attorneys in response to a Second Request from the Department of Justice under the Hart-Scott-Rodino Act (concerning an investigation of Verizon's acquisition of MCI) with the results of automated categorization performed by two e-discovery vendors.

The 225 attorneys performed the initial review in two teams, one looking for privileged documents and the other looking for relevant documents. Two re-review groups (Teams A and B) each reviewed the same 5,000 documents in order to create seed sets for two different e-discovery vendors, one based in California and the other in Texas. Each vendor operated without knowledge of the other's results. The data consisted of 2.3 million documents totaling 1.3 TB collected from 83 custodians; only 1.6 million documents remained after duplicates were eliminated. The original team of attorneys took four months to identify 176,000 documents for production, at a cost of $13.5 million.

A table in the study compares the level of agreement between the various reviews. [Apparently there is a typo and the last row should refer to 'Original v. System D'.] Patrick Oot, a Verizon attorney and one of the authors of the study, 'adjudicated' the decisions made by the seed-set reviewers, deciding which one was right when they disagreed. The confirmation by this subject matter expert lends credence to the finding that, in the set of 5,000 documents, the original team missed 739 relevant documents.

The results show that the computer-assisted review performed by the vendors (System C and System D) marked large numbers of documents as responsive that the original team of 225 attorneys had found to be non-responsive, roughly 200,000 for each system. The systems also found that about half of the documents produced after the original review were non-responsive.

The teams were more likely to agree on which documents were non-responsive than on which were responsive.

The levels of precision and recall for the systems used by the vendors were relatively low, but the study concluded that the automated review systems yielded results at least as good as those of the human review team.
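For readers who want to check figures like these themselves, precision and recall reduce to two simple ratios. The counts in the sketch below are made-up placeholders, not numbers from the study.

```python
# Precision and recall for a document review, computed from made-up counts.
# Precision: of the documents a system marked responsive, how many really were.
# Recall: of all truly responsive documents, how many the system found.

def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall


# Placeholder example: 8,000 correctly flagged, 12,000 wrongly flagged,
# and 2,000 responsive documents missed.
p, r = precision_recall(8_000, 12_000, 2_000)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.40 recall=0.80
```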


Dec 5, 2018

Dup Scout is a handy tool for electronic discovery professionals. It allows you to search for duplicate files on local and network drives. A trial version can be downloaded from the developer's site.

Click on 'Duplicates' on the upper toolbar. You'll be given the chance to change the default options and choose any one of five hash algorithms for identifying duplicates.

Select a specific directory, and Dup Scout will search it for duplicate files, provide a summary of the space taken up by each subdirectory, and generate a list of how many files are present in each format.

Click on 'Charts' on the toolbar, and Dup Scout will display a pie chart showing the number of duplicates for each file extension.
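If you'd rather script the same kind of check, the hash-based comparison Dup Scout performs can be approximated in a few lines. This is only a rough sketch; the choice of SHA-256 is mine rather than one of Dup Scout's options.

```python
# Find duplicate files under a directory by grouping them on a SHA-256
# hash of their contents, roughly what a hash-based dedupe tool does.
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root: str) -> dict[str, list[Path]]:
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only hashes shared by more than one file
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    for digest, paths in find_duplicates(".").items():
        print(digest[:12], [str(p) for p in paths])
```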

Happy E-Discovery Day!


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


