
Relativity Repeated Content Identification - From the Bottom Up


Relativity Structured Analytics can identify repeated content (such as email disclaimers) in order to improve searchable document indexes. You want to focus on the authored content of documents and avoid having to sort through search results with hits in boilerplate language. However, if you want an admin to find this repeated content automatically, there is a significant limitation.

While repeated content can be found by manually entering language that you already know to search for, in Relativity 9.6, under Indexing & Analytics . . . Structured Analytics Set, you can use the repeated content operation to automatically find repeated content in the documents in a saved search. Each segment of repeated content must fall within a word count range designated by the admin, and must appear at least a set number of times, also designated by the admin. Each segment is also limited to a certain number of lines - often no more than 4.
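To make these thresholds concrete, here is a minimal Python sketch of how criteria like these could filter candidate segments. It is not Relativity's algorithm, which is not described here. The function name `filter_repeated_segments` and the parameters `min_occurrences`, `min_words`, and `max_words` are hypothetical names chosen for illustration.

```python
from collections import Counter


def filter_repeated_segments(segments, min_occurrences, min_words, max_words):
    """Keep segments that recur often enough and fit the word count range.

    Illustrative only: the names here are hypothetical and this is not
    how Relativity itself is implemented.
    """
    # Count how many times each (whitespace-trimmed) segment appears.
    counts = Counter(segment.strip() for segment in segments)
    kept = []
    for text, count in counts.items():
        word_count = len(text.split())
        if count >= min_occurrences and min_words <= word_count <= max_words:
            kept.append(text)
    return kept


disclaimer = "This email is confidential and intended only for the named recipient."
candidates = [disclaimer] * 3 + ["Thanks"] * 3 + ["Please see the attached draft agreement."]

# "Thanks" repeats but is too short; the last sentence appears only once.
print(filter_repeated_segments(candidates, min_occurrences=3, min_words=6, max_words=100))
```

In this sketch only the disclaimer survives, because it is the one segment that meets both the occurrence threshold and the word count range.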

A key thing to keep in mind about repeated content identification is that Relativity Structured Analytics will only search for repeated content from the bottom of documents. Disclaimers listed in headers, or boilerplate language repeated throughout a document, will not be identified in the automated search. Accordingly, the admin must set a number of tail lines to search. The tail lines are the non-blank lines counted from the bottom of a document.
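The tail-line idea can be illustrated with a short Python sketch. This is only a reading of the definition above (the last N non-blank lines of a document), not Relativity's actual implementation. The helper name `tail_lines` and the sample email are made up for the example.

```python
def tail_lines(text, number_of_tail_lines):
    """Return the last `number_of_tail_lines` non-blank lines of `text`.

    Hypothetical helper for illustration; not part of Relativity.
    """
    if number_of_tail_lines <= 0:
        return []
    # Blank lines are skipped, so they do not count toward the limit.
    non_blank = [line for line in text.splitlines() if line.strip()]
    return non_blank[-number_of_tail_lines:]


document = """CONFIDENTIAL: header disclaimer

Hi Pat,

The draft is attached.

Regards,
Sam

This email may contain privileged information."""

window = tail_lines(document, 3)
print(window)
# The header disclaimer sits outside the window, so a bottom-up
# analysis limited to these lines would never see it.
print("CONFIDENTIAL: header disclaimer" in window)
```

With three tail lines, the window holds only the sign-off and the footer disclaimer. The header disclaimer would enter the window only if the setting were raised enough to cover every non-blank line in the message.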

Relativity recommends setting 'Number of tail lines to analyze' to 16. The maximum is 200, but increasing the setting much above 16 will cause the operation to run for a long time.


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.


The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea.
