How to Use Clustering to Expedite Document Review


Here's a follow-up to the Tip of the Night for August 9, 2019, which discussed using clustering to help expedite document review. It employs an approach similar to that discussed in the Tip of the Night for August 7, 2019 on Double-checking Responsive Coding in Relativity with Cluster Visualization. Follow these steps:

1. On the Documents tab, select the top-level folder in the browser to use all of the documents in the workspace, or choose a subfolder containing the set of documents you need to focus on.

2. In the mass operations menu at the bottom left, select 'Cluster'.

3. Select all of the documents in the review folder.

4. Click on Cluster, and then submit the documents for clustering.

5. Name the cluster and then select an analytics index which contains all of the documents being clustered.

6. Advanced clustering options may be set for maximum hierarchy depth (the number of cluster levels; a setting of 1 would produce only top-level clusters); minimum coherence (the amount of conceptual correlation required within each cluster, the default being 0.7); and generality (how specific the clusters will be at each level).

7. Note that after a cluster operation is run a new field will be created listing the names of the clusters each document has been assigned to. A cluster score is also included. These fields can easily be used as search conditions.

8. Run a new search with a condition on the cluster name field, using the operator "is set".

9. Save the search with the Edit, File Icon, and Control Number fields selected.

10. Under Case Administration, on the Batch Sets tab, click 'New Batch Set'. Designate the maximum batch size (the number of documents to be included in each batch), and choose a prefix for the batch number.

11. Choose the saved search as the Batch Data Source.

12. Save the batch, and then run 'Create Batches' on the console.
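To make the logic of steps 8 through 12 concrete, here is a minimal Python sketch of what the "is set" filter and the batch set accomplish. It works on a made-up list of documents and is not Relativity's API or its actual batching implementation; the `Doc` class, the `make_batches` function, the sample cluster names, and the choice to sort by cluster before splitting are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    control_number: str
    cluster: Optional[str]  # None models a document with no cluster assigned


def make_batches(docs, max_batch_size, prefix):
    """Keep documents whose cluster field is set, order them by cluster,
    and split them into sequentially named batches."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    # Analogue of the "is set" search condition: drop unclustered documents.
    clustered = [d for d in docs if d.cluster]
    # Assumption of this sketch: sorting by cluster keeps similar documents together.
    clustered.sort(key=lambda d: (d.cluster, d.control_number))
    batches = {}
    for start in range(0, len(clustered), max_batch_size):
        name = f"{prefix}{start // max_batch_size + 1:03d}"
        chunk = clustered[start:start + max_batch_size]
        batches[name] = [d.control_number for d in chunk]
    return batches


# Hypothetical sample data; the cluster names are invented.
docs = [
    Doc("DOC0001", "contract, lease"),
    Doc("DOC0002", None),
    Doc("DOC0003", "invoice, payment"),
    Doc("DOC0004", "contract, lease"),
    Doc("DOC0005", "invoice, payment"),
]
batches = make_batches(docs, max_batch_size=2, prefix="CLUSTER_")
for name, members in batches.items():
    print(name, members)
```

In this toy run, DOC0002 is left out because its cluster field is not set, and the remaining four documents fall into two batches of two, each holding documents from a single cluster.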

The resulting batches can be assigned to the reviewers. Relativity clusters documents with conceptual similarities, so it may be possible to bulk code the batches.
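One rough way to judge whether a batch is a candidate for bulk coding is to check how much of it comes from a single cluster. The Python sketch below assumes the cluster name for each document in a batch has been exported to a simple list; the function names, the sample data, and the 0.9 threshold are hypothetical choices for illustration, not Relativity settings.

```python
from collections import Counter


def dominant_cluster(cluster_names):
    """Return (cluster, share) for the most common cluster in a batch."""
    if not cluster_names:
        return None, 0.0
    counts = Counter(cluster_names)
    name, count = counts.most_common(1)[0]
    return name, count / len(cluster_names)


def bulk_code_candidates(batch_clusters, threshold=0.9):
    """List batches in which one cluster covers at least `threshold` of the documents."""
    picks = []
    for batch_name, names in batch_clusters.items():
        cluster, share = dominant_cluster(names)
        if cluster is not None and share >= threshold:
            picks.append(batch_name)
    return picks


# Hypothetical export: batch name -> cluster name of each document in the batch.
batch_clusters = {
    "CLUSTER_001": ["contract, lease"] * 10,
    "CLUSTER_002": ["invoice, payment"] * 6 + ["contract, lease"] * 4,
}
candidates = bulk_code_candidates(batch_clusters)
print(candidates)
```

A batch flagged this way still deserves a spot check by a reviewer before it is bulk coded, since conceptual similarity does not guarantee identical responsiveness.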


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

© 2015 by Sean O'Shea