Getting Bates Numbers Right: Hyphens and Hyphen-minuses


Recently I received a PDF of a list of Bates numbers that I had to look up in a Relativity workspace. I converted the PDF to an Excel spreadsheet using Adobe Acrobat and copied the column containing the Bates numbers into the document identifier field in Relativity. Unexpectedly, the Bates numbers that contained hyphens did not come up in the search results. What went wrong?

One of the Bates numbers from the exhibit list that I couldn't find was composed of these characters [note: these are not real Bates numbers, but the hyphens are]:

XYZ‐000818

However, the Relativity workspace did contain a Bates number as a document identifier that was composed of these characters:

XYZ-000818

What's the difference? It's hard to see, but the hyphen or dash between the letter prefix and the six-digit number is different. See this analysis in Excel:

The IF formula shows that the two Bates numbers are not actually the same, and the UNICODE formula reveals that the two hyphens are distinct characters.

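Unicode 45 (U+002D) is the hyphen-minus character used in cell B1. Unicode 8208 (U+2010) is a regular hyphen, used in cell A1. Because they are separate code points, document review platforms and text editors can treat them as different characters, so an exact search for one will not match the other.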
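The same comparison can be sketched outside Excel. The Python snippet below is only an illustration: the helper name `describe_chars` is hypothetical, and the two strings are the sample (not real) Bates numbers shown above, written with explicit escapes so the two characters cannot be confused.

```python
import unicodedata


def describe_chars(text):
    """Return (character, code point, Unicode name) for each
    character in text that is not a letter or a digit."""
    return [
        (ch, ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if not ch.isalnum()
    ]


# Sample values from the post (not real Bates numbers).
from_pdf = "XYZ\u2010000818"    # hyphen, U+2010 (decimal 8208)
in_workspace = "XYZ-000818"     # hyphen-minus, U+002D (decimal 45)

# The strings look alike on screen but do not compare equal.
print(from_pdf == in_workspace)

# Show which separator each string actually contains.
print(describe_chars(from_pdf))
print(describe_chars(in_workspace))
```

The equality check prints `False`, and the two lists report code points 8208 (HYPHEN) and 45 (HYPHEN-MINUS), matching what the UNICODE formula shows in Excel.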

Be sure to also account for en dashes – (Unicode 8211, entered with Alt code 0150) and em dashes — (Unicode 8212, entered with Alt code 0151).
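One way to guard against all of these variants is to normalize a list of Bates numbers before pasting it into a search. The sketch below assumes the workspace stores identifiers with the plain hyphen-minus, as in the example above; confirm that for your own workspace first. The function name `normalize_bates` and the `DASH_VARIANTS` table are hypothetical, and the table includes a few dash-like characters beyond those mentioned in this post.

```python
# Map dash-like code points to the hyphen-minus (U+002D).
# Assumption: the target system stores Bates numbers with U+002D.
DASH_VARIANTS = {
    0x2010: "-",  # hyphen (decimal 8208)
    0x2011: "-",  # non-breaking hyphen
    0x2012: "-",  # figure dash
    0x2013: "-",  # en dash (decimal 8211)
    0x2014: "-",  # em dash (decimal 8212)
    0x2212: "-",  # minus sign
}


def normalize_bates(value):
    """Trim surrounding whitespace and replace dash variants
    with a hyphen-minus."""
    return value.strip().translate(DASH_VARIANTS)


# Sample values (not real Bates numbers), one per dash variant.
raw_values = ["XYZ\u2010000818", "XYZ\u2013000819", " XYZ\u2014000820 "]
cleaned = [normalize_bates(v) for v in raw_values]
print(cleaned)
```

After normalization, every value in `cleaned` uses the same separator, so an exact-match search behaves consistently. If a workspace stores identifiers with a different dash character, the mapping would need to point at that character instead.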


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

© 2015 by Sean O'Shea.
