
Fred Grevin, the Vice President of Records Management for the New York City Economic Development Corporation, has made a study of whether it's cheaper to scan documents or to store them in hard copy. Fred is a principal of the Technical Committee for the National Fire Protection Association (for its Standard for the Protection of Records), and he was also an instructor at Columbia University in a course on Record Management. He's something of an iconoclast against the conventional wisdom that it's better in the long term to scan documents, and he has a spreadsheet to prove it.

I've posted his spreadsheet below, which is now available online for the first time. You can see a PDF version of Fred's study here.

The variables on the spreadsheet are entered in the yellow cells on the worksheets named 'Paper Storage' and 'Scanning', and formulas on other worksheets show the relative costs of each approach over a 35-year period. As you can see, you can enter different costs for the price of a box; taking an inventory of a box; shipping it out and logging it; and storing it for a year. Fred gives optimistic, realistic and pessimistic cost alternatives. The scanning worksheet lets you enter different prices for doc prep; scanning per page; QCing; and indexing. With the default values entered in Fred's spreadsheet, you can see that even with a fairly small number of boxes in the most optimistically cheap scanning scenario, scanning only costs less after 20 years. Fred's comments on the first worksheet acknowledge that he is not considering the advantage of being able to get scanned images on demand.

It's important to note that baked into Fred's study is the premise that each box (presumably we're considering the standard 15 x 12 x 10 banker's box) will hold 1200 pages. This may be about right if your box has several files or Redwelds, but my experience indicates that a box can hold as many as 3500 - 4000 letter-size pages. You may want to vary the formula in cell B11 of the 'Paper Storage' worksheet, which specifies the 1200 pages per box count.
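To get a feel for how much the pages-per-box assumption matters, here is a rough Python sketch of the comparison. It is not Fred's spreadsheet: the formulas are a simplified guess at the structure described above (one-time box costs plus yearly storage on the paper side, per-page costs on the scanning side), and every dollar figure is a made-up placeholder rather than one of his default values.

```python
from dataclasses import dataclass


@dataclass
class PaperCosts:
    box_price: float         # one-time price of an empty box
    inventory: float         # one-time cost of taking an inventory of a box
    ship_and_log: float      # one-time cost of shipping a box out and logging it
    storage_per_year: float  # recurring storage cost per box per year


@dataclass
class ScanCosts:
    prep_per_page: float
    scan_per_page: float
    qc_per_page: float
    index_per_page: float


def paper_total(boxes, costs, years):
    """Cumulative cost of keeping `boxes` boxes in storage for `years` years."""
    one_time = costs.box_price + costs.inventory + costs.ship_and_log
    return boxes * (one_time + costs.storage_per_year * years)


def scan_total(boxes, pages_per_box, costs):
    """One-time cost of prepping, scanning, QCing and indexing every page."""
    per_page = (costs.prep_per_page + costs.scan_per_page
                + costs.qc_per_page + costs.index_per_page)
    return boxes * pages_per_box * per_page


def break_even_year(boxes, pages_per_box, paper, scan, horizon=35):
    """First year in which paper storage has cost at least as much as scanning.

    Returns None if scanning never pays for itself within the horizon.
    """
    target = scan_total(boxes, pages_per_box, scan)
    for year in range(1, horizon + 1):
        if paper_total(boxes, paper, year) >= target:
            return year
    return None


# Placeholder prices only (NOT the values in Fred's spreadsheet).
paper = PaperCosts(box_price=2.00, inventory=3.00, ship_and_log=5.00,
                   storage_per_year=4.00)
scan = ScanCosts(prep_per_page=0.02, scan_per_page=0.03,
                 qc_per_page=0.01, index_per_page=0.02)

for pages in (1200, 3500):
    print(pages, break_even_year(100, pages, paper, scan))
```

With these placeholder prices, scanning breaks even in year 22 at 1200 pages per box, and never within the 35 years at 3500 pages per box. The fuller the boxes, the worse scanning looks, because the storage cost is per box while the scanning cost is per page.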



Law firms and companies often ship boxes to off-site storage facilities. It's standard practice to track these boxes with bar codes. If you need to label and index a lot of these boxes, you may be stuck with affixing the labels to the boxes, but you won't have to actually key the bar code numbers into a digital index or write them down on a hard copy.

You can find Beep on the App Store; it does a great job of scanning bar codes. I have tested it on bar-coded boxes and household products, and it picks up bar codes with no trouble at all using your iPhone camera.

The best thing about this app is that it will collect bar code numbers one right after the other as you scan different labels, and prepare a list.

The app has the option to email an Excel .csv file. You can easily input the bar codes of dozens and dozens of boxes in an index in a fraction of the time it would take you to key them.
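Once you have the emailed .csv file, turning it into a box index can be scripted. The Python sketch below assumes the export is a plain list with one bar code per row in the first column; I haven't documented Beep's actual export layout here, so treat that as an assumption. The index column names and the sample codes are hypothetical.

```python
import csv
import io


def read_barcodes(csv_text, column=0):
    """Return unique, non-empty bar codes from CSV text, in scan order."""
    seen = set()
    codes = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= column:
            continue  # blank line or short row
        code = row[column].strip()
        if code and code not in seen:  # skip labels scanned twice
            seen.add(code)
            codes.append(code)
    return codes


def build_index(codes):
    """Return index CSV text: one row per box, with blank fields to fill in."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["barcode", "description", "location"])  # hypothetical columns
    for code in codes:
        writer.writerow([code, "", ""])
    return out.getvalue()


# Made-up sample export: one code per line, one label scanned twice.
sample = "0012345\n0012346\n0012345\n\n0012347\n"
index_csv = build_index(read_barcodes(sample))
print(index_csv)
```

Reading the codes as text rather than numbers keeps any leading zeros intact, which Excel can strip if you open the .csv file directly.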



Here's a continuation of my postings about the Electronic Discovery Institute's online course, which you can subscribe to for just $1. I last blogged about this course on November 16, 2016. Class 3 is oddly omitted from the site. Go to https://www.lawinstitute.org/ to sign up for the course.

This is an outline of Class 4, an "Introduction to Information Technology: Collaboration & Workspaces".

The presentation included statements by Patrick Oot, a partner at Shook, Hardy & Bacon LLP; Nisha DeSilva, the Chief Technology Officer for Microsoft Corporate External & Legal Affairs (CELA); Paul Meyer, a managing counsel for Willis Towers Watson; and Patrick Kessler, the Executive Director of Information Governance at UBS Warburg.

Oot discussed the evolution of file shares to collaborative workspaces or e-Rooms such as SharePoint. Companies are using services such as Yammer to set up internal social networks.

DeSilva discussed the use of mega data centers in the cloud, and how these have been used by companies to change the way people collaborate with each other. Meyer talked about how Word documents can be kept on community servers - where a work force uses an organized document management process.

DeSilva referred to how data can accumulate in an unstructured fashion, as in an email inbox. Enterprises have sought to use document management systems to store information previously kept on file shares. A legal memo will be stored with metadata so other users have more 'smart information' about it. Such collaboration technologies picked up in the mid 1990s and have now become more widespread.

Oot advised companies facing a hold to look into collaborative workspaces. These make it easier to collect data. Content is associated with a person. Records can be scheduled for disposition with such document management systems.

Meyer said that tools are becoming more robust. SharePoint can be used to save data as people work on the same document. Its instant messaging feature can be used to exchange comments. Companies must have strategies to capture SharePoint drafts when a hold is implemented, or they can be overwritten.

DeSilva said e-discovery has changed in the interconnected world. Files are no longer collected individually. This has changed with cloud technologies. Office 365 has an in-place e-discovery solution.

Oot talked about how companies are using cloud-based platforms such as Google Docs. This involves accessing information on remote servers. Applications are migrating to the cloud, and documents are saved there as well. Overhead on the individual device has become less and less. Cloud-based software is more cost effective.

Kessler said that companies initially came up with network drives to store files; used file shares to exchange data; and had databases that can be accessed by multiple users. These systems result in many different data silos - and the data can become separated across different areas of an organization; different geographical regions; or as systems are replaced. SharePoint was designed to allow people to share email more easily - they can use the same inbox and share calendars. Portals can exist for a team to see common content and allow access to the same files in a straightforward way.

DeSilva asserted that SharePoint had challenged other document management systems. It has become the world's top collaboration platform. SharePoint 2010 was the pivotal point. Technology allowed people to work with co-workers and clients very easily. You can build custom apps within SharePoint. SharePoint was implemented with a lot of security to store important knowledge.

Kessler discussed how very important it is to access information from any device. Companies refer to this as the 'Mobile First' policy. Cloud First means that you don't have to hunt for data, don't have to do multiple searches.

Kessler also noted how Google Docs is similar to Office 365. With these systems data is not necessarily in the company's control. Corporations use platforms like Jive to host data internally. Jive has a Facebook-like feel; it's an interactive intranet.

Kessler observed that with cloud systems, data location is not transparent. Users can have the data wherever they go, but it can be distributed and split apart even if it appears to be together. There may be issues retrieving it from other countries.

Meyer said that if companies are not using the cloud they have to create the cloud internally. This involves creating several server farms. If a company can use Amazon's email service, they don't need to do the maintenance. Companies with high data security obligations can't necessarily make use of the cloud. Amazon uses hundreds of server farms around the world. It's not possible to know where the data is kept. In the USA, if data is regulated under HIPAA you may get questions about where data is kept. It's beneficial to know if it is in the jurisdiction of a particular state.

Kessler noted that the cloud provides economies of scale. Cloud vendors will have support staff and their own security models.

Meyer remarked on how it is possible to manage e-discovery through the cloud and this has cost advantages. In litigation, it is important to keep a pristine set of data. One has to share documents so they can be reviewed by outside counsel, so you can post data to secure sites - but then you have to maintain the environment. Restriction controls are a useful feature of most cloud systems. For example, it's possible to prevent highly confidential documents from being downloaded.

Oot said that people are increasingly concerned about moving documents to the cloud. The business model of cloud providers is based on data security. Encryption at file level is often offered.

Kessler felt that using the cloud comes with a cost - giving up control. How is the data maintained? Is it ever really deleted? Are backups retained? There's not a lot of transparency as to what is happening with data.

Meyer noted that if any system is implemented it has to make sense for the business. With the implementation of records management policies, everyone in the company needs to have a seat at the table. Records retention is not defensible if it exists only to serve lawyers. The most important question to ask is when data is not useful anymore. In the US, because of tax laws, companies may have to keep certain documents for 7 years. In Europe tax documentation may be kept for 10 years. Systems that are too complicated to learn are not acceptable.

Kessler wanted companies to make rules to make sure data is appropriately managed. He also noted that in the field of data protection, serving the client's right to be forgotten may be possible in-house but not with a third party provider.

Meyer remarked about the need for tools that will make it important for people to do the right thing. Willis Towers Watson is the largest actuarial service in the world. When it executes a litigation hold, automatic questionnaires are fed to people. It is always testing to make sure they know where data is; if there are deviances, they are brought up to an information committee.

Oot brought up the issue of data remediation policies. If there is a big network share with several terabytes of data, there is very little ownership of data. New employees don't often take ownership of former data. This is a potential litigation liability. Information governance should try to put retention periods around files. Don't have everyone saving everything.

Oot noted that unified messaging has been adopted by some companies. Faxes, voicemail, email are all in one system.

Oot also covered litigation readiness. It's not just about getting information out of a system for litigation, it's about how to target regulatory or business needs. While early on in the age of social media, screenshots of Facebook profiles were used in cases, now there is a tool to download Facebook pages to .xml files.

Oot urged companies to think of business needs when deploying technology tools. Ask how long you need to retain a document to maintain your business. Are there federal agency regulatory requirements? When litigation comes up, how is the information retained off schedules? There are two separate tracks.

Meyer gave Lync as an example of a platform for unified messaging. It's more than IMing. It can be used for live video communications and internal presentations, and then be used to save this information.

Meyer concluded by noting how important it is not to over-preserve. Global data is expanding at such a rate that organizations will be buried in it.


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.


© 2015 by Sean O'Shea
