
PowerShell script to count words, lines, and characters in multiple PDFs


Laxman Singh has posted a PowerShell script here that counts the number of lines, words, and characters in multiple PDF files.

1. Open PowerShell ISE and use the cd (change directory) command to move to the folder containing the PDFs you want to review.

2. Enter the script below, and then press Enter.

dir -Include *.pdf -Recurse | % { $_ | select name, @{n="characters";e={ get-content $_ | measure-object -character | select -expa characters } } , @{n="words";e={ get-content $_ | measure-object -word | select -expa words } } , @{n="lines";e={ get-content $_ | measure-object -line | select -expa lines } } } | ft -AutoSize

3. A table will be generated listing the total character, word and line count for each PDF file.

This script is a quick way to flag possible near-duplicate PDF files: files whose character, word, and line counts match or nearly match are worth a closer look.
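To make that comparison easier, the counts can be collected once per file and then grouped. The following is only a sketch, not part of Laxman Singh's script: the variable names $stats and $m are illustrative, and it assumes PowerShell 3.0 or later (for the -File switch and [pscustomobject]). Like the original, it uses Get-Content, which reads the raw contents of each file rather than the text as rendered on the page.

```powershell
# Sketch: measure each PDF once, then list files that share identical counts.
# $stats and $m are illustrative names, not taken from the original script.
$stats = Get-ChildItem -Filter *.pdf -Recurse -File | ForEach-Object {
    # One pass per file: characters, words, and lines together.
    $m = Get-Content -LiteralPath $_.FullName |
        Measure-Object -Character -Word -Line
    [pscustomobject]@{
        Name       = $_.Name
        Characters = $m.Characters
        Words      = $m.Words
        Lines      = $m.Lines
    }
}

# Files whose three counts all match are candidates for a closer look.
$stats |
    Group-Object Characters, Words, Lines |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { $_.Group | Format-Table -AutoSize }
```

Each table printed by the last step holds one group of files with identical counts. Files that differ only slightly will not be grouped this way, so sorting $stats by Characters and scanning for close values may also help.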


Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.


The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.




© 2015 by Sean O'Shea.
