
In last night's tip we discussed how electronic files have distinctive file headers which can be used to identify the format of a file if an extension is missing. Electronic files also have footers, or trailers, which are specific to particular formats. While these footers are not referenced in the Wikipedia index linked to in last night's tip, you can find some of them in a hex / ASCII signature index posted to the site of Gary Kessler, a forensic and information security consultant.

So if we return to the file examined in last night's tip and open it in a hex editor, we can see that it ends with the bytes FF D9. In the second column, which shows the same bytes decoded as text, we can see the last characters are ÿÙ.

If we consult Kessler's index, we find this entry for a JPEG file. The values match up.
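The same check can be scripted. Below is a minimal sketch, assuming the only trailer we care about is the FF D9 pair mentioned above; the function names `read_tail` and `has_jpeg_trailer` are made up for this illustration. Keep in mind that some files carry extra bytes after the trailer, so a failed match does not prove a file is not a JPEG.

```python
import os

# Trailer bytes discussed above (FF D9); this constant name is illustrative.
JPEG_TRAILER = bytes.fromhex("FFD9")


def read_tail(path, count=2):
    """Return the last `count` bytes of the file at `path`."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)        # jump to the end to learn the size
        size = f.tell()
        f.seek(max(size - count, 0))  # step back, but never before byte 0
        return f.read(count)


def has_jpeg_trailer(path):
    """True if the file ends with the JPEG trailer bytes FF D9."""
    return read_tail(path, len(JPEG_TRAILER)) == JPEG_TRAILER
```

Calling `has_jpeg_trailer` on the file from last night's tip should return `True`, since it ends in FF D9.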


If you run into a file without an extension, there are a couple of ways to determine what format the file is actually in. A hex editor is a tool for examining the binary data that makes up an electronic file - it lets you literally edit the bytes that comprise a computer file. A free hex editor, HxD, can be downloaded here: https://mh-nexus.de/en/hxd/ . The bytes at the beginning of a file, referred to as its magic number, can pinpoint the format of a file. The magic number appears in the 'file header' of an electronic file. Wikipedia has a list of file signatures which shows the bytes used at the beginning of a wide range of commonly used file formats. See https://en.wikipedia.org/wiki/List_of_file_signatures and refer to the first column, 'Hex signature'.
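To give a sense of what a hex editor displays, here is a rough sketch that prints bytes the same general way: an offset, the bytes as two-character hex pairs, and a text column. It is not how HxD itself is implemented; the function name `hex_dump` and the choice to decode the text column as ISO-8859-1 are assumptions made for this example.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex pairs, and a text column, one row per `width` bytes."""
    pad = width * 3 - 1  # room for `width` hex pairs separated by spaces
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable characters; replace control bytes with a dot.
        text_part = "".join(
            chr(b) if 32 <= b < 127 or b >= 160 else "." for b in chunk
        )
        lines.append(f"{offset:08X}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)
```

For example, `hex_dump(open("unknown_file", "rb").read(64))` would list the first 64 bytes of a file, where `unknown_file` stands in for whatever path you are examining.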

Let's see how this first approach works in practice.

1. As shown in the figure below, I have a small file which is saved with no extension.

2. When this file is opened in HxD, we can see that the first line of data is a series of two-character pairs of letters and numbers. Each pair is one byte written in hexadecimal.

3. Copy the first four or five pairs and search for them on the Wikipedia index. You'll see that the pairs for this file, 'FF D8 FF E0', pull up this entry on the index.

4. The unknown file is a JPEG file. We simply need to add a 'jpg' extension to open the file.
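The lookup in the steps above can be sketched in a few lines of code. This is only an illustration: the `SIGNATURES` table holds three hand-picked entries (JPEG, PNG and PDF) rather than the full Wikipedia list, and the names `guess_extension` and `guess_file_extension` are hypothetical.

```python
# A tiny, hand-picked signature table; a real index has many more entries.
SIGNATURES = {
    bytes.fromhex("FFD8FF"): "jpg",            # JPEG files begin FF D8 FF
    bytes.fromhex("89504E470D0A1A0A"): "png",  # PNG
    bytes.fromhex("255044462D"): "pdf",        # "%PDF-"
}


def guess_extension(data: bytes):
    """Return a likely extension for the leading bytes, or None if unknown."""
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return ext
    return None


def guess_file_extension(path):
    """Read the first 16 bytes of a file and look them up in the table."""
    with open(path, "rb") as f:
        return guess_extension(f.read(16))
```

For the file in this tip, whose first bytes are FF D8 FF E0, `guess_extension` returns `"jpg"`.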

Another approach is to review the unknown file in a common text editor such as Notepad. The Wikipedia index includes a column entitled 'ISO-8859-1'. This column shows how the headers of common file formats appear when interpreted in this widely used text encoding, an 8-bit extension of ASCII.

When we open the unknown file in Notepad, we want to focus on the first several characters, ÿØÿà. These match those listed on the index for JPEG files.
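To see why those characters show up, we can decode the header bytes ourselves. The sketch below assumes the text editor is treating each byte as one ISO-8859-1 character; the helper name `as_latin1` is invented for this example.

```python
def as_latin1(data: bytes) -> str:
    """Decode bytes as ISO-8859-1, where every byte maps to exactly one character."""
    return data.decode("iso-8859-1")


header = bytes.fromhex("FFD8FFE0")  # the JPEG header bytes from the hex editor
print(as_latin1(header))            # ÿØÿà
```

An editor that assumes a different encoding may display other characters for the same bytes, so the hex values remain the more reliable thing to compare.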

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco. He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.



© 2015 by Sean O'Shea
