There has been a lot of discussion lately around image file formats, so I am repurposing an article I wrote that was published in the January/February 2006 issue of AIIM E-DOC Magazine.
In this post we will discuss image file formats in general and home in on TIFF specifically. A subsequent post will deal with other image file formats. Both posts approach the topic from a standards perspective.
Image file format standards have helped drive the widespread adoption of imaging technology. An image file format provides a standardized method of organizing and storing image data. A scanned document or image consists of picture elements, or pixels, that represent the brightness and color of the information on the page. There are numerous graphic and image file formats; TIFF, JPEG and PDF are three of the standards used in document imaging. (JPEG and PDF will be dealt with in the next post.)
Let's start by discussing compression. ANSI/AIIM TR 33, Selecting an Appropriate Image Compression Method to Match User Requirements, provides an explanation of compression algorithms and useful information for selecting the best compression algorithm for your application. It is important to note that no single compression method suits all scanned documents; in choosing the best method, one must consider the type of document that will be scanned. A compression scheme is the method used to reduce the amount of data needed to store or transmit a representation of an image. Compression is lossless when the data is compressed by efficient coding of the information in the image and the reconstructed image is identical to the original. Compression is lossy when images are compressed by selectively removing information from the image. This does not mean that words, phrases or sentences are removed. Rather, complex algorithms remove statistically redundant information as well as perceptually irrelevant or unimportant information, leaving only the useful information.
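To make the lossless versus lossy distinction concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the pixel row is made up, zlib stands in for a lossless coder (it is not one of the methods discussed in TR 33), and the `quantize` helper is a hypothetical, deliberately crude way of "removing information" that real lossy methods do far more cleverly.

```python
import zlib

# A made-up scanline of 8-bit grayscale pixels: white paper with a few dark marks.
row = bytes([255] * 200 + [0, 16, 32, 48] * 5 + [255] * 200)

# Lossless: compress, decompress, and get back exactly the same bytes.
packed = zlib.compress(row)
restored = zlib.decompress(packed)
lossless_ok = restored == row

# Lossy (crude illustration): throw away fine brightness detail by
# rounding every pixel down to one of 16 gray levels before compressing.
def quantize(pixels, step=16):
    return bytes((p // step) * step for p in pixels)

lossy_packed = zlib.compress(quantize(row))
lossy_restored = zlib.decompress(lossy_packed)

# The lossy result is close to the original but no longer identical.
largest_error = max(abs(a - b) for a, b in zip(row, lossy_restored))

print("lossless round trip identical:", lossless_ok)
print("lossy round trip identical:   ", lossy_restored == row)
print("largest pixel error after lossy step:", largest_error)
```

The lossless path returns every pixel unchanged, while the lossy path returns pixels that are slightly off. The information discarded in the quantization step cannot be recovered from the stored data.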
TIFF (Tagged Image File Format) is a file format used mainly for storing raster images, including photographs and line art, and is largely credited with founding the imaging industry. Aldus (later acquired by Adobe Systems, Inc.) is credited with developing TIFF for use with PostScript printing. Along with JPEG, it is now widely used for images. TIFF's primary goal is to provide a rich environment within which applications can exchange image data. This richness is required to take advantage of the varying capabilities of scanners and other imaging devices.
TIFF uses tags to handle multiple images and data in a single file. These tags describe the size of the image, define how the image data is arranged, and identify the compression algorithm, if any, that is used. Images created using TIFF can be used for archiving purposes because TIFF is generally used as a lossless format, which means that the file may be edited and saved again without losing any image quality.
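The tag mechanism can be sketched in a few lines of Python using only the standard `struct` module. This is a simplified illustration, not a full TIFF reader: it builds a tiny two-page tag structure in memory (with no actual pixel data, so it is not a viewable file) and then walks it. The helper names `build_ifd` and `read_ifds` are hypothetical, and the reader assumes every tag holds a single SHORT or LONG value stored inline. The tag numbers (256 ImageWidth, 257 ImageLength, 258 BitsPerSample, 259 Compression) and the compression code 4 for CCITT Group 4 come from the TIFF 6.0 specification.

```python
import struct

TAG_NAMES = {256: "ImageWidth", 257: "ImageLength",
             258: "BitsPerSample", 259: "Compression"}
COMPRESSION = {1: "none", 4: "CCITT Group 4", 5: "LZW", 7: "JPEG"}
SHORT, LONG = 3, 4  # TIFF field types

def build_ifd(entries, next_offset):
    """Pack (tag, type, value) triples into one little-endian tag directory (IFD)."""
    body = struct.pack("<H", len(entries))            # number of tags
    for tag, typ, value in entries:
        if typ == SHORT:                              # 2-byte value, padded to 4
            body += struct.pack("<HHIHH", tag, typ, 1, value, 0)
        else:                                         # LONG: 4-byte value
            body += struct.pack("<HHII", tag, typ, 1, value)
    return body + struct.pack("<I", next_offset)      # link to next page, 0 = last

def read_ifds(data):
    """Return one dict of tags per page by following the chain of IFDs."""
    endian = {b"II": "<", b"MM": ">"}[data[:2]]       # byte-order mark
    magic, offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a TIFF header")
    pages = []
    while offset:
        (count,) = struct.unpack_from(endian + "H", data, offset)
        tags = {}
        for i in range(count):
            pos = offset + 2 + 12 * i                 # each entry is 12 bytes
            tag, typ, _n = struct.unpack_from(endian + "HHI", data, pos)
            fmt = "H" if typ == SHORT else "I"
            (value,) = struct.unpack_from(endian + fmt, data, pos + 8)
            tags[TAG_NAMES.get(tag, tag)] = value
        pages.append(tags)
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return pages

# Two identical pages: 2550 x 3300 pixels, 1 bit per pixel, Group 4 compression.
page_tags = [(256, LONG, 2550), (257, LONG, 3300), (258, SHORT, 1), (259, SHORT, 4)]
ifd_size = 2 + 12 * len(page_tags) + 4
first, second = 8, 8 + ifd_size
data = (struct.pack("<2sHI", b"II", 42, first)
        + build_ifd(page_tags, second) + build_ifd(page_tags, 0))

pages = read_ifds(data)
for number, tags in enumerate(pages, start=1):
    print(number, tags["ImageWidth"], tags["ImageLength"],
          COMPRESSION[tags["Compression"]])
```

Each page carries its own set of tags, and each tag directory ends with a pointer to the next one. That chain is how one file can hold several images.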
In document management, TIFF is commonly used in conjunction with CCITT Group IV compression, which is also used in facsimile technology. Black and white documents are typically captured using TIFF; however, color may also be used. In large volume applications, documents are typically scanned in black and white rather than color or grayscale to keep file sizes down. Because TIFF supports multiple pages, a multi-page document can be scanned to a single file rather than to an individual file for each page.
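A quick back-of-the-envelope calculation shows why scan mode matters so much for file size. The sketch below assumes a letter-size page (8.5 x 11 inches) scanned at 300 dpi and computes the raw, uncompressed image size for each mode; the page size, resolution and the `uncompressed_bytes` helper are illustrative assumptions, and real files will be smaller once compression is applied.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw image size in bytes, with each pixel row padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px

# 1 bit per pixel for black and white, 8 for grayscale, 24 for RGB color.
for name, bits in [("black and white", 1), ("grayscale", 8), ("color", 24)]:
    size = uncompressed_bytes(8.5, 11, 300, bits)
    print(f"{name:16s} {size:>12,d} bytes")
```

Under these assumptions a single page is roughly 1 MB in black and white, 8.4 MB in grayscale and 25 MB in color before compression, which adds up quickly across a large volume of documents.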
There are variations of the TIFF specification, which was developed by Aldus and is now controlled by Adobe Systems Inc. TIFF is a viable file format. As you think about file formats, the question really becomes which file formats will be around in the years to come. Today's file formats are PDF, TIFF, XML, etc. Will they be here in 50 years? What new one will be 'The File Format' that everyone will be talking about?
The next blog post in this mini-series will deal with JPEG and PDF and the characteristics of sustainable file formats.