TIFF: IFD and SubIFD
TIFF is a popular file format for professional images. One of the key features is being able to include multiple images within a single file. That does make it a little more complicated to write libraries for image processing, though. While working on some custom image services, I did a deep-dive into the TIFF specifications and practical usage of the file format. After quite some research and exploration, I've tried to summarize my learnings about TIFF and its multi-image features in this post.
- Image File Directory — the internal structure which supports multiple image
- Tags — how metadata is used to describe content and images
- SubIFD — another method for organizing and relating IFDs to one another
- Real-world Usage — examples of multi-image, pyramid, and industry-specific TIFF files
- General Notes — a few brief takeaways about TIFF subfiles
- Resources — links to official standards and related documentation
Image File Directory
Every TIFF file contains at least one Image File Directory (IFD) — a section within the file to hold the bytes of an encoded image along with its metadata. But, to be clear with the name, it's not a separate file nor directory on the filesystem. An IFD is also commonly called a subfile which is probably a better way to think about it.
I'll be using conceptual diagrams in this post to help visualize some of the topics. To start, a single-image TIFF file has a very basic structure with a single IFD.
@startuml
file page.tif {
folder IFD0 #ffffff {
label "Metadata (<&tag>)"
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
}
@enduml
Diagram of a basic TIFF file with a single image.
Some industries and use cases require multiple images within a single file. Since an IFD is a container to hold all the data needed for an individual image, an application can basically just write additional IFDs into a TIFF file.
There are a couple ways IFDs can be used. As the main images of a file, an extra piece of metadata, Next IFD, is included which tells applications where it can find the next IFD to read. This sequence of IFDs is called the main-IFD chain, and they are implicitly numbered based on the ordering, starting from zero.
@startuml
file multipage.tif {
folder IFD0 #ffffff {
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> Next IFD" as IFD1_Tag_NextIFD
rectangle "<&image> //Image Bytes//" as IFD1_Data
}
IFD0_Tag_NextIFD ..> IFD1
folder IFD2 #ffffff {
rectangle "<&image> //Image Bytes//" as IFD2_Data
}
IFD1_Tag_NextIFD ..> IFD2
}
@enduml
Diagram of a multi-image TIFF file.
By following the main-IFD chain (highlighted in white on these diagrams), applications can quickly seek through a TIFF file to whichever image(s) it needs. Externally, if a caller wants to reference an image from this main-IFD chain, it will reference the IFD based on its sequence number. For example, the first and default image of a TIFF file is always IFD0, and the third may be referenced as IFD2.
It's worth noting that not all applications support TIFF files with multiple images (it is considered an optional feature of the specifications). In these cases, an application will typically stop reading the file after it reads the first image (IFD0).
Tags
A piece of metadata, such as Next IFD, is officially named a tag. These tags are recorded within the header of an IFD and act as a kind of key/values lookup to further describe the IFD. Official TIFF specifications define common tags which range in purpose from low-level details (such as dimensions and byte offsets) to user-facing properties (such as copyrights and descriptions). Below are some examples referenced in this post.
- Next IFD, mentioned previously, is what links an IFD with a subsequent IFD. Without the tag, applications will generally assume the IFD is the last in a file and stop processing. The value is the byte offset for the start of the target IFD within the file.
- Subfile Type helps classify the image with a few common markers:
- Full-resolution – indicates the image is the fullest resolution available.
- Reduced-resolution – indicates the image has been resized down from full-resolution (which, typically, can also be found in the file).
- Single page of multiple – indicates the image is one of many and there are typically other sibling IFDs within the file.
- And there are several other official and vendor-conventional values that might appear, such as Transparency mask.
- Width and Height (technically, Length) are generic tags that should always be present and describe the dimensions of an image (in pixels).
- Page Number may be present when there is a particular ordering to multiple pages. Although IFDs are ordered within the TIFF file, that ordering shouldn't be relied on for user-facing information.
- Image Description may be a user-supplied description about the subject of an image. Sometimes this is short and a literal description (e.g. company picnic), but other times this is used for application-specific metadata and may not be human-readable.
Below is a more detailed version of the multipage.tif
diagram shown earlier, but with the addition of some conventional tags.
@startuml
file multipage-detail.tif {
folder IFD0 #ffffff {
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&tag> Width\n**612**" as IFD0_Tag_Width
rectangle "<&tag> Height\n**792**" as IFD0_Tag_Height
rectangle "<&tag> Subfile Type\n**Full-resolution**\n**Single of multiple pages**" as IFD0_Tag_SubfileType
rectangle "<&tag> Page Number\n**1**" as IFD0_Tag_PageNumber
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> Next IFD" as IFD1_Tag_NextIFD
rectangle "<&tag> Width\n**612**" as IFD1_Tag_Width
rectangle "<&tag> Height\n**792**" as IFD1_Tag_Height
rectangle "<&tag> Subfile Type\n**Full-resolution**\n**Single of multiple pages**" as IFD1_Tag_SubfileType
rectangle "<&tag> Page Number\n**2**" as IFD1_Tag_PageNumber
rectangle "<&image> //Image Bytes//" as IFD1_Data
}
IFD0_Tag_NextIFD ..> IFD1
folder IFD2 #ffffff {
rectangle "<&tag> Width\n**612**" as IFD2_Tag_Width
rectangle "<&tag> Height\n**792**" as IFD2_Tag_Height
rectangle "<&tag> Subfile Type\n**Full-resolution**\n**Single of multiple pages**" as IFD2_Tag_SubfileType
rectangle "<&tag> Page Number\n**3**" as IFD2_Tag_PageNumber
rectangle "<&image> //Image Bytes//" as IFD2_Data
}
IFD1_Tag_NextIFD ..> IFD2
}
@enduml
Diagram of a multi-page TIFF file, including some tags.
SubIFD
Where the Next IFD tag creates an association to a sibling IFD for the file, the SubIFD tag creates an association from an IFD to a child IFD(s). Typically, the top-level/main IFDs are used for different but related images, while SubIFDs are used to add supporting or alternate representations of a specific image. A sequence of child IFDs is called a SubIFD chain, and they, too, are implicitly numbered based on their order within the chain, starting from zero.
Between the Next IFD and SubIFD tags, there are several internal layouts that a TIFF file may use for inter-related images. In practice, only one of these approaches is used within a given file, but, in theory, they are not mutually exclusive.
TIFF Trees
The standard approach, named TIFF Trees by the specification, uses the SubIFD tag with values pointing to one or more IFDs. In this case, an application can read the SubIFD tag once before seeking to any of the listed IFDs.
@startuml
folder IFD0 #ffffff {
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&tag> SubIFD" as IFD0_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> SubIFD" as IFD1_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD1_Data
}
IFD0_Tag_NextIFD ..> IFD1
folder IFD as IFDm0s1 {
rectangle "<&image> //Image Bytes//" as IFDm0s1_Data
}
IFD0_Tag_SubIFD ..> IFDm0s1 : 0
folder IFD as IFDm0s2 {
rectangle "<&image> //Image Bytes//" as IFDm0s2_Data
}
IFD0_Tag_SubIFD ..> IFDm0s2 : 1
folder IFD as IFDm0s3 {
rectangle "<&image> //Image Bytes//" as IFDm0s3_Data
}
IFD0_Tag_SubIFD ..> IFDm0s3 : 2
folder IFD as IFDm1s1 {
rectangle "<&image> //Image Bytes//" as IFDm1s1_Data
}
IFD1_Tag_SubIFD ..> IFDm1s1 : 0
folder IFD as IFDm1s2 {
rectangle "<&image> //Image Bytes//" as IFDm1s2_Data
}
IFD1_Tag_SubIFD ..> IFDm1s2 : 1
folder IFD as IFDm1s3 {
rectangle "<&image> //Image Bytes//" as IFDm1s3_Data
}
IFD1_Tag_SubIFD ..> IFDm1s3 : 2
@enduml
Diagram of a TIFF file with multiple IFDs, each with several SubIFDs.
Next [Sub]IFD
Another approach relies on both the SubIFD and Next IFD tags. A chain is started with a standard SubIFD tag pointing to a child IFD; but, from there, the Next IFD tag is used to continue the chain. In this case, an application would read the SubIFD tag once, and then sequentially seek through child IFDs until the Next IFD tags stop (similar to how the main-IFD chain would be processed).
@startuml
folder IFD0 #ffffff {
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&tag> SubIFD" as IFD0_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> SubIFD" as IFD1_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD1_Data
}
IFD0_Tag_NextIFD ..> IFD1
folder IFD as IFDm0s1 {
rectangle "<&tag> Next IFD" as IFDm0s1_Tag_NextIFD
rectangle "<&image> //Image Bytes//" as IFDm0s1_Data
}
IFD0_Tag_SubIFD ..> IFDm0s1
folder IFD as IFDm0s2 {
rectangle "<&tag> Next IFD" as IFDm0s2_Tag_NextIFD
rectangle "<&image> //Image Bytes//" as IFDm0s2_Data
}
IFDm0s1_Tag_NextIFD ..> IFDm0s2
folder IFD as IFDm0s3 {
rectangle "<&image> //Image Bytes//" as IFDm0s3_Data
}
IFDm0s2_Tag_NextIFD ..> IFDm0s3
folder IFD as IFDm1s1 {
rectangle "<&tag> Next IFD" as IFDm1s1_Tag_NextIFD
rectangle "<&image> //Image Bytes//" as IFDm1s1_Data
}
IFD1_Tag_SubIFD ..> IFDm1s1
folder IFD as IFDm1s2 {
rectangle "<&tag> Next IFD" as IFDm1s2_Tag_NextIFD
rectangle "<&image> //Image Bytes//" as IFDm1s2_Data
}
IFDm1s1_Tag_NextIFD ..> IFDm1s2
folder IFD as IFDm1s3 {
rectangle "<&image> //Image Bytes//" as IFDm1s3_Data
}
IFDm1s2_Tag_NextIFD ..> IFDm1s3
@enduml
Diagram of a TIFF file with multiple IFDs, each with a secondary SubIFD chain.
Common IFD
Finally, consider the following example which demonstrates how tags are simply references to IFDs within the file. In the case where two IFDs need to refer to the exact same image, such as a transparency mask, the SubIFD tags may end up referring to a single IFD.
@startuml
folder IFD0 #ffffff {
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&tag> SubIFD" as IFD0_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> SubIFD" as IFD1_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD1_Data
}
IFD0_Tag_NextIFD ..> IFD1
folder IFD as IFDm0s1 {
rectangle "<&image> //Image Bytes//" as IFDm0s1_Data
}
IFD0_Tag_SubIFD ..> IFDm0s1 : 0
IFD1_Tag_SubIFD ..> IFDm0s1 : 1
folder IFD as IFDm0s2 {
rectangle "<&image> //Image Bytes//" as IFDm0s2_Data
}
IFD0_Tag_SubIFD ..> IFDm0s2 : 1
folder IFD as IFDm1s1 {
rectangle "<&image> //Image Bytes//" as IFDm1s1_Data
}
IFD1_Tag_SubIFD ..> IFDm1s1 : 0
@enduml
Diagram of a TIFF file with multiple IFDs, SubIFDs, and a common SubIFD.
Real-world Usage
TIFF files, with their IFDs and SubIFDs, can be written and structured using a wide range of styles.
Multi-page TIFF
In multi-page TIFF files, independent but related images are encoded in a single file. For example, a camera capturing a 'burst' of images, or a document scanner digitizing the front and back of a page. The structure of the file will typically be a basic main-IFD chain.
@startuml
file multipage.tif {
folder IFD0 #ffffff {
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&tag> Page Number\n**1**" as IFD0_Tag_PageNumber
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> Page Number\n**2**" as IFD1_Tag_PageNumber
rectangle "<&image> //Image Bytes//" as IFD1_Data
}
IFD0_Tag_NextIFD --> IFD1
}
@enduml
Diagram of a multi-page TIFF file with sequential pages.
Pyramid TIFF
In a Pyramid TIFF, the first IFD is a full-resolution image with additional IFDs for smaller versions of the image. Pyramid files, in general, are used for high-performance publishing environments since the extra, downscaled images avoid slow resize operations. Pyramid-aware use the Subfile Type, Width, and Height tags to identify which IFD is most appropriate.
@startuml
file pyramid.tif {
folder IFD0 #ffffff {
rectangle "<&tag> Subfile Type\n**Full-resolution**" as IFD0_Tag_SubfileType
rectangle "<&tag> Width\n**4096**" as IFD0_Tag_Width
rectangle "<&tag> Height\n**2048**" as IFD0_Tag_Height
rectangle "<&image> //Image Bytes//" as IFD0_Data
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
}
folder IFD1 #ffffff {
rectangle "<&tag> Subfile Type\n**Reduced-resolution**" as IFD1_Tag_SubfileType
rectangle "<&tag> Width\n**2048**" as IFD1_Tag_Width
rectangle "<&tag> Height\n**1024**" as IFD1_Tag_Height
rectangle "<&image> //Image Bytes//" as IFD1_Data
rectangle "<&tag> Next IFD" as IFD1_Tag_NextIFD
}
IFD0_Tag_NextIFD --> IFD1
folder IFD2 #ffffff {
rectangle "<&tag> Subfile Type\n**Reduced-resolution**" as IFD2_Tag_SubfileType
rectangle "<&tag> Width\n**1024**" as IFD2_Tag_Width
rectangle "<&tag> Height\n**512**" as IFD2_Tag_Height
rectangle "<&image> //Image Bytes//" as IFD2_Data
}
IFD1_Tag_NextIFD --> IFD2
}
@enduml
Diagram of a Pyramid TIFF file with multiple resolutions.
Command Line Example
The following commands create and then inspect a Pyramid TIFF using a large image from NASA (29163×8162).
$
vips tiffsave PIA25915.png PIA25915-pyramid.tif --pyramid
$
tiffinfo PIA25915-pyramid.tif | grep -e '^=' -e '^-' -e 'Image Width:'
=== TIFF directory 0 ===
Image Width: 29163 Image Length: 8162
=== TIFF directory 1 ===
Image Width: 14581 Image Length: 4081
=== TIFF directory 2 ===
Image Width: 7290 Image Length: 2040
=== TIFF directory 3 ===
Image Width: 3645 Image Length: 1020
=== TIFF directory 4 ===
Image Width: 1822 Image Length: 510
=== TIFF directory 5 ===
Image Width: 911 Image Length: 255
=== TIFF directory 6 ===
Image Width: 455 Image Length: 127
=== TIFF directory 7 ===
Image Width: 227 Image Length: 63
=== TIFF directory 8 ===
Image Width: 113 Image Length: 31
For single-image files, pyramids are often written using all top-level IFDs as shown in the previous diagram. However, they can also be structured using the SubIFD tag.
@startuml
file pyramid-subifd.tif {
folder IFD0 #ffffff {
rectangle "<&tag> Subfile Type\n**Full-resolution**" as IFD0_Tag_SubfileType
rectangle "<&tag> Width\n**4096**" as IFD0_Tag_Width
rectangle "<&tag> Height\n**2048**" as IFD0_Tag_Height
rectangle "<&tag> SubIFD" as IFD0_Tag_SubIFD
rectangle "<&image> //Image Bytes//" as IFD0_Data
}
folder IFD as IFDm0s0 {
rectangle "<&tag> Subfile Type\n**Reduced-resolution**" as IFDm0s0_Tag_SubfileType
rectangle "<&tag> Width\n**2048**" as IFDm0s0_Tag_Width
rectangle "<&tag> Height\n**1024**" as IFDm0s0_Tag_Height
rectangle "<&image> //Image Bytes//" as IFDm0s0_Data
}
IFD0_Tag_SubIFD --> IFDm0s0
folder IFD as IFDm0s1 {
rectangle "<&tag> Subfile Type\n**Reduced-resolution**" as IFDm0s1_Tag_SubfileType
rectangle "<&tag> Width\n**1024**" as IFDm0s1_Tag_Width
rectangle "<&tag> Height\n**512**" as IFDm0s1_Tag_Height
rectangle "<&image> //Image Bytes//" as IFDm0s1_Data
}
IFD0_Tag_SubIFD --> IFDm0s1
}
right footer [[https://dpb587.me/ dpb.io/tiff-ifd-subifd]], [[http://creativecommons.org/licenses/by-nc/4.0/ CC BY-NC 4.0]]
@enduml
Diagram of a Pyramid TIFF file using SubIFDs for subsampled resolutions.
Command Line Example
The following commands create and then inspect a Pyramid TIFF using a large image from NASA (29163×8162). In this example, the generated TIFF uses SubIFDs for the reduced-resolution images.
$
vips tiffsave PIA25915.png PIA25915-pyrsub.tif --pyramid --subifd
$
tiffinfo PIA25915-pyrsub.tif | grep -e '^=' -e '^-' -e 'Image Width:'
=== TIFF directory 0 ===
Image Width: 29163 Image Length: 8162
--- SubIFD image descriptor tag within TIFF directory 0 with array of 8 SubIFD chains ---
--- SubIFD 0 of chain 0 at offset 0x3571630c (896623372):
Image Width: 14581 Image Length: 4081
--- SubIFD 0 of chain 1 at offset 0x381dc55e (941475166):
Image Width: 7290 Image Length: 2040
--- SubIFD 0 of chain 2 at offset 0x38cbe790 (952887184):
Image Width: 3645 Image Length: 1020
--- SubIFD 0 of chain 3 at offset 0x38f8f9d2 (955840978):
Image Width: 1822 Image Length: 510
--- SubIFD 0 of chain 4 at offset 0x3905080c (956631052):
Image Width: 911 Image Length: 255
--- SubIFD 0 of chain 5 at offset 0x3908153e (956831038):
Image Width: 455 Image Length: 127
--- SubIFD 0 of chain 6 at offset 0x3909a228 (956932648):
Image Width: 227 Image Length: 63
--- SubIFD 0 of chain 7 at offset 0x390a6f02 (956985090):
Image Width: 113 Image Length: 31
This SubIFD approach enables multi-page TIFF files to include pyramid-style, downscaled versions for each page, too.
OME-TIFF
An OME-TIFF file is a good example of a domain-specific standard built on top of TIFF standards. The Open Microscopy Environment (OME) is a consortium of groups focused on standards for microscopy data — including images. In these cases, images are captured across several dimensional planes and a single TIFF file can include all of them via IFDs. However, the standard TIFF tags are insufficient for describing microscopy metadata of the IFDs, so OME-XML data is also embedded in the TIFF file using the Image Description tag.
@startuml
file ome.tif {
folder IFD0 #ffffff {
rectangle "<&tag> Image Description" as IFD0_Tag_ImageDescription
rectangle "<&tag> Next IFD" as IFD0_Tag_NextIFD
rectangle "<&image> Image Data" as IFD0_Data
}
folder IFD1 #ffffff {
rectangle "<&tag> Next IFD" as IFD1_Tag_NextIFD
rectangle "<&image> Image Data" as IFD1_Data
}
IFD0_Tag_NextIFD ..> IFD1
folder IFD2 #ffffff {
rectangle "<&tag> Next IFD" as IFD2_Tag_NextIFD
rectangle "<&image> Image Data" as IFD2_Data
}
IFD1_Tag_NextIFD ..> IFD2
label "..." as IFDnext
IFD2_Tag_NextIFD ..> IFDnext
note as IFD0_Tag_ImageDescription_Value #ffffff
<!-- ... -->
<Pixels ...
DimensionOrder="XYZTC"
SizeZ="3" SizeT="2" SizeC="2"
>
<TiffData />
</Pixels>
end note
IFD0_Tag_ImageDescription --> IFD0_Tag_ImageDescription_Value
map "IFD Dimension Map" as IFD0_Tag_ImageDescription_Value_TiffDataMap {
**IFD0** => Z=0, T=0, C=0
**IFD1** => Z=1, T=0, C=0
**IFD2** => Z=2, T=0, C=0
**IFD3** => Z=0, T=1, C=0
**IFD4** => Z=1, T=1, C=0
**IFD5** => Z=2, T=1, C=0
**IFD6** => Z=0, T=0, C=1
**IFD7** => Z=1, T=0, C=1
**IFD8** => Z=2, T=0, C=1
**IFD9** => Z=0, T=1, C=1
**IFD10** => Z=1, T=1, C=1
**IFD11** => Z=2, T=1, C=1
}
IFD0_Tag_ImageDescription_Value ..> IFD0_Tag_ImageDescription_Value_TiffDataMap
}
@enduml
Diagram of a TIFF file using OME-TIFF conventions.
General Notes
As a reminder, IFD-related tags are simply pointers to IFD locations within a TIFF file. Aside from recommendations, there is no technical enforcement about the kind of structure a TIFF file results in. When building libraries or lower-level tools, there are a couple things to keep in mind:
- It's important to understand, follow, and test any industry-specific conventions that may be used on top of TIFF. For example, usage of main-IFD vs SubIFD chains. Otherwise, applications may not be able to correctly enumerate TIFF subfiles.
- When working with untrusted files, it's even more important to avoid assumptions about how TIFF files and their IFD chains may be structured. In theory, those chains could be crafted to cause recursive loops, out of bound reads, or unexpected traversals.
One of my initial reasons for investigating was to figure out a canonical and practical way to reference images within TIFFs. It ended up more complicated than I expected due to the potential combinations of Next IFD + SubIFD tags used for traversal. For now, I decided to stay with ifd=N,subifd=M
-style references and only support access to main IFDs and their direct, SubIFD tag-based children. It keeps things simple, matches most other image processing libraries, and I think it is inline with changes that might be needed later for more comprehensive IFD access.
Resources
If you want to learn more, I found the following links most helpful.
- Baseline TIFF specification about the file format, standard tags, and other details.
- TIFF Technical Notes specification including details about SubIFDs and TIFF Trees.
- Multi Page / Multi Image TIFF documented by the
libtiff
library used to create TIFF files.