EXIF Thumbnails and the AIR Bus Tour

I don’t know what it is about EXIF information in JPEG files that so interests me, but I find the related specifications irresistibly attractive. There’s something about being able to see “secret” information inside what most just view as an image file. In many cases, among those secrets is an actual JPEG thumbnail of the full-size image. For this exercise, I broke out my Drogan’s decoder ring and set out to extract that thumbnail.

There are a lot of little steps here, but the first of those is to make sure that you’re dealing with a JPEG, that there’s metadata in the file, and that the metadata follows the EXIF format.

// Is the file a JPG file?
if( ( stream.readUnsignedByte().toString( 16 ) +
	  stream.readUnsignedByte().toString( 16 ) ) != SOI_MARKER )
{
	return false;
}

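// Does the file contain meta data?
// Note: this assumes APP1 immediately follows SOI; files that put
// another segment first (a JFIF APP0, for example) will fail here.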
if( ( stream.readUnsignedByte().toString( 16 ) +
	  stream.readUnsignedByte().toString( 16 ) ) != APP1_MARKER )
{
	return false;
}	

app1 = new Object();
app1.start = stream.position;
app1.size = stream.readUnsignedShort();

// Does the file contain EXIF data?
if( stream.readMultiByte( 4, air.File.systemCharset ) != "Exif" )
{
	return false;
}

stream.position = stream.position + 2;

align = new Object();
align.position = stream.position;
align.type = stream.readMultiByte( 2, air.File.systemCharset );

// Determine byte alignment
if( align.type == INTEL_ALIGN )
{
	stream.endian = air.Endian.LITTLE_ENDIAN;
} else if( align.type == MOTOROLA_ALIGN ) {
	stream.endian = air.Endian.BIG_ENDIAN;
} else {
	return false;
}

stream.position = stream.position + 2;
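
The check above assumes the APP1 segment comes right after the SOI marker. A more forgiving approach is to walk the segment list until an EXIF APP1 turns up. What follows is only a minimal sketch of that idea: it works on a plain array of bytes (or a Uint8Array) rather than an AIR FileStream, and the findExifSegment name and the shape of the object it returns are my own assumptions, not part of the attached source.

```javascript
// Hypothetical helper: scan JPEG segments for an APP1 that carries EXIF data.
// Returns { start, size, tiffStart } or null if nothing suitable is found.
function findExifSegment(bytes) {
  // Every JPEG starts with the SOI marker, FF D8.
  if (bytes.length < 4 || bytes[0] !== 0xff || bytes[1] !== 0xd8) {
    return null;
  }

  var EXIF_ID = [0x45, 0x78, 0x69, 0x66, 0x00, 0x00]; // "Exif" + two nulls
  var pos = 2;

  while (pos + 4 <= bytes.length) {
    if (bytes[pos] !== 0xff) {
      return null; // lost sync with the marker structure
    }

    var marker = bytes[pos + 1];

    if (marker === 0xff) {
      pos += 1; // fill byte before a marker, skip it
      continue;
    }

    // SOS (FF DA) starts the compressed image data and EOI (FF D9) ends
    // the file; no metadata segments are expected after either.
    if (marker === 0xda || marker === 0xd9) {
      return null;
    }

    // The segment length is always big-endian and counts its own two bytes.
    var size = (bytes[pos + 2] << 8) | bytes[pos + 3];
    if (size < 2) {
      return null;
    }

    if (marker === 0xe1 && pos + 10 <= bytes.length) {
      var isExif = true;
      for (var i = 0; i < EXIF_ID.length; i++) {
        if (bytes[pos + 4 + i] !== EXIF_ID[i]) {
          isExif = false;
        }
      }
      if (isExif) {
        return {
          start: pos + 2,     // position of the length field
          size: size,         // segment length as stored in the file
          tiffStart: pos + 10 // first byte of the byte-alignment marker
        };
      }
    }

    pos += 2 + size; // marker bytes plus the segment itself
  }

  return null;
}
```

In this sketch, tiffStart plays the same role as align.position in the code above: it is the point from which the later EXIF offsets are measured.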

Once you’ve got that down, the next step is to track where you are in the byte stream and how far ahead you need to seek to reach the data you’re interested in. This is important because EXIF offsets are measured from the start of the TIFF header (the byte-alignment marker read above), not from the start of the file. The FileStream.position property, however, is always relative to the start of the file, so every offset has to be added to the saved header position.

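// Get IFD0 offset (relative to the start of the TIFF header)
ifd0 = new Object();
ifd0.offset = stream.readUnsignedInt();
ifd0.position = align.position + ifd0.offset;

// IFD0 usually follows the header directly, but seek to be safe
stream.position = ifd0.position;
entries = stream.readUnsignedShort();

// Jump to end of IFD0 (each entry is 12 bytes)
stream.position = stream.position + ( 12 * entries );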

// Get offset to IFD1 (thumbnail)
ifd1 = new Object();
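ifd1.offset = stream.readUnsignedInt();

// An offset of zero means there is no IFD1, and so no thumbnail
if( ifd1.offset == 0 )
{
	return false;
}

ifd1.position = ifd1.offset + align.position;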

stream.position = ifd1.position;
entries = stream.readUnsignedShort();
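
To make the offset arithmetic a little more concrete, here is a rough sketch of the same walk from the TIFF header to IFD1, written against a standard JavaScript DataView instead of an AIR FileStream. The locateIfd1 name and its return shape are assumptions for illustration only; the important part is that every offset read from the file is added to tiffStart before it is used as a position.

```javascript
// Hypothetical helper: follow the TIFF header to IFD0, then to IFD1.
// view is a DataView over the file, tiffStart is the absolute position
// of the byte-alignment marker ("II" or "MM").
function locateIfd1(view, tiffStart) {
  // The alignment marker reads the same in either byte order.
  var order = view.getUint16(tiffStart, false);
  var littleEndian;

  if (order === 0x4949) {        // "II", Intel
    littleEndian = true;
  } else if (order === 0x4d4d) { // "MM", Motorola
    littleEndian = false;
  } else {
    return null;
  }

  // Bytes 4..7 of the header hold the offset of IFD0, relative to tiffStart.
  var ifd0Position = tiffStart + view.getUint32(tiffStart + 4, littleEndian);
  var entries = view.getUint16(ifd0Position, littleEndian);

  // After the 2-byte entry count and the 12-byte entries comes a
  // 4-byte offset to the next IFD, again relative to tiffStart.
  var nextOffset = view.getUint32(ifd0Position + 2 + 12 * entries, littleEndian);

  if (nextOffset === 0) {
    return null; // no IFD1, so no thumbnail
  }

  return { position: tiffStart + nextOffset, littleEndian: littleEndian };
}
```

A null result here covers both an unrecognized alignment marker and a file that simply has no thumbnail directory.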

The JPEG thumbnail is described in IFD1, and there’s not much else there. The IFD includes entries for the offset of the JPEG thumbnail, the number of bytes in the JPEG, etc. This is obviously very helpful when we’re trying to extract the thumbnail image. With the offset and size information, you can tell the FileStream object to jump ahead and read the bytes into a ByteArray.

thumb = new Object();

for( var e = 0; e < entries; e++ )
{
	// Get tag (read as a short so the stream's byte order is respected)
	tag = pad( stream.readUnsignedShort().toString( 16 ), "0", 4 );

	if( tag == JPEG_OFFSET )
	{
		stream.position = stream.position + 6;

		thumb.offset = stream.readUnsignedInt();
		thumb.position = align.position + thumb.offset;
	} else if( tag == JPEG_BYTE_COUNT ) {
		stream.position = stream.position + 6;
		thumb.size = stream.readUnsignedInt();
	} else if( tag == COMPRESSION ) {
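		stream.position = stream.position + 6;
		// Compression is a SHORT: the value sits in the first two bytes
		thumb.compression = stream.readUnsignedShort();
		stream.position = stream.position + 2;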
	} else {
		stream.position = stream.position + 10;
	}
}
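
Each IFD entry is 12 bytes: a 2-byte tag, a 2-byte type, a 4-byte count, and a 4-byte value (or offset). The loop above can be sketched with a DataView as follows. This is an illustration under a few assumptions: the readIfdEntries name is made up, only SHORT and LONG values are handled, and the count is assumed to be 1, which holds for the three thumbnail tags of interest (0x0103 compression, 0x0201 JPEG offset, 0x0202 JPEG byte count).

```javascript
var TYPE_SHORT = 3; // 16-bit unsigned value
var TYPE_LONG = 4;  // 32-bit unsigned value

// Hypothetical helper: read the simple numeric entries of one IFD into
// an object keyed by tag number. ifdPosition is an absolute position.
function readIfdEntries(view, ifdPosition, littleEndian) {
  var result = {};
  var count = view.getUint16(ifdPosition, littleEndian);

  for (var i = 0; i < count; i++) {
    var entry = ifdPosition + 2 + 12 * i;
    var tag = view.getUint16(entry, littleEndian);
    var type = view.getUint16(entry + 2, littleEndian);
    // entry + 4 holds the 4-byte value count, assumed to be 1 here.

    if (type === TYPE_SHORT) {
      // A SHORT is stored in the first two bytes of the value field.
      result[tag] = view.getUint16(entry + 8, littleEndian);
    } else if (type === TYPE_LONG) {
      result[tag] = view.getUint32(entry + 8, littleEndian);
    }
    // Other types (strings, rationals, and so on) are skipped.
  }

  return result;
}
```

Reading the tag as a 16-bit number, rather than comparing hex strings, keeps the comparison independent of the byte order used by the camera.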

It’s interesting that the thumbnail is itself already encoded in JPEG format. Once you’ve extracted the bytes into a ByteArray, there’s no further processing that needs to take place. We can simply write the bytes to disk using File and FileStream objects. Specifically, you’ll want to use the FileStream.writeBytes() method, which takes a ByteArray as an argument.

jpg = new air.ByteArray();

// Move to offset and extract JPG
stream.position = thumb.position;
stream.readBytes( jpg, 0, thumb.size );

// Write the thumbnail file to disk
oimg = air.File.applicationResourceDirectory.resolve( "thumbs" +
	air.File.separator + files[index].name );
ostream = new air.FileStream();
ostream.open( oimg, air.FileMode.WRITE );
// A length of zero writes the entire buffer, starting at the offset
ostream.writeBytes( jpg, 0, 0 );
ostream.close();
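
Since the offset and size come straight out of the file, it can be worth checking them before trusting them. The following is a small sketch of that idea using a Uint8Array in place of an AIR ByteArray; the sliceThumbnail name is hypothetical, and the checks shown (bounds, plus a leading SOI marker) are just the ones that seemed most useful to me.

```javascript
// Hypothetical helper: copy the thumbnail bytes out of the file, or
// return null if the recorded offset and size do not look sane.
// bytes is a Uint8Array holding the whole file.
function sliceThumbnail(bytes, tiffStart, offset, size) {
  var start = tiffStart + offset; // EXIF offsets are relative to the TIFF header
  var end = start + size;

  if (size < 4 || start < 0 || end > bytes.length) {
    return null; // would read outside the file
  }

  // The thumbnail is itself a JPEG, so it should begin with SOI (FF D8).
  if (bytes[start] !== 0xff || bytes[start + 1] !== 0xd8) {
    return null;
  }

  return bytes.slice(start, end); // a copy, independent of the source
}
```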

You might be wondering about the practical application of such an endeavor. Imagine you wanted to create an image viewing application. You could start trying to load and scale all those great images from your fancy new seven-megapixel camera, but you’re going to hit a wall pretty quickly. In this example, random file access, coupled with asynchronous IO, delivers solid performance for a large number of images.

I gave this code to James Douma at Nitobi the other day and he rolled it into a real estate AIR application on which he’s been working. As I understand it, Andre Charland will actually be showing this application off a little on the onAIR Bus Tour, so be sure to check it out! If you do intend to stop by, please take a moment to register so we know how many to expect.

The full code for this example is attached, but at this point, I’m getting pretty bored with only extracting information. I think I’ll move on to trying to add information to the file next. Specifically, I’m thinking about geocode/GPS information crossed with Yahoo! Maps for the onAIR Bus Tour. Sound like an interesting project? Stop by one of the venues, and give me your input!

8 Responses to “EXIF Thumbnails and the AIR Bus Tour”

  1. Nilesh Says:

    Wow, AIR can do that?! Great to know and learn.

    Yet I wonder whether this works with non-JPEG images?

    for example tiff or psd formats

    would AIR handle tiff images without issues?

  2. Kevin Hoyt Says:

    This code is specifically for EXIF data, which is included by default in the JPEG files created by most digital cameras. I don’t know what (if any) other formats hold this additional data.

  3. Jauder Ho Says:

    It’s also applicable to various RAW file formats (my interest is CR2). The other major chunk of interesting metadata would be IPTC.

    One thing that would be nice to do, now that I’ve started geotagging, is to fuzzy fill IPTC location information such as city and state, using GPS EXIF as a starting point. This would be a great help in streamlining my workflow.

  4. Chris Cavanagh Says:

    Hi Kevin,

    I gave your code a try in Flex 3. I can grab thumbnails from a couple images, but for most it doesn’t seem to find anything. I’ve tried the same images with JHEAD and it manages to pull a thumbnail out… Any idea what needs tweaking in the code? I’m wondering if there’s another tag we need to process (maybe some weird indirection in the EXIF data).

    Basically the second time it sets ‘entries’, it’s zero. I tried temporarily commenting out the ‘ifd1’ stuff (to force it to loop all the entries - there were 10) but it didn’t improve things.

    Any help would be greatly appreciated :o)

  5. Todd Hamilton Says:

    I’m having no luck either getting this to work with the most common JPEG files that come from most digital cameras (JFIF).

    Is this a dead thread or any updates or other ideas if this has been done successfully somewhere?

  6. Kevin Hoyt Says:

    Hey all,

    I’ve made some slight improvements to the source, but nothing substantial enough to release. The source still works with all my Canon DSLR images. I would really like to port an existing library over to AS3, but just don’t have the time. Hopefully this proof of concept shows that it is possible, and the community will pick up the project. For the time being, I’m afraid it has been shelved.

    Thanks for your interest,
    Kevin

  7. Andrey Says:

    Kevin, it seems that you assumed here that APP1 immediately follows the SOI marker. Since you never know the order of APP markers, this may work for some images and fail for all others.

  8. Kevin Hoyt Says:

    Andrey,

    Good catch!

    Thanks,
    Kevin
