Using JPEG Exif information as a starting point to play with binary data

Hello,
I'm Mandai, the Wild Team member of the development team.
The other day, I ran into a strange phenomenon where images taken on my smartphone were displayed in the wrong orientation, so I did some research on Exif information. In the process, I managed to read the Exif information directly from the binary data, so I would like to explain how Exif information is structured.
I also hope that this walkthrough helps people realize that binary data is quite easy to work with.
What is used for Exif analysis?
Everything needed to analyze Exif is available on the Internet.
You can read Exif information with the following three items: the specification, a binary viewer, and a sample image.
First, I looked up the specification.
The materials published by the Japan Electronics and Information Technology Industries Association should have been all I needed, but the document was quite difficult to understand, so I used Excel to check each point one by one.
To view the binary data, I used software called Stirling.
It appears that maintenance for this software has already ended, but it is so well-made that I have yet to come across a better tool in this field.
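In a Linux environment, there is a command called od that displays binary data, so you can see something similar by using od -x [path]. Note that od -x groups the bytes into 16-bit words, so on a little-endian machine each pair of bytes appears swapped; od -t x1 [path] prints one byte at a time in file order.
The dump scrolls by as one long listing, so you will need to use the head, tail, and grep commands as appropriate to display only the parts you want to see.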
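For readers who have neither od nor a binary editor at hand, a few lines of Python can produce a comparable byte-by-byte dump. This is only a minimal sketch: `hex_dump` and its parameters are illustrative names, not part of any library, and the sample bytes are the opening bytes of the JPEG file examined in the next section.

```python
def hex_dump(data: bytes, start: int = 0, length: int = 64, width: int = 16) -> str:
    """Return `length` bytes of `data`, starting at `start`, as hex dump lines."""
    lines = []
    chunk = data[start:start + length]
    for i in range(0, len(chunk), width):
        row = chunk[i:i + width]
        hex_part = " ".join(f"{b:02X}" for b in row)
        # Left column: address of the first byte in the row, in hexadecimal.
        lines.append(f"{start + i:08X}  {hex_part}")
    return "\n".join(lines)


# For a real file you would read the bytes with open(path, "rb").read().
sample = bytes.fromhex("FF D8 FF E1 6C 0F 45 78 69 66 00 00 49 49")
print(hex_dump(sample))
```

Passing `start` and `length` plays the same role as trimming od output with head and tail.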
Image formats that use Exif include JPEG, TIFF, and JPEG XR.
Since Exif is commonly embedded in these files, samples are readily available.
Let's load a random photo taken with a smartphone.
Deciphering binary information
Binary data is not text; it is a sequence of bytes laid out according to the format indicated by the file extension. A JPEG file therefore follows the JPEG data format from its very first byte.
I extracted the first part of a JPEG file I had on hand:
FF D8 FF E1 6C 0F 45 78 69 66 00 00 49 49
Every JPEG file begins with two bytes of data: "FF D8."
This is a special marker known in the JPEG world as SOI (Start of Image), and the area that follows it stores image metadata.
Metadata is data stored in addition to image data, including Exif information.
This data is divided into required items and other data that can be added independently by the device that took the photo or the application that edited it.
The next data is "FF E1", the marker that starts a metadata segment called the APP1 application segment.
Because this segment is variable-length, the following two bytes store its data length.
The data length is 0x6C0F bytes. JPEG segment lengths are always stored big endian and include the two length bytes themselves, so the APP1 segment occupies addresses 0x0004 through 0x6C12, and the next segment starts at address 0x6C13.
The next data is "45 78 69 66 00 00", the ASCII string "Exif" followed by two NULL bytes, which indicates whether or not there is Exif information.
This time, the data is present.
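The checks described so far can be sketched in Python using the 14 sample bytes shown earlier. The function name `parse_app1_header` and the returned keys are illustrative, and the sketch assumes that APP1 is the first segment after SOI, which is true for this file but not guaranteed for every JPEG.

```python
import struct

header = bytes.fromhex("FF D8 FF E1 6C 0F 45 78 69 66 00 00 49 49")


def parse_app1_header(data: bytes) -> dict:
    """Check SOI and APP1, then read the segment length and Exif identifier."""
    if data[0:2] != b"\xFF\xD8":
        raise ValueError("not a JPEG: SOI marker missing")
    if data[2:4] != b"\xFF\xE1":
        # Assumption: APP1 comes first. Other files may start with APP0 etc.
        raise ValueError("first segment is not APP1")
    # Segment lengths are big endian and count the two length bytes themselves.
    (length,) = struct.unpack(">H", data[4:6])
    return {
        "length": length,
        "next_segment": 4 + length,  # address of the marker that follows APP1
        "has_exif": data[6:12] == b"Exif\x00\x00",
        "tiff_start": 12,            # the TIFF header follows the identifier
    }


info = parse_app1_header(header)
print(hex(info["length"]), hex(info["next_segment"]), info["has_exif"])
# prints: 0x6c0f 0x6c13 True
```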
This is all the information related to the APP1 segment as a whole.
The information from here on is supplementary information, none of which is required.
Let's extract the data that follows:
49 49 2A 00 08 00 00 00
First, there's the TIFF header, which indicates the byte order, TIFF version, and offset information to the start of the 0th IFD.
In this case, the byte order is "49 49," indicating that the data is arranged in little endian.
For big endian, it would be "4D 4D."
From here on, the byte order of every multi-byte value depends on whether the file is little endian or big endian. Photos taken with Apple products are typically big endian, so be careful.
Wikipedia provides a detailed explanation of byte order.
The position of the byte-order mark is fairly important, because it serves as the reference point (offset 0) for all the data addresses that follow.
Next is the TIFF version, "2A 00" (0x002A), followed by "08 00 00 00", the offset to the 0th IFD: it starts 8 bytes from the beginning of the TIFF header.
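As a rough illustration of how the byte-order mark changes the way the rest of the header is read, the following Python sketch parses the eight TIFF header bytes. The name `parse_tiff_header` is illustrative, and the big-endian sample is constructed by hand for comparison rather than taken from a real file.

```python
import struct


def parse_tiff_header(data: bytes):
    """Return (struct byte-order prefix, offset of the 0th IFD)."""
    order = data[0:2]
    if order == b"II":      # 49 49: little endian
        prefix = "<"
    elif order == b"MM":    # 4D 4D: big endian
        prefix = ">"
    else:
        raise ValueError("unknown byte-order mark")
    # 2-byte TIFF version followed by the 4-byte offset to the 0th IFD.
    version, ifd0_offset = struct.unpack(prefix + "HI", data[2:8])
    if version != 0x002A:
        raise ValueError("unexpected TIFF version")
    return prefix, ifd0_offset


little = bytes.fromhex("49 49 2A 00 08 00 00 00")  # from the sample photo
big = bytes.fromhex("4D 4D 00 2A 00 00 00 08")     # hand-made big-endian equivalent
print(parse_tiff_header(little))  # ('<', 8)
print(parse_tiff_header(big))     # ('>', 8)
```

Both headers describe the same thing; only the order of the bytes inside each field differs.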
Next comes the 0th IFD. IFD stands for Image File Directory and represents a collection of additional information.
It can be thought of as a collection of several tags within a single IFD.
IFDs are divided according to the type of data.
The data in the IFD is divided into tags, each of which has a fixed length of 12 bytes.
The data structure of the IFD is as follows:
| 0th IFD field | Size |
| Number of tags | 2 bytes |
| Tag 1 | 12 bytes |
| Tag 2 | 12 bytes |
| Tag 3 | 12 bytes |
| ・・・ | |
| Tag N | 12 bytes |
| Offset value to 1st IFD | 4 bytes |
| Value in 0th IFD | variable |
- "Number of tags" indicates how many tags are in the 0th IFD
- "Offset value to 1st IFD" is the offset value to the address where the following IFD (1st IFD for 0th IFD, 2nd IFD for 1st IFD) starts. (If this is 0, it means there is no next IFD.)
- "Value in 0th IFD" contains data that does not fit within the number of bytes allocated to each tag. (The structure will be explained later.)
The structure of each tag is as follows:
| Tag | Field | Size |
| Tag 1 | Tag number | 2 bytes |
| | Type | 2 bytes |
| | Count | 4 bytes |
| | Offset to the value | 4 bytes |
| Tag 2 | Tag number | 2 bytes |
| | Type | 2 bytes |
| | Count | 4 bytes |
| | Offset to the value | 4 bytes |
- There are so many different "tag numbers" that it's impossible to list them all here. You could look every one of them up in the specification, but since that is a lot of work, I'll introduce the ones related to reading Exif data as they come up.
- "Type" is the data type. It's familiar to anyone who has experience programming in a typed language. There are only eight types, which are introduced below.
- "Count" indicates how many pieces of data are contained in the tag.
- "Offset to the value": if the data is larger than 4 bytes, it does not fit in this field, so it is stored in a separate variable-length area and this field holds the offset to that location. As an exception, data that is 4 bytes or less is stored directly in this field in place of an offset.
I have listed the types below.
The counting method differs depending on the type, so I have listed them here as well.
| Type value | Type | Explanation | How to count |
| 1 | BYTE | 8-bit unsigned integer | The value 5 is 1 count |
| 2 | ASCII | A string. The number of characters includes the terminating NULL | "BEYOND" is "B", "E", "Y", "O", "N", "D", "\0" → 7 counts |
| 3 | SHORT | 16-bit (2-byte) unsigned integer | The value 5 is 1 count |
| 4 | LONG | 32-bit (4-byte) unsigned integer | The value 5 is 1 count |
| 5 | RATIONAL | Two LONGs: the first LONG is the numerator and the second LONG is the denominator | 5/4 is 1 count |
| 7 | UNDEFINED | 8-bit data that can take any value depending on the field definition | 0xFF, 0x01, 0x45, 0x11, 0xDD → 5 counts (8 bits = 1 count) |
| 9 | SLONG | 32-bit (4-byte) signed integer | The value 5 is 1 count |
| 10 | SRATIONAL | Two SLONGs: the first SLONG is the numerator and the second SLONG is the denominator | 5/4 is 1 count |
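The count only becomes a byte length once it is multiplied by the size of the type, and that length decides whether a value fits in the 4-byte field of the tag. Here is a small Python sketch of that calculation; `TYPE_SIZES`, `value_size`, and `fits_in_tag` are illustrative names.

```python
# Size in bytes of one "count" for each type value in the table above.
TYPE_SIZES = {
    1: 1,   # BYTE
    2: 1,   # ASCII
    3: 2,   # SHORT
    4: 4,   # LONG
    5: 8,   # RATIONAL (two LONGs)
    7: 1,   # UNDEFINED
    9: 4,   # SLONG
    10: 8,  # SRATIONAL (two SLONGs)
}


def value_size(type_id: int, count: int) -> int:
    """Total number of bytes occupied by the value of a tag."""
    return TYPE_SIZES[type_id] * count


def fits_in_tag(type_id: int, count: int) -> bool:
    """True if the value is stored in the 4-byte field instead of an offset."""
    return value_size(type_id, count) <= 4


print(value_size(2, 7), fits_in_tag(2, 7))  # ASCII "BEYOND\0": 7 False
print(value_size(3, 1), fits_in_tag(3, 1))  # one SHORT: 2 True
print(value_size(5, 1), fits_in_tag(5, 1))  # one RATIONAL: 8 False
```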
Just looking at the table won't give you an idea, so let's take a look at the data one by one
Read the IFD
I extracted the 0th IFD part from the actual photo data
0B 00 0F 01 02 00 05 00 00 00 92 00 00 00 10 01 02 00 07 00 00 00 98 00 00 00 12 01 03 00 01 00 00 00 01 00 00 00 1A 01 05 00 01 00 00 00 A0 00 00 00 1B 01 05 00 01 00 00 00 A8 00 00 00 28 01 03 00 01 00 00 00 02 00 00 00 31 01 02 00 14 00 00 00 B0 00 00 00 32 01 02 00 14 00 00 00 C4 00 00 00 13 02 03 00 01 00 00 00 01 00 00 00 69 87 04 00 01 00 00 00 D8 00 00 00 25 88 04 00 01 00 00 00 2C 52 00 00 32 52 00 00
Using the IFD structure table above as a reference, first extract the first two bytes.
The "0B 00" part indicates how many tags are contained in the 0th IFD.
The byte order of this data is little endian, so when read by a human, it becomes 0x000B, which is 11.
This means that the 0th IFD contains 11 tags.
One tag is a block of 12 bytes, so the tag data is 12*11=132 bytes.
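The same arithmetic gives the position of every part of the IFD. The Python sketch below assumes the 0th IFD starts at offset 8 from the TIFF header, as stated in the header bytes "08 00 00 00"; `ifd_layout` and its keys are illustrative names.

```python
def ifd_layout(ifd_offset: int, tag_count: int) -> dict:
    """Offsets (from the start of the TIFF header) of each part of an IFD."""
    tags_start = ifd_offset + 2                 # after the 2-byte tag count
    next_ifd_field = tags_start + 12 * tag_count
    value_area = next_ifd_field + 4             # after the 4-byte next-IFD offset
    return {
        "tags_start": tags_start,
        "next_ifd_field": next_ifd_field,
        "value_area": value_area,
    }


layout = ifd_layout(ifd_offset=8, tag_count=11)
print({name: hex(offset) for name, offset in layout.items()})
# prints: {'tags_start': '0xa', 'next_ifd_field': '0x8e', 'value_area': '0x92'}
```

In this file the value area begins at 0x92, which is exactly the offset stored in the first tag ("92 00 00 00").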
Let's extract the tags one by one
0F 01 02 00 05 00 00 00 92 00 00 00 10 01 02 00 07 00 00 00 98 00 00 00 12 01 03 00 01 00 00 00 01 00 00 00 1A 01 05 00 01 00 00 00 A0 00 00 00 1B 01 05 00 01 00 00 00 A8 00 00 00 28 01 03 00 01 00 00 00 02 00 00 00 31 01 02 00 14 00 00 00 B0 00 00 00 32 01 02 00 14 00 00 00 C4 00 00 00 13 02 03 00 01 00 00 00 01 00 00 00 69 87 04 00 01 00 00 00 D8 00 00 00 25 88 04 00 01 00 00 00 2C 52 00 00
I'll break the tags down into their elements and put them in a table.
I'll also look up what each tag number represents in the specification and add its name.
| Tag number / tag name | Type | Count | Offset to the value |
| 0F 01 Manufacturer | 02 00 ASCII | 05 00 00 00 Count 5 | 92 00 00 00 |
| 10 01 Model | 02 00 ASCII | 07 00 00 00 Count 7 | 98 00 00 00 |
| 12 01 Image Orientation | 03 00 SHORT | 01 00 00 00 Count 1 | 01 00 00 00 |
| 1A 01 Image width resolution | 05 00 RATIONAL | 01 00 00 00 Count 1 | A0 00 00 00 |
| 1B 01 Image height resolution | 05 00 RATIONAL | 01 00 00 00 Count 1 | A8 00 00 00 |
| 28 01 Image width and height resolution units | 03 00 SHORT | 01 00 00 00 Count 1 | 02 00 00 00 |
| 31 01 Software | 02 00 ASCII | 14 00 00 00 Count 20 | B0 00 00 00 |
| 32 01 File modification time | 02 00 ASCII | 14 00 00 00 Count 20 | C4 00 00 00 |
| 13 02 YCC pixel configuration (position of Y and C) | 03 00 SHORT | 01 00 00 00 Count 1 | 01 00 00 00 |
| 69 87 A pointer to the Exif IFD | 04 00 LONG | 01 00 00 00 Count 1 | D8 00 00 00 |
| 25 88 A pointer to the GPS IFD | 04 00 LONG | 01 00 00 00 Count 1 | 2C 52 00 00 |
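Splitting the bytes by hand works, but the same breakdown can be sketched with Python's struct module. The bytes below are the tag count, the 11 tags listed above, and the offset to the 1st IFD ("32 52 00 00"); `IfdEntry` and `parse_ifd` are illustrative names, and little endian is assumed by default because that is what this file uses.

```python
import struct
from typing import NamedTuple


class IfdEntry(NamedTuple):
    tag: int
    type_id: int
    count: int
    value_field: bytes  # raw 4 bytes: the value itself or an offset to it


IFD0_HEX = """
0B 00
0F 01 02 00 05 00 00 00 92 00 00 00
10 01 02 00 07 00 00 00 98 00 00 00
12 01 03 00 01 00 00 00 01 00 00 00
1A 01 05 00 01 00 00 00 A0 00 00 00
1B 01 05 00 01 00 00 00 A8 00 00 00
28 01 03 00 01 00 00 00 02 00 00 00
31 01 02 00 14 00 00 00 B0 00 00 00
32 01 02 00 14 00 00 00 C4 00 00 00
13 02 03 00 01 00 00 00 01 00 00 00
69 87 04 00 01 00 00 00 D8 00 00 00
25 88 04 00 01 00 00 00 2C 52 00 00
32 52 00 00
"""
IFD0 = bytes.fromhex("".join(IFD0_HEX.split()))


def parse_ifd(data: bytes, prefix: str = "<"):
    """Split one IFD into its 12-byte tags and the offset to the next IFD."""
    (tag_count,) = struct.unpack_from(prefix + "H", data, 0)
    entries = []
    for i in range(tag_count):
        # tag number (2), type (2), count (4), value-or-offset field (4)
        fields = struct.unpack_from(prefix + "HHI4s", data, 2 + 12 * i)
        entries.append(IfdEntry(*fields))
    (next_ifd,) = struct.unpack_from(prefix + "I", data, 2 + 12 * tag_count)
    return entries, next_ifd


entries, next_ifd = parse_ifd(IFD0)
for e in entries:
    print(f"{e.tag:#06x} type={e.type_id} count={e.count} field={e.value_field.hex()}")
print(f"next IFD at {next_ifd:#x}")
```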
There is a pattern for how the actual value is retrieved: if the type is BYTE, SHORT, LONG, or SLONG and the count is 1, the value itself is stored in the offset field.
For ASCII and UNDEFINED, if the count is 4 or less, the value is stored in the offset field.
More generally, the value is stored in the field whenever the size of the type multiplied by the count is 4 bytes or less.
Otherwise, the field holds an offset, and the data is read from that offset according to the type and count.
The retrieved result is as follows:
| Tag Name | value |
| Manufacturer | 53 6F 6E 79 00 Sony |
| Model | 53 4F 2D 30 34 48 00 SO-04H |
| Image Orientation | 1 |
| Image width resolution | 48 00 00 00 01 00 00 00 72/1 |
| Image height resolution | 48 00 00 00 01 00 00 00 72/1 |
| Image width and height resolution units | 2 |
| software | 33 35 2E 30 2E 42 2E 32 2E 32 37 32 5F 30 5F 66 36 30 30 00 35.0.B.2.272_0_f600 |
| File modification time | 32 30 31 36 3A 31 30 3A 32 31 20 31 35 3A 32 30 3A 35 34 00 2016:10:21 15:20:54 |
| YCC pixel configuration (position of Y and C) | 1 |
| A pointer to the Exif IFD | D8 00 00 00 |
| A pointer to the GPS IFD | 2C 52 00 00 |
| Offset value to 1st IFD | 32 52 00 00 |
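Turning the raw bytes in this table into readable values takes only a few lines per type. The following Python sketch decodes the bytes shown above; the helper names are illustrative, and little endian is assumed by default because that is the byte order of this file.

```python
import struct
from fractions import Fraction


def decode_ascii(raw: bytes) -> str:
    """ASCII: everything up to (not including) the terminating NULL."""
    return raw.split(b"\x00", 1)[0].decode("ascii")


def decode_rational(raw: bytes, prefix: str = "<") -> Fraction:
    """RATIONAL: two LONGs, numerator first, then denominator."""
    numerator, denominator = struct.unpack(prefix + "II", raw[:8])
    return Fraction(numerator, denominator)  # raises if the denominator is 0


def decode_inline_short(value_field: bytes, prefix: str = "<") -> int:
    """SHORT with count 1: stored in the first 2 bytes of the 4-byte field."""
    (value,) = struct.unpack(prefix + "H", value_field[:2])
    return value


print(decode_ascii(bytes.fromhex("53 6F 6E 79 00")))              # Sony
print(decode_ascii(bytes.fromhex("53 4F 2D 30 34 48 00")))        # SO-04H
print(decode_rational(bytes.fromhex("48 00 00 00 01 00 00 00")))  # 72
print(decode_inline_short(bytes.fromhex("01 00 00 00")))          # 1
```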
So my phone model has been revealed, which shows how easy it is to extract this much information from a single JPEG image.
Now that we know where the Exif IFD is located, let's take a look at the main subject, the Exif information
Read the Exif information
The procedure is exactly the same as for the 0th IFD. Let's extract the binary data from the Exif information section
1C 00 9A 82 05 00 01 00 00 00 2E 02 00 00 9D 82 05 00 01 00 00 00 36 02 00 00 27 88 03 00 01 00 00 00 A0 00 00 00 00 90 07 00 04 00 00 00 30 32 32 30 03 90 02 00 14 00 00 00 3E 02 00 00 04 90 02 00 14 00 00 00 52 02 00 00 01 91 07 00 04 00 00 00 01 02 03 00 01 92 0A 00 01 00 00 00 66 02 00 00 04 92 0A 00 01 00 00 00 6E 02 00 00 07 92 03 00 01 00 00 00 05 00 00 00 08 92 03 00 01 00 00 00 00 00 00 00 09 92 03 00 01 00 00 00 10 00 00 00 0A 92 05 00 01 00 00 00 76 02 00 00 7C 92 07 00 70 4F 00 00 7E 02 00 00 90 92 02 00 07 00 00 00 EE 51 00 00 91 92 02 00 07 00 00 00 F6 51 00 00 92 92 02 00 07 00 00 00 FE 51 00 00 00 A0 07 00 04 00 00 00 30 31 30 30 01 A0 03 00 01 00 00 00 01 00 00 00 02 A0 04 00 01 00 00 00 60 17 00 00 03 A0 04 00 01 00 00 00 26 0D 00 00 05 A0 04 00 01 00 00 00 0E 52 00 00 01 A4 03 00 01 00 00 00 00 00 00 00 02 A4 03 00 01 00 00 00 00 00 00 00 03 A4 03 00 01 00 00 00 00 00 00 00 04 A4 05 00 01 00 00 00 06 52 00 00 06 A4 03 00 01 00 00 00 00 00 00 00 0C A4 03 00 01 00 00 00 00 00 00 00 00 00 00 00
| Tag number / tag name | Type | Count | Offset to the value → actual data |
| 9A 82 Exposure time | 05 00 RATIONAL | 01 00 00 00 Count 1 | 2E 02 00 00 → 0A 00 00 00 40 01 00 00 (10/320) |
| 9D 82 F-number | 05 00 RATIONAL | 01 00 00 00 Count 1 | 36 02 00 00 → 14 00 00 00 0A 00 00 00 (20/10) |
| 27 88 Shooting sensitivity | 03 00 SHORT | 01 00 00 00 Count 1 | A0 00 00 00 (160) |
| 00 90 Exif version | 07 00 UNDEFINED | 04 00 00 00 Count 4 | 30 32 32 30 (0220) |
| 03 90 Date and time the original image data was created | 02 00 ASCII | 14 00 00 00 Count 20 | 3E 02 00 00 → 32 30 31 36 3A 31 30 3A 32 31 20 31 35 3A 32 30 3A 35 34 (2016:10:21 15:20:54) |
| 04 90 Date and time of digital data creation | 02 00 ASCII | 14 00 00 00 Count 20 | 52 02 00 00 → 32 30 31 36 3A 31 30 3A 32 31 20 31 35 3A 32 30 3A 35 34 (2016:10:21 15:20:54) |
| 01 91 Meaning of each component | 07 00 UNDEFINED | 04 00 00 00 Count 4 | 01 02 03 00 (Y, Cb, Cr) |
| 01 92 Shutter speed | 0A 00 SRATIONAL | 01 00 00 00 Count 1 | 66 02 00 00 → F4 01 00 00 64 00 00 00 (500/100) |
| 04 92 Exposure correction value | 0A 00 SRATIONAL | 01 00 00 00 Count 1 | 6E 02 00 00 → 00 00 00 00 03 00 00 00 (0/3) |
| 07 92 Metering method | 03 00 SHORT | 01 00 00 00 Count 1 | 05 00 00 00 (5: pattern metering) |
| 08 92 Light source | 03 00 SHORT | 01 00 00 00 Count 1 | 00 00 00 00 (0: unknown) |
| 09 92 Flash | 03 00 SHORT | 01 00 00 00 Count 1 | 10 00 00 00 (0x10: flash did not fire, compulsory flash mode) |
| 0A 92 Lens focal length | 05 00 RATIONAL | 01 00 00 00 Count 1 | 76 02 00 00 → A7 01 00 00 64 00 00 00 (423/100) |
| 7C 92 MakerNote | 07 00 UNDEFINED | 70 4F 00 00 Count 20336 | 7E 02 00 00 → omitted due to the large amount of data |
| 90 92 Subseconds of DateTime | 02 00 ASCII | 07 00 00 00 Count 7 | EE 51 00 00 → 36 38 34 37 37 32 (684772) |
| 91 92 Subseconds of DateTimeOriginal | 02 00 ASCII | 07 00 00 00 Count 7 | F6 51 00 00 → 36 38 34 37 37 32 (684772) |
| 92 92 Subseconds of DateTimeDigitized | 02 00 ASCII | 07 00 00 00 Count 7 | FE 51 00 00 → 36 38 34 37 37 32 (684772) |
| 00 A0 Supported Flashpix version | 07 00 UNDEFINED | 04 00 00 00 Count 4 | 30 31 30 30 (0100: Flashpix Format Version 1.0) |
| 01 A0 Color space information | 03 00 SHORT | 01 00 00 00 Count 1 | 01 00 00 00 (1: sRGB) |
| 02 A0 Effective image width | 04 00 LONG | 01 00 00 00 Count 1 | 60 17 00 00 (5984) |
| 03 A0 Effective image height | 04 00 LONG | 01 00 00 00 Count 1 | 26 0D 00 00 (3366) |
| 05 A0 Interoperability IFD pointer | 04 00 LONG | 01 00 00 00 Count 1 | 0E 52 00 00 |
| 01 A4 Custom image processing | 03 00 SHORT | 01 00 00 00 Count 1 | 00 00 00 00 (0) |
| 02 A4 Exposure mode | 03 00 SHORT | 01 00 00 00 Count 1 | 00 00 00 00 (0: auto exposure) |
| 03 A4 White balance | 03 00 SHORT | 01 00 00 00 Count 1 | 00 00 00 00 (0: auto white balance) |
| 04 A4 Digital zoom magnification | 05 00 RATIONAL | 01 00 00 00 Count 1 | 06 52 00 00 → 64 00 00 00 64 00 00 00 (100/100) |
| 06 A4 Shooting scene type | 03 00 SHORT | 01 00 00 00 Count 1 | 00 00 00 00 (0: standard) |
| 0C A4 Subject distance range | 03 00 SHORT | 01 00 00 00 Count 1 | 00 00 00 00 (0: unknown) |
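The fractions in this table become familiar camera settings once they are reduced. The Python sketch below uses the values read above; it assumes the shutter speed tag follows the APEX convention of the Exif specification, in which the exposure time in seconds is 2 raised to the negative of the stored value. The variable names are illustrative.

```python
from fractions import Fraction

# Values taken from the Exif IFD table above.
exposure_time = Fraction(10, 320)        # seconds
f_number = Fraction(20, 10)
shutter_speed_apex = Fraction(500, 100)  # APEX value, not seconds
focal_length = Fraction(423, 100)        # millimetres
iso = int.from_bytes(bytes.fromhex("A0 00"), "little")

# Assumption: APEX shutter speed Tv means exposure time = 2 ** -Tv seconds.
seconds_from_apex = 2.0 ** -float(shutter_speed_apex)

print(exposure_time)        # 1/32
print(seconds_from_apex)    # 0.03125
print(float(f_number))      # 2.0
print(float(focal_length))  # 4.23
print(iso)                  # 160
```

The exposure time (1/32 s) and the shutter speed (APEX 5, which is also 1/32 s) agree, which is a handy check that the bytes were read in the right order.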
This is followed by GPS information, which means you can immediately see where the photo was taken.
For photos taken while traveling, it would be interesting to display the photos on a map, but it would be a problem if you did that for photos taken at home.
I felt a great sense of accomplishment at successfully reading the Exif information by hand, but in PHP the above data can be obtained easily using the read_exif_data function (an alias of exif_read_data, which is the name to use in current PHP versions).
I hope this article will be helpful if you only have a binary editor and Exif documents on hand.
That's all