The earliest surviving photograph is black-and-white and blurry. Then came sharpness, and later color. The next step forward was digital. Photography keeps growing more popular and widespread; by now even cats take selfies. What's next? Next (or rather, already here) are digital images that, in addition to millions of colored pixels, store information about the depth of the scene they capture.

This opens up tremendous opportunities. Among them are motion effects such as parallax and dolly zoom. Depth data also enables new approaches to artistic filters, refocusing, image editing, and taking measurements from a photo. And this is just the beginning.
Today we will talk about a JavaScript parser for depth-enabled photos. It works with eXtensible Device Metadata (XDM) image files: it extracts the metadata embedded in them and saves it as XML files. The program can then extract the color and depth information from that XML. The output consists of XML files, color images, and depth map files.
If you can't wait to try out what is discussed below, take a look at Depthy, our open source project. It uses the photo parser described here, and in practice the parser has proved fast and efficient.
Before considering the code, we will focus on the XDM format.
XDM format
The script takes XDM files as input. In the XDM format, metadata is stored inside container images, and those images remain compatible with existing image viewers. The format was designed for Intel RealSense technology. The metadata holds technical information: the depth map, the spatial position of the device and the camera, the lens perspective model, information about the equipment manufacturer, and the point cloud. Here is what the color image looks like (on the right) next to the corresponding depth map in XDM format (on the left), which is stored in the image file as metadata.
Color image and its depth map
XDM data has to be embedded into the container file somehow. The Adobe XMP standard is used for this.
Adobe XMP standard
The XDM specification currently allows four container file formats: JPEG, PNG, TIFF, and GIF. XDM metadata is serialized and embedded in the container file. Metadata storage is based on the Adobe Extensible Metadata Platform (XMP) standard. The application considered here is designed for JPEG containers. Let us briefly discuss how XMP metadata is embedded in JPEG files and how the program processes XMP packets.
As the XMP standard describes, data segments in a JPEG file are introduced by 2-byte markers. Markers in the range 0xFFE0–0xFFEF are generally used for application data and are named APPn. Such a segment usually begins with a string that describes its purpose, the so-called namespace string or signature string. The APP1 marker identifies Exif and TIFF metadata. The APP13 marker identifies data in the Photoshop Image Resources format, which contains IPTC metadata. The location of the XMP packet or packets is indicated by one or more APP1 markers.
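To make the marker layout more concrete, here is a minimal sketch (not part of the parser described in this article) that walks the segments at the start of a JPEG buffer and lists the APPn ones together with their signature strings. The function name `listAppSegments` is hypothetical, and the sketch assumes a well-formed header in which every marker before the image data is followed by a 2-byte big-endian length.

```javascript
// Hypothetical helper, not taken from the article's parser.
// Lists the APPn segments found before the image data of a JPEG buffer.
function listAppSegments(buffer) {
  const segments = [];
  // A JPEG file starts with the SOI marker 0xFFD8, which has no length field.
  if (buffer.length < 2 || buffer.readUInt16BE(0) !== 0xFFD8) return segments;

  let pos = 2;
  while (pos + 4 <= buffer.length) {
    if (buffer[pos] !== 0xFF) break;              // not a marker: give up
    const code = buffer[pos + 1];
    if (code === 0xDA || code === 0xD9) break;    // SOS or EOI: metadata part is over
    const length = buffer.readUInt16BE(pos + 2);  // counts the 2 length bytes themselves

    if (code >= 0xE0 && code <= 0xEF) {
      const dataStart = pos + 4;
      const dataEnd = Math.min(pos + 2 + length, buffer.length);
      // The signature string runs up to the first null byte of the segment.
      let sigEnd = buffer.indexOf(0, dataStart);
      if (sigEnd === -1 || sigEnd > dataEnd) sigEnd = dataEnd;
      segments.push({
        name: 'APP' + (code - 0xE0),
        offset: pos,
        length: length,
        signature: buffer.toString('latin1', dataStart, sigEnd)
      });
    }
    pos += 2 + length;                            // jump to the next marker
  }
  return segments;
}
```

Called on the contents of a JPEG file, for example `listAppSegments(require('fs').readFileSync(path))`, it returns one entry per APPn segment, so an XMP packet would show up as an `APP1` entry whose signature is the XMP namespace URI.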
Here’s what a StandardXMP format entry looks like in a JPEG file.
StandardXMP format entry fields

| Offset, bytes | Length, bytes | Value | Name | Comments |
|---|---|---|---|---|
| 0 | 2 | 0xFFE1 | APP1 | The APP1 marker introduces the metadata segment |
| 2 | 2 | 2 + 29 + XMP packet length | Lp | Size in bytes: the sum of the sizes of this field and the following two |
| 4 | 29 | Null-terminated ASCII string, without quotes | namespace | The XMP namespace URI, used as a unique identifier: http://ns.adobe.com/xap/1.0/ |
| 33 | < 65503 | XMP packet | | UTF-8 encoding required |
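The field layout above can be sketched in a few lines of Node.js. This is only an illustration under stated assumptions: the names `XMP_NAMESPACE`, `readStandardXmp`, and `writeStandardXmp` are hypothetical and do not come from the parser discussed in this article.

```javascript
// 28 characters of URI plus the terminating null byte = 29 bytes.
const XMP_NAMESPACE = 'http://ns.adobe.com/xap/1.0/\0';

// Hypothetical helper: reads the XMP packet of a StandardXMP segment
// that starts at `offset`. Returns null if the bytes do not match the layout.
function readStandardXmp(buffer, offset) {
  if (offset + 33 > buffer.length) return null;
  if (buffer.readUInt16BE(offset) !== 0xFFE1) return null;        // offset 0: APP1
  const lp = buffer.readUInt16BE(offset + 2);                     // offset 2: Lp
  const ns = buffer.toString('latin1', offset + 4, offset + 33);  // offset 4: namespace
  if (ns !== XMP_NAMESPACE) return null;
  const packetLength = lp - 2 - 29;                               // Lp = 2 + 29 + packet
  return buffer.toString('utf8', offset + 33, offset + 33 + packetLength);
}

// Hypothetical helper: builds a StandardXMP segment around a packet string.
function writeStandardXmp(packet) {
  const body = Buffer.from(packet, 'utf8');
  if (body.length >= 65503) {
    throw new RangeError('XMP packet does not fit into a single segment');
  }
  const header = Buffer.alloc(4);
  header.writeUInt16BE(0xFFE1, 0);
  header.writeUInt16BE(2 + 29 + body.length, 2);
  return Buffer.concat([header, Buffer.from(XMP_NAMESPACE, 'latin1'), body]);
}
```

Writing a packet and reading it back should return the same string, which is a quick way to check that the offsets 0, 2, 4, and 33 have been applied consistently.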
If the serialized XMP packet is larger than 64 KB, it can be split into parts that are stored in several places in the JPEG file. With this approach, the packet data is represented by a main (StandardXMP) segment and extended (ExtendedXMP) segments. ExtendedXMP uses the same record format as StandardXMP, with one exception: the namespace field contains http://ns.adobe.com/xmp/extension/.
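As a rough sketch of how the extended parts might be put back together: to my understanding of the XMP specification, the namespace in each ExtendedXMP segment is followed by a 32-byte GUID, a 4-byte total length, and a 4-byte offset of this part, and only then by the data. That layout is an assumption not spelled out in this article, and the names `readExtendedChunk` and `reassembleExtendedXmp` are hypothetical.

```javascript
// 34 characters of URI plus the terminating null byte = 35 bytes.
const EXT_NAMESPACE = 'http://ns.adobe.com/xmp/extension/\0';

// `payload` is the content of an APP1 segment after the marker and the
// 2-byte length. Assumed layout: namespace (35), GUID (32),
// full length (4, big-endian), offset (4, big-endian), data.
function readExtendedChunk(payload) {
  if (payload.length < 75) return null;
  if (payload.toString('latin1', 0, 35) !== EXT_NAMESPACE) return null;
  return {
    guid: payload.toString('latin1', 35, 67),
    fullLength: payload.readUInt32BE(67),
    offset: payload.readUInt32BE(71),
    data: payload.subarray(75)
  };
}

// Copies every part that carries the expected GUID to its offset.
// Returns the complete ExtendedXMP data, or null if something is missing.
function reassembleExtendedXmp(chunks, guid) {
  const parts = chunks.filter(function (c) { return c && c.guid === guid; });
  if (parts.length === 0) return null;
  const full = Buffer.alloc(parts[0].fullLength);
  let written = 0;
  for (const part of parts) {
    written += part.data.copy(full, part.offset); // copy() returns bytes copied
  }
  return written === full.length ? full : null;
}
```

Because every part carries its own offset, the parts can be collected in whatever order they appear in the file. The sketch does not guard against overlapping or duplicated parts.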
This is how XMP packet data is embedded in a JPEG file as StandardXMP and ExtendedXMP records.
StandardXMP and ExtendedXMP records in a JPEG file
Consider three functions.
- The findMarker function scans a JPEG file for the 0xFFE1 marker, starting at a given position. The file content is passed in the buffer parameter and the starting position in the position parameter. If the marker is found, the function returns its index; otherwise it returns -1.
- The findHeader function looks for the StandardXMP (http://ns.adobe.com/xap/1.0/) and ExtendedXMP (http://ns.adobe.com/xmp/extension/) namespaces in a JPEG file. It takes the same arguments: the buffer with the file data (buffer) and the position at which to look (position). If a match is found, the function returns the string for the detected namespace; otherwise it returns an empty string.
- The findGUID function looks for the GUID stored in the xmpNote:HasExtendedXMP property in a JPEG file (the buffer parameter), searching from a given position (position) up to position + size - 1. When it finds the property, it returns the position of the GUID.
Here is the code for these functions.
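// Constants used by the functions below. Their definitions are not shown in
// the original listing; the values follow from the description above.
var marker1 = 0xFF, marker2 = 0xE1;                  // the APP1 marker, 0xFFE1
var header1 = 'http://ns.adobe.com/xap/1.0/';        // StandardXMP namespace
var header2 = 'http://ns.adobe.com/xmp/extension/';  // ExtendedXMP namespace
var noHeader = '';

// Searches the buffer (buffer) for the 0xFFE1 marker, starting at (position).
// Returns the index of the marker, or -1 if it is not found.
function findMarker(buffer, position) {
    var index;
    for (index = position; index < buffer.length - 1; index++) {
        if ((buffer[index] == marker1) && (buffer[index + 1] == marker2))
            return index;
    }
    return -1;
}

// Checks which namespace follows the marker and the 2-byte length at (position).
// Returns the matching namespace string, or an empty string if there is none.
function findHeader(buffer, position) {
    var string1 = buffer.toString('ascii', position + 4, position + 4 + header1.length);
    var string2 = buffer.toString('ascii', position + 4, position + 4 + header2.length);
    if (string1 == header1)
        return header1;
    else if (string2 == header2)
        return header2;
    else
        return noHeader;
}

// Finds the GUID stored in the xmpNote:HasExtendedXMP property.
// Returns the position of the GUID, or -1 if the property is absent.
function findGUID(buffer, position, size) {
    var string = buffer.toString('ascii', position, position + size - 1);
    var xmpNoteString = "xmpNote:HasExtendedXMP=";
    var GUIDPosition = string.indexOf(xmpNoteString);
    if (GUIDPosition == -1)
        return -1;
    // Skip the property name and the opening quote of its value
    return GUIDPosition + position + xmpNoteString.length + 1;
}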
The 128-bit GUID is stored as a 32-byte hexadecimal ASCII string in each ExtendedXMP segment, right after the namespace. It is also stored in the StandardXMP segment as the value of the xmpNote:HasExtendedXMP property. This makes it possible to detect mismatched or modified ExtendedXMP segments.
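A check of this kind could look roughly like the sketch below. It is not the article's code: the function names are hypothetical, it assumes that xmpNote:HasExtendedXMP is serialized as a quoted XML attribute, and it relies on my understanding of the XMP specification that the GUID is the MD5 digest of the complete ExtendedXMP data written as uppercase hexadecimal.

```javascript
const nodeCrypto = require('crypto');

// Hypothetical helper: extracts the GUID from a StandardXMP packet string.
// Assumes the property is written as an attribute with a quoted value.
function getExtendedXmpGuid(standardPacket) {
  const match = /xmpNote:HasExtendedXMP\s*=\s*["']([0-9A-Fa-f]{32})["']/.exec(standardPacket);
  return match ? match[1].toUpperCase() : null;
}

// Returns true if an ExtendedXMP segment belongs to this StandardXMP packet.
// When the reassembled data is supplied as well, its MD5 digest is compared
// with the GUID (assumption: the GUID is the MD5 of that data).
function isMatchingExtendedXmp(standardPacket, segmentGuid, extendedData) {
  const expected = getExtendedXmpGuid(standardPacket);
  if (!expected || expected !== segmentGuid.toUpperCase()) return false;
  if (extendedData === undefined) return true;
  const digest = nodeCrypto.createHash('md5').update(extendedData).digest('hex').toUpperCase();
  return digest === expected;
}
```

Segments whose GUID differs from the one in the StandardXMP packet would simply be skipped, and a digest mismatch would suggest that the extended data was changed after the file was written.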
XML
XMP format metadata can be embedded directly into XML documents. In accordance with the XDM specification, the XML data structure can be defined as shown in the table.
XML representation of XMP data
The image file contains the elements described above in RDF/XML format. Note that the image container is an external object with respect to the XDM data. It remains compatible with ordinary image viewers that do not support XDM.
Here is a code snippet that shows the core of the parser. It scans the input JPEG file for the APP1 marker, 0xFFE1. If the marker is found, it looks for the string representations of the StandardXMP and ExtendedXMP namespaces. If the first one is found, the size and starting position of the metadata are calculated, the data is extracted, and the StandardXMP XML file is created. If the second one is found, the procedure is repeated, but the data goes to the ExtendedXMP XML file. The application outputs two XML files.
var fs = require('fs');

// Parses an XDM file and writes its XMP metadata out as XML.
// outputXAPFile and outputXMPFile (the output paths) are assumed to be
// defined elsewhere in the script.
function xdmParser(xdmFilePath) {
    try {
        // Read the whole JPEG file into a buffer
        var fileBuffer = fs.readFileSync(xdmFilePath);
        var bufferIndex, segIndex = 0, segDataTotalLength = 0, XMLTotalLength = 0;

        for (bufferIndex = 0; bufferIndex < fileBuffer.length; bufferIndex++) {
            var markerIndex = findMarker(fileBuffer, bufferIndex);
            if (markerIndex == -1) {
                // No more 0xFFE1 markers: stop
                break;
            }
            // A 0xFFE1 marker was found: check the namespace that follows it
            var segHeader = findHeader(fileBuffer, markerIndex);
            if (!segHeader) {
                // Not an XMP segment: continue the search after this marker
                bufferIndex = markerIndex;
                continue;
            }
            // The segment size is a 2-byte big-endian number after the marker
            var segSize = fileBuffer.readUInt16BE(markerIndex + 2);
            // Subtract the 2 bytes of the size field, the namespace string,
            // and the 1 null byte that terminates it
            segSize -= (segHeader.length + 2 + 1);
            // Data starts after the marker (2 bytes), the size field (2 bytes),
            // the namespace string, and its terminating null byte (1 byte)
            var segDataStart = markerIndex + segHeader.length + 2 + 2 + 1;

            if (segHeader == header1) {
                // StandardXMP: read the GUID that links it to the ExtendedXMP data
                var GUIDPos = findGUID(fileBuffer, segDataStart, segSize);
                var GUID = GUIDPos != -1 ? fileBuffer.toString('ascii', GUIDPos, GUIDPos + 32) : '';
                // Skip the first 54 bytes of the packet (apparently the xpacket header)
                var segData_xap = fileBuffer.subarray(segDataStart + 54, segDataStart + segSize);
                fs.appendFileSync(outputXAPFile, segData_xap);
            } else if (segHeader == header2) {
                // ExtendedXMP: skip the 40 bytes that precede the data
                // (32-byte GUID plus 8 more header bytes)
                var segData = fileBuffer.subarray(segDataStart + 40, segDataStart + segSize);
                XMLTotalLength += (segSize - 40);
                fs.appendFileSync(outputXMPFile, segData);
            }
            bufferIndex = markerIndex + segSize;
            segIndex++;
            segDataTotalLength += segSize;
        }
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}
Here is a code snippet that parses the XML file and produces the color image and its depth map. This data can then be used to process depth-enabled photos. Everything is quite simple here. The xmpMetadataParser() function looks for the IMAGE:DATA attribute and writes the corresponding data to a JPEG file. This yields the color image. If several such attributes are found, several JPEG files are created. The function also looks for the DEPTHMAP:DATA attribute and writes the corresponding data to a PNG file. This is the depth map. If several such attributes are found, several PNG files are created. The output is one or more JPEG and PNG files.
var sax = require('sax');
var parser;

// Sets up the XMP metadata parser that extracts color images and depth maps.
// fs (the Node.js file system module) and inputJpgFile (the path of the
// source JPEG) are assumed to be defined elsewhere in the script.
function xmpMetadataParser() {
    var imageIndex = 0, depthImageIndex = 0, outputPath = "";
    // Non-strict mode: sax reports attribute names in upper case
    parser = sax.parser(false);

    // Called for every attribute the parser encounters
    parser.onattribute = function (attr) {
        if ((attr.name == "IMAGE:DATA") || (attr.name == "GIMAGE:DATA")) {
            // Color image: the attribute value is a Base64-encoded JPEG file
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_" + imageIndex + ".jpg";
            fs.writeFileSync(outputPath, Buffer.from(attr.value, 'base64'));
            imageIndex++;
        } else if ((attr.name == "DEPTHMAP:DATA") || (attr.name == "GDEPTH:DATA")) {
            // Depth map: the attribute value is a Base64-encoded PNG file
            outputPath = inputJpgFile.substring(0, inputJpgFile.length - 4) + "_depth_" + depthImageIndex + ".png";
            fs.writeFileSync(outputPath, Buffer.from(attr.value, 'base64'));
            depthImageIndex++;
        }
    };

    parser.onend = function () {
        console.log("All done!");
    };
}

// Feeds the XML file with XMP data to the parser
function processXmpData(filePath) {
    try {
        var file_buf = fs.readFileSync(filePath);
        parser.write(file_buf.toString('utf8')).close();
    } catch (ex) {
        console.log("Something bad happened! " + ex);
    }
}
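Since the attribute values are Base64 text, it can be useful to check what they decode to before writing a file. The sketch below is an illustration only: `decodeEmbeddedImage` is a hypothetical helper, not part of the parser, that decodes a value with Node's built-in `Buffer` and looks at the leading bytes to tell a JPEG from a PNG.

```javascript
// The fixed 8-byte signature that every PNG file starts with.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]);

// Hypothetical helper: decodes a Base64 attribute value and guesses the
// image type from its first bytes.
function decodeEmbeddedImage(base64Text) {
  const bytes = Buffer.from(base64Text, 'base64');
  let type = 'unknown';
  if (bytes.length >= 3 && bytes[0] === 0xFF && bytes[1] === 0xD8 && bytes[2] === 0xFF) {
    type = 'jpeg';   // SOI marker followed by the first segment marker
  } else if (bytes.length >= 8 && bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
    type = 'png';
  }
  return { type: type, bytes: bytes };
}
```

With such a check, the file extension could be chosen from the detected type rather than from the attribute name, which would guard against a depth map that happens to be stored in a different format.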
Results
So, the XDM files have been parsed and turned into JPEG and PNG files, that is, into color images and depth maps. All of this is done by our script, with no heavyweight image-processing libraries involved. Would you like to add tools for processing depth-enabled photos to your web project? The JavaScript parser described here can serve as a foundation for such tools.
P.S. Do you write in Java and want to process depth-enabled photos in your projects? If so, look here.