In this article I want to share my findings on how to interpret the ZMAP+ file format.
What exactly is the ZMAP+ file format?
ZMAP+ (ZMapPlus) is an old format used to store gridded data in an ASCII line format for transport and storage. It is commonly used in the oil and gas exploration field, where many applications read and write the format.
Besides gridded data, the format can also support point and polygon data, but only one data type is allowed in each file. A specific set of header rows defines how the data is written in the file, and the actual data follows the header.
Below is a sample ZMAP+ file.
!
! File created by DMBTools2.GridFileFormats.ZmapPlusFile
!
@GRID FILE, GRID, 4
20, -9999.0000000, , 7, 1
6, 4, 0, 200, 0, 300
0.0, 0.0, 0.0
@
       -9999.0000000       -9999.0000000           3.0000000          32.0000000
          88.0000000          13.0000000
       -9999.0000000          20.0000000           8.0000000          42.0000000
          75.0000000           5.0000000
           5.0000000         100.0000000          35.0000000          50.0000000
          27.0000000           1.0000000
           2.0000000          36.0000000          10.0000000           6.0000000
           9.0000000       -9999.0000000
Comments are denoted by “!” at the start of a line: if the first character is “!”, the line is a comment.
A ZMAP+ file has two sections, a header section and a data section. Let us decode them one by one.
Decoding the header
The header section starts on the first line that begins with an “@” symbol. The data section starts on the first line after the last “@” symbol, and a file may contain only two such “@” lines.
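As a rough illustration of this rule, the sketch below separates a file's text into header lines and data lines. The function name split_sections is my own, and the sketch assumes that comment lines can simply be skipped wherever they appear.

```python
def split_sections(text):
    """Return (header_lines, data_lines) from the text of a ZMAP+ file."""
    header_lines, data_lines = [], []
    at_lines_seen = 0  # number of lines starting with '@' passed so far
    for line in text.splitlines():
        if line.startswith("!"):
            continue  # comment line, skipped everywhere (assumption)
        if line.startswith("@"):
            at_lines_seen += 1
            if at_lines_seen == 1:
                # The first '@' line also carries the first header fields.
                header_lines.append(line[1:])
            continue
        if at_lines_seen == 1:
            header_lines.append(line)
        elif at_lines_seen >= 2 and line.strip():
            data_lines.append(line)
    return header_lines, data_lines


sample = "\n".join([
    "!",
    "! File created by DMBTools2.GridFileFormats.ZmapPlusFile",
    "!",
    "@GRID FILE, GRID, 4",
    "20, -9999.0000000, , 7, 1",
    "6, 4, 0, 200, 0, 300",
    "0.0, 0.0, 0.0",
    "@",
    "       -9999.0000000       -9999.0000000           3.0000000          32.0000000",
    "          88.0000000          13.0000000",
])

header_lines, data_lines = split_sections(sample)
print(header_lines[0])   # GRID FILE, GRID, 4
print(len(data_lines))   # 2
```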
@GRID FILE, GRID, 4
20, -9999.0000000, , 7, 1
6, 4, 0, 200, 0, 300
0.0, 0.0, 0.0
@
This is the header from the sample above. Header fields are comma delimited. Let us look at the fields line by line.
On line number 1, there are three fields:
- The first is user defined, but it is often just “GRID FILE”.
- The second, for a grid file, must be “GRID”.
- The third is an integer that indicates the number of grid nodes per physical line.
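The first header line could be read along these lines. This is only a sketch: parse_header_line_1 is a name I made up, and comparing the second field case-insensitively is a lenient choice of mine, not something the format is known to require.

```python
def parse_header_line_1(line):
    """Parse the first header line, e.g. '@GRID FILE, GRID, 4'."""
    fields = [f.strip() for f in line.lstrip("@").split(",")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    name, file_type, nodes_per_line = fields
    # Case-insensitive check is an assumption made for leniency.
    if file_type.upper() != "GRID":
        raise ValueError(f"not a grid file: {file_type!r}")
    return name, int(nodes_per_line)


name, nodes_per_line = parse_header_line_1("@GRID FILE, GRID, 4")
print(name, nodes_per_line)  # GRID FILE 4
```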
On line number 2, there are five fields:
- The first field is the width, in characters, of each grid node as stored in the data section (below the last “@”).
- The second field is the missing or null data value as it will be found in the data section.
- The third field is a user defined text value used to indicate a missing or null value. It is blank in our case. This value is used only if field number 2 is blank or missing.
- The fourth field indicates the number of decimal places to use if no decimal point is found in the data nodes.
- The fifth field indicates the starting column of the first grid node on each line in the data section of the file.
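To make the relationship between the numeric null value and the text null marker concrete, here is a small sketch. The names parse_header_line_2 and is_null, and the dictionary keys, are hypothetical; the fallback logic simply mirrors the description above.

```python
def parse_header_line_2(line):
    """Parse the second header line, e.g. '20, -9999.0000000, , 7, 1'."""
    fields = [f.strip() for f in line.split(",")]
    width, null_value, null_text, decimals, start_col = fields
    return {
        "field_width": int(width),
        "null_value": float(null_value) if null_value else None,
        "null_text": null_text,
        "decimal_places": int(decimals),
        "start_column": int(start_col),
    }


def is_null(field_text, info):
    """The numeric null value wins; the text marker is used only without it."""
    text = field_text.strip()
    if info["null_value"] is not None:
        try:
            return float(text) == info["null_value"]
        except ValueError:
            return False
    return text == info["null_text"]


info = parse_header_line_2("20, -9999.0000000, , 7, 1")
print(info["field_width"], info["null_value"])  # 20 -9999.0
print(is_null("       -9999.0000000", info))    # True
print(is_null("          32.0000000", info))    # False
```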
On line number 3, there are six fields:
- The first field is the number of rows in the grid.
- The second field is the number of columns in the grid. (Hence the total number of values in the data section must be equal to rows × columns.)
- The third is the minimum grid X node value. (x-min)
- The fourth is the maximum grid X node value. (x-max)
- The fifth is the minimum grid Y node value. (y-min)
- The sixth is the maximum grid Y node value. (y-max)
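The third header line gives enough information to work out how many values to expect and, under one assumption, the spacing between nodes. In the sketch below I assume the nodes are evenly spaced and that the first and last nodes sit exactly on the minimum and maximum values; parse_header_line_3 is a name of my own.

```python
def parse_header_line_3(line):
    """Parse the third header line, e.g. '6, 4, 0, 200, 0, 300'."""
    fields = [f.strip() for f in line.split(",")]
    rows, cols = int(fields[0]), int(fields[1])
    x_min, x_max, y_min, y_max = (float(f) for f in fields[2:6])
    return rows, cols, x_min, x_max, y_min, y_max


rows, cols, x_min, x_max, y_min, y_max = parse_header_line_3("6, 4, 0, 200, 0, 300")

# Total number of values the data section must contain.
expected_nodes = rows * cols

# Assumption: evenly spaced nodes, with the outer nodes on the min/max values.
dx = (x_max - x_min) / (cols - 1)
dy = (y_max - y_min) / (rows - 1)

print(expected_nodes)  # 24
print(dy)              # 60.0
```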
On line number 4, there are three fields and they are always “0.0, 0.0, 0.0”. (I don’t know the real reason for this, but if you do, please share it in the comment section below.)
Decoding the data section
@
       -9999.0000000       -9999.0000000           3.0000000          32.0000000
          88.0000000          13.0000000
       -9999.0000000          20.0000000           8.0000000          42.0000000
          75.0000000           5.0000000
           5.0000000         100.0000000          35.0000000          50.0000000
          27.0000000           1.0000000
           2.0000000          36.0000000          10.0000000           6.0000000
           9.0000000       -9999.0000000
After the last header line, there is a line containing a single “@”. The line after it is the first line of the data section. The data section uses fixed-width fields; each field holds a single grid node and is generally right justified.
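Because the fields are fixed width, a data line can be cut by position rather than split on whitespace. The sketch below assumes the starting column from the header is 1-based; split_fixed_width is a hypothetical helper name.

```python
def split_fixed_width(line, field_width, start_column=1):
    """Cut one data line into fixed-width text fields, dropping blank ones."""
    # start_column is treated as 1-based (assumption).
    body = line.rstrip("\n")[start_column - 1:]
    fields = [body[i:i + field_width] for i in range(0, len(body), field_width)]
    return [f.strip() for f in fields if f.strip()]


line = "       -9999.0000000       -9999.0000000           3.0000000          32.0000000"
print(split_fixed_width(line, field_width=20))
# ['-9999.0000000', '-9999.0000000', '3.0000000', '32.0000000']
```

Slicing by position also works when two wide values touch with no space between them, which a whitespace split would not handle.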
No physical line will hold more nodes than the number defined in the third field of the first header line. (Line number 1, Field number 3)
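A quick way to see what this limit means for the layout is to compute how many nodes land on each physical line of one grid column. The sketch assumes each column starts on a new physical line, as the worked example later in this section describes; nodes_on_each_line is my own name.

```python
def nodes_on_each_line(rows, nodes_per_line):
    """Node counts of the physical lines that hold one grid column."""
    full_lines, remainder = divmod(rows, nodes_per_line)
    counts = [nodes_per_line] * full_lines
    if remainder:
        counts.append(remainder)  # the last, shorter line of the column
    return counts


# The sample file: 6 rows, at most 4 nodes per line.
print(nodes_on_each_line(6, 4))  # [4, 2]
```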
A data field may or may not have a decimal point. If none is found, it is implied, and the number of decimal places is as defined in the fourth field of the second header line. (Line number 2, Field number 4)
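The implied decimal point can be handled with a small conversion like the one below. It is a minimal sketch, assuming that a field without a decimal point is a plain integer whose last digits are the decimal places; parse_node is a hypothetical name.

```python
def parse_node(field_text, decimal_places):
    """Convert one data field to a float, applying an implied decimal point."""
    text = field_text.strip()
    if "." in text:
        return float(text)  # explicit decimal point, use the value as written
    # No decimal point: shift the integer by the header's decimal places.
    return int(text) / 10 ** decimal_places


print(parse_node("32.0000000", 7))  # 32.0
print(parse_node("1250", 2))        # 12.5
```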
The grid nodes in the data section are stored in column major order. That is, the first column of data is written first, starting at the upper left corner of the grid.
For example, if the grid has 7 rows and 3 columns, and the number of nodes per line is 4, the first line of the data section will have 4 nodes: the first four grid nodes going down from the upper left. The second line will have 3 nodes, the last three nodes of the first column. Then the next column is written, four nodes then three. Then the last column is written in the same pattern.
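Once all values have been read in file order, the column major layout can be turned into a conventional row-by-row grid. The sketch below uses the 24 values of the sample file (6 rows, 4 columns); to_grid is a name I chose, and row 0 is taken to be the top row of the grid.

```python
def to_grid(values, rows, cols):
    """Arrange a flat column-major list into grid[row][col], row 0 at the top."""
    if len(values) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {len(values)}")
    # Column c occupies values[c * rows : (c + 1) * rows], top to bottom.
    return [[values[c * rows + r] for c in range(cols)] for r in range(rows)]


values = [
    -9999.0, -9999.0, 3.0, 32.0, 88.0, 13.0,   # column 1
    -9999.0, 20.0, 8.0, 42.0, 75.0, 5.0,       # column 2
    5.0, 100.0, 35.0, 50.0, 27.0, 1.0,         # column 3
    2.0, 36.0, 10.0, 6.0, 9.0, -9999.0,        # column 4
]

grid = to_grid(values, rows=6, cols=4)
print(grid[0])  # [-9999.0, -9999.0, 5.0, 2.0]
print(grid[5])  # [13.0, 5.0, 1.0, -9999.0]
```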
I had to struggle a lot to wrap my head around this file format, so I thought I should put all this in a single post to share my findings. I hope that this article has helped you to familiarize yourself with the ZMAP+ file format. Criticism is always welcome!
2020-07-10 00:00 +0530