Metadata Exchange Format
Introduction
The metadata exchange format (MEF for short) is a file format designed for exchanging metadata between different platforms. A metadata record exported as a MEF file can be imported by any platform that understands MEF. The format was developed with GeoNetwork in mind, so the information it contains is mainly related to GeoNetwork. Nevertheless, it can be used as an interoperability format between different platforms.
This format has been designed with the following needs in mind:
- Export a metadata record for backup purposes
- Import a metadata record from a previous backup
- Import a metadata record from a different GeoNetwork version to allow a smooth migration from one version to another.
- Capture metadata plus thumbnails and any data uploaded with the metadata record.
In the paragraphs below, the following terms are used:
- the term actor is used to indicate any system (application, service etc.) that operates on metadata.
- the term reader will be used to indicate any actor that can import metadata from a MEF file.
- the term writer will be used to indicate any actor that can generate a MEF file.
MEF v1 file format
A MEF file is simply a ZIP file which contains the following files:
Root
|
+--- metadata.xml
+--- info.xml
+--- public
|    +---- all public documents and thumbnails
+--- private
     +---- all private documents and thumbnails
- metadata.xml: this file contains the metadata itself, in XML format. The text encoding of the metadata (eg. UTF-8) is specified in the XML declaration.
- info.xml: this is a special XML file which contains information related to the metadata (metadata about the metadata). Examples of the information in the info.xml file are: creation date, modification date, privileges. This information is needed by GeoNetwork.
- public: this is a directory used to store the metadata thumbnails and other public files. There are no restrictions on the image format but it is strongly recommended to use the portable network graphics (PNG), JPEG or GIF format.
- private: this is a directory used to store all data (maps, shape files etc.) uploaded with the metadata in the GeoNetwork editor. Files in this directory are private in the sense that authorisation is required to access them. There are no restrictions on the file types that can be stored in this directory.
Any other file or directory present in the MEF file should be ignored by readers that don't recognise them. This allows actors to add custom extensions to the MEF file.
A MEF file can have empty public and private folders depending upon the export format, which can be:
- simple: both public and private are omitted.
- partial: only public files are provided.
- full: both public and private files are provided.
It is recommended to use the .mef extension when naming MEF files.
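The layout and the three export formats described above can be sketched with Python's standard zipfile module. This is a minimal illustration, not GeoNetwork's implementation: the function name write_mef_v1 and its parameters are hypothetical, and the sketch assumes that an omitted directory simply has no entries in the archive.

```python
import io
import zipfile


def write_mef_v1(metadata_xml, info_xml, public_files=None,
                 private_files=None, export_format="full"):
    """Return the bytes of a MEF v1 archive (hypothetical helper).

    public_files and private_files map file names to their content.
    """
    if export_format not in ("simple", "partial", "full"):
        raise ValueError("unknown export format: " + export_format)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as mef:
        # The two files every MEF v1 archive contains.
        mef.writestr("metadata.xml", metadata_xml)
        mef.writestr("info.xml", info_xml)

        # simple: no files; partial: public only; full: public and private.
        if export_format in ("partial", "full"):
            for name, data in (public_files or {}).items():
                mef.writestr("public/" + name, data)
        if export_format == "full":
            for name, data in (private_files or {}).items():
                mef.writestr("private/" + name, data)
    return buffer.getvalue()
```

Writing the returned bytes to a file named with the .mef extension yields an archive that any ZIP tool can inspect.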
MEF v2 file format
MEF version 2 supports the following:
- multi-metadata support: more than one metadata record and data can be stored in a single MEF file.
- multi-schema support: a single MEF file can store the same record in several formats (eg. for an ISO profile, also store a version of that record in the base ISO19115/ISO19139 schema).
Current export services that create MEF files from a metadata record with related records (eg. parent, feature catalog etc.) can include these related metadata records in the MEF. See mef.export.
MEF v2 format structure is the following:
Root
|
+ 0..n metadata
  |
  +--- metadata
  |    +--- metadata.xml
  |    +--- (optional) metadata.iso19139.xml
  +--- info.xml
  +--- applschema
  |    +--- (optional) metadata.xml (ISO19110 Feature Catalog)
  +--- public
  |    +---- all public documents and thumbnails
  +--- private
       +---- all private documents and thumbnails
Note: metadata.iso19139.xml is generated by GeoNetwork actors on export if the metadata record in metadata.xml is an ISO19115/19139 profile. On import, this record may be selected for loading if the ISO19115/19139 profile is not present.
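A reader can discover the records in a MEF v2 file by grouping the ZIP entries by their top-level directory. The sketch below assumes that each metadata record lives in its own top-level directory and treats the directory name as opaque; the function name list_mef_v2_records is hypothetical.

```python
import io
import zipfile
from collections import defaultdict


def list_mef_v2_records(mef_bytes):
    """Map each top-level directory to the entry paths found beneath it."""
    records = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(mef_bytes)) as mef:
        for name in mef.namelist():
            if name.endswith("/"):
                continue  # skip explicit directory entries
            top, _, rest = name.partition("/")
            if rest:  # ignore files stored directly at the root
                records[top].append(rest)
    return dict(records)
```

A record whose list contains metadata/metadata.iso19139.xml offers the ISO19139 fallback described in the note above.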
The info.xml file
This file contains general information about a metadata. It must have an info root element with a mandatory version attribute. This attribute must be in the X.Y form, where X represents the major version and Y the minor one. The purpose of this attribute is to allow future changes of this format maintaining compatibility with older readers. The policy behind the version is this:
- A change to Y means a minor change. All existing elements in the previous version must be left unchanged: only new elements or attributes may be added. A reader capable of reading version X.Y is also capable of reading version X.Y' with Y'>Y.
- A change to X means a major change. Usually, a reader of version X.Y is not able to read version X'.Y with X'>X.
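The version policy can be expressed as a small compatibility check. This is a sketch under stated assumptions: the function name can_read is hypothetical, and the policy above only covers newer minor versions and newer major versions, so treating older minor versions as readable and older major versions as unreadable is a choice made here, not something the format prescribes.

```python
def can_read(reader_version, file_version):
    """Return True if a reader of reader_version should accept file_version.

    Both arguments are strings in the X.Y form.
    """
    # Unpacking raises ValueError if a version is not in the X.Y form.
    reader_major, _reader_minor = (int(p) for p in reader_version.split("."))
    file_major, _file_minor = (int(p) for p in file_version.split("."))

    # Minor changes only add elements or attributes, so any file with the
    # same major version is accepted (assumption for older minor versions).
    # A different major version is rejected (assumption for older majors).
    return reader_major == file_major
```

For example, can_read("1.0", "1.1") accepts the file while can_read("1.1", "2.0") rejects it.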
The root element must have the following children:
- general: a container for general information. It must have the following children:
  - uuid: this is the universally unique identifier assigned to the metadata and must be a valid UUID. This element is optional and, when omitted, the reader should generate one. A metadata without a UUID can be imported several times into the same system without breaking uniqueness constraints. When missing, the reader should also generate the siteId value.
  - createDate: This date indicates when the metadata was created.
  - changeDate: This date keeps track of the most recent change to the metadata.
  - siteId: This is a UUID that identifies the actor that created the metadata and must be a valid UUID. When the uuid element is missing, this element should be missing too; if it is present anyway, it will be ignored.
  - siteName: This is a human readable name for the actor that created the metadata. It must be present only if the siteId is present.
  - schema: The name of the schema for the metadata record in metadata.xml. When the MEF is imported by a GeoNetwork actor, this name should be the name of a metadata schema handled by the actor (eg. iso19139). If the GeoNetwork actor does not have such a schema, it may try to select another metadata with a schema that is present (eg. the metadata in metadata.iso19139.xml could be loaded because the iso19139 schema is present).
  - format: Indicates the MEF export format. The element's value must belong to the following set: { simple, partial, full }.
  - localId: This is an optional element. If present, it indicates the id used locally by the source actor to store the metadata. Its purpose is just to allow the reuse of the same local id when reimporting a metadata.
  - isTemplate: A boolean field that indicates if this metadata is a template used to create new ones. There is no real distinction between a real metadata and a template but some actors use it to allow fast metadata creation. The value must be one of: { true, false }.
  - rating: This is an optional element. If present, it indicates the users' rating of the metadata, ranging from 1 (a bad rating) to 5 (an excellent rating). The special value 0 means that the metadata has not been rated yet. It can be used to sort search results.
  - popularity: Another optional value. If present, it indicates the popularity of the metadata. The value must be positive and high values mean high popularity. The criteria used to set the popularity are left to the writer. Its main purpose is to provide a metadata ordering during a search.
- categories: a container for categories associated to this metadata. A category is just a name, like 'audio-video' that classifies the metadata to allow an easy search. Each category is specified by a category element which must have a name attribute. This attribute is used to store the category's name. If there are no categories, the categories element will be empty.
- privileges: a container for privileges associated to this metadata. Privileges are operations that a group (which represents a set of users) can perform on a metadata and are specified by a set of group elements. Each of these has a mandatory name attribute to store the group's name and a set of operation elements used to store the operations allowed on the metadata. Each operation element must have a name attribute whose value must belong to the following set: { view, download, notify, dynamic, featured }. If there are no groups or the actor does not have the concept of group, the privileges element will be empty. A group element without any operation element must be ignored by readers.
- public: All metadata thumbnails (and any other public file) must be listed here. This container contains a file element for each file. Mandatory attributes of this element are name, which represents the file's name and changeDate, which contains the date of the latest change to the file. The public element is optional but, if present, must contain all the files present in the metadata's public directory and any reader that imports these files must set the latest change date on these using the provided ones. The purpose of this element is to provide more information in the case the MEF format is used for metadata harvesting.
- private: This element has the same purpose and structure as the public element but is related to maps and all other private files.
Any other element or attribute should be ignored by readers that don't understand them. This allows actors to add custom attributes or subtrees to the XML.
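The reader-side rules above (mandatory version attribute, UUID generation when uuid is omitted, skipping groups without operations, ignoring unknown elements) can be sketched with Python's xml.etree.ElementTree. The function name read_info and the shape of the returned dictionary are hypothetical, and only a few of the elements are extracted to keep the example short.

```python
import uuid
import xml.etree.ElementTree as ET


def read_info(info_xml):
    """Extract a few fields from the text of an info.xml file."""
    root = ET.fromstring(info_xml)
    if root.tag != "info" or "version" not in root.attrib:
        raise ValueError("expected an info root element with a version attribute")

    general = root.find("general")
    if general is None:
        raise ValueError("missing general element")

    # uuid is optional: the reader generates one when it is omitted.
    record_uuid = (general.findtext("uuid") or "").strip()
    if not record_uuid:
        record_uuid = str(uuid.uuid4())

    categories = [c.get("name") for c in root.findall("categories/category")]

    # Groups without any operation element must be ignored.
    privileges = {}
    for group in root.findall("privileges/group"):
        operations = [op.get("name") for op in group.findall("operation")]
        if operations:
            privileges[group.get("name")] = operations

    # Elements that are not looked up explicitly are simply never visited,
    # which is how unknown extensions end up being ignored.
    return {
        "version": root.get("version"),
        "uuid": record_uuid,
        "schema": general.findtext("schema"),
        "categories": categories,
        "privileges": privileges,
    }
```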
Date format
Unless otherwise specified, all dates in this file must be in the ISO 8601 format. The pattern must be YYYY-MM-DDTHH:mm:SS and the timezone should be the local one.
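In Python, the pattern above corresponds to the strptime format %Y-%m-%dT%H:%M:%S. The helper names below are hypothetical; the sketch returns naive datetime objects, on the assumption that the local timezone is implied rather than written in the value.

```python
from datetime import datetime

# strptime/strftime equivalent of YYYY-MM-DDTHH:mm:SS
MEF_DATE_PATTERN = "%Y-%m-%dT%H:%M:%S"


def parse_mef_date(text):
    """Parse a MEF date string into a naive (local time) datetime."""
    return datetime.strptime(text, MEF_DATE_PATTERN)


def format_mef_date(moment):
    """Format a datetime using the MEF date pattern."""
    return moment.strftime(MEF_DATE_PATTERN)
```

A value that does not match the pattern makes parse_mef_date raise ValueError, which a reader can use to reject malformed dates.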
Date format example
Example of info file:
<info version="1.0">
    <general>
        <uuid>0619abc0-708b-eeda-8202-000d98959033</uuid>
        <createDate>2006-12-11T10:33:21</createDate>
        <changeDate>2006-12-14T08:44:43</changeDate>
        <siteId>0619cc50-708b-11da-8202-000d9335906e</siteId>
        <siteName>FAO main site</siteName>
        <schema>iso19139</schema>
        <format>full</format>
        <localId>204</localId>
        <isTemplate>false</isTemplate>
    </general>
    <categories>
        <category name="maps"/>
        <category name="datasets"/>
    </categories>
    <privileges>
        <group name="editors">
            <operation name="view"/>
            <operation name="download"/>
        </group>
    </privileges>
    <public>
        <file name="small.png" changeDate="2006-10-07T13:44:32"/>
        <file name="large.png" changeDate="2006-11-11T09:33:21"/>
    </public>
    <private>
        <file name="map.zip" changeDate="2006-11-12T13:23:01"/>
    </private>
</info>