4. Preparing Discovery Metadata

This Section gives a full exposition on the application of the Guidelines. For each metadata element an example of its application is given together with additional comments and/or explanation where necessary. The examples are drawn from various datasets to illustrate each distinct metadata element and therefore do not form a coherent whole. Worked examples of complete metadata sets are included in Appendix A.

It is strongly recommended that an entry be provided for all discovery metadata elements. If a particular piece of information is not known or not available, saying so is more valuable than leaving the entry empty. An empty entry is ambiguous: it could simply mean that the data provider neglected to supply the relevant information.
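As an illustration only, the recommendation above could be checked mechanically. The sketch below assumes a record held as a Python dictionary keyed by element name; that layout and the function name `find_empty_entries` are assumptions made for this example and are not part of the Guidelines.

```python
def find_empty_entries(record, element_names):
    """Return the element names that are missing or blank in a record."""
    empty = []
    for name in element_names:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            empty.append(name)
    return empty


# Hypothetical record: "Originator" is stated as unknown, "Abstract" is left blank.
record = {
    "Title": "Lexicon of Named Rock Unit Definitions",
    "Originator": "Not Known",
    "Abstract": "",
}
print(find_empty_entries(record, ["Title", "Originator", "Abstract"]))
```

An explicit "Not Known" counts as an entry here, so only the blank element is reported.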

4.1 Title

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Title

1

Name by which dataset is known

M

1

Text

Free text

             

4.1.1 Example

Element Name

Identifier

Entry

     

Title

1

Lexicon of Named Rock Unit Definitions

4.1.2 Comments

This element is used to provide the title of the dataset, that is, the title by which the dataset is known in the public domain. The title should be short, typically not more than 50 characters, and should not include terms or jargon that would make it incomprehensible to the general user.

Where a dataset has more than one possible title, the name by which it is most commonly known should be given and the others provided as Alternative Titles.

Where no title exists that is intelligible to external users, the data producer is encouraged to create a title by which it will be known externally from this point on. Titles should encapsulate the subject, temporal and spatial coverage of the dataset, e.g. Voter Participation in Liverpool Local Elections, 1994. Where a title has an acronym in common use, or where a discipline specific name exists, it should appear as an Alternative Title.

4.2 Alternative Title

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Alternative Title

2

Short name, other name, acronym or alternative language title

O

N

Text

Free text or "Not Applicable"

             

4.2.1 Example

Element Name

Identifier

Entry

     

Alternative Title

2

NRUD

Alternative Title

2

Stratigraphical Lexicon

4.2.2 Comments

This may be used for any additional title(s) under which the dataset may be known, including other language translations. The Alternative Title should be short, typically not more than 50 characters. Where a dataset title has an acronym in common use, or where a discipline specific name exists, it should appear under this element. If there is no alternative title then enter "Not Applicable". More than one instance of this element is allowed. If multiple instances are desired then each instance should have a separate entry as shown in the example.
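The rule that each instance has its own entry, together with the Maximum Occurrence column of the element tables, might be checked along the following lines. This is a minimal sketch: the list-of-pairs layout, the `MAX_OCCURRENCE` table and the function name are illustrative assumptions, with `None` standing for the value "N".

```python
from collections import Counter

# Maximum Occurrence values from the element tables; None stands for "N" (no limit).
MAX_OCCURRENCE = {"Title": 1, "Alternative Title": None}


def elements_over_limit(entries):
    """Return names of elements that occur more often than their table allows."""
    counts = Counter(name for name, _ in entries)
    return sorted(
        name
        for name, count in counts.items()
        if MAX_OCCURRENCE.get(name) is not None and count > MAX_OCCURRENCE[name]
    )


# Each instance is a separate (element name, entry) pair, as in example 4.2.1.
entries = [
    ("Title", "Lexicon of Named Rock Unit Definitions"),
    ("Alternative Title", "NRUD"),
    ("Alternative Title", "Stratigraphical Lexicon"),
]
print(elements_over_limit(entries))
```

Two Alternative Title entries are accepted because that element may repeat, whereas a second Title entry would be reported.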

4.3 Originator

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Originator

3

Person or organisation having primary responsibility for the intellectual content of the data

O

N

Text

Free text or "Not Known"

             

4.3.1 Example

Element Name

Identifier

Entry

     

Originator

3

Heath, A., Nuffield College (Oxford)

Originator

3

Jowell, R., Social and Community Planning Research

Originator

3

Curtice, J.K., University of Strathclyde

Originator

3

Brand, J.A., University of Strathclyde

Originator

3

Mitchell, J.C., University of Strathclyde

4.3.2 Comments

This element should contain information about the person or body with prime responsibility for the intellectual content of the data. This is not necessarily the person or organisation named as contact for the data, or indeed, the person or organisation responsible for publishing the data. The originator(s) are not necessarily those who own the intellectual property rights in the dataset.

More than one instance of this element is allowed. If multiple instances are desired then each instance should have a separate entry as shown in the example.

For many government datasets it will be the originating department that has responsibility. The general rule is that where a work is produced as a result of the policy of a corporate body and at the expense of that corporate body, the corporate body has prime responsibility, not the individual who actually produced it.

4.4 Abstract

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Abstract

4

Brief narrative summary of the dataset

M

1

Text

Free text

             

4.4.1 Example

Element Name

Identifier

Entry

     

Abstract

4

Large scale digital topographic mapping, detailed framework mapping for site location, planning and land and asset management.

4.4.2 Comments

This element is used to provide a brief description of the dataset. It should include the content of the dataset and the purpose and/or applications for which the dataset was produced. Whilst there is no maximum length for this description, it should be concise: ideally under 100 words and not normally longer than 300 words.
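The length guidance can be expressed as a simple advisory check. The sketch below is illustrative only; the function name and the three labels it returns are assumptions, and words are counted crudely by splitting on whitespace.

```python
def abstract_length_advice(abstract):
    """Classify an abstract against the 100-word and 300-word guidance."""
    words = len(abstract.split())
    if words < 100:
        return "ideal"
    if words <= 300:
        return "acceptable"
    return "too long"


example = ("Large scale digital topographic mapping, detailed framework mapping "
           "for site location, planning and land and asset management.")
print(abstract_length_advice(example))
```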

4.5 Data Capture Period

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Data Capture Period

5

The period during which data capture took place

O

1

   
             

4.5.1 Example

Element Name

Identifier

Entry

Data Capture Period

5

No data entry required here

4.5.2 Comments

Element 5 acts as a heading to the following 5 elements (elements 6 to 10) and so no metadata entry is required against this element. It is used to show that if data capture period data is available then the following five elements are required.

If the start and/or end dates of capture are not known, not applicable or if data capture is ongoing then this fact cannot be conveyed by using a date data type. This information can be indicated by elements 6 and/or 8.

If the dataset is an amalgamation of a number of datasets with different start and end dates of capture then the overall first and last dates should be given.
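For an amalgamated dataset, the overall period could be derived as sketched below. The example assumes that every component date is fully known and written as CCYYMMDD; the component periods shown are invented for illustration.

```python
def overall_capture_period(periods):
    """Return (earliest start, latest end) from (start, end) pairs in CCYYMMDD form."""
    starts = [start for start, _ in periods]
    ends = [end for _, end in periods]
    # Fully specified CCYYMMDD strings sort chronologically as plain text.
    return min(starts), max(ends)


# Hypothetical component datasets with different capture periods.
component_periods = [
    ("19850323", "19901231"),
    ("19790101", "19960128"),
    ("19880615", "19920430"),
]
print(overall_capture_period(component_periods))
```

Dates containing asterisks would need separate handling, which this sketch does not attempt.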

4.6 Status of Start Date of Capture

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Status of Start Date of Capture

6

Status of knowledge on Start Date of Capture

O

1

Enumerated List

Known

Not Known

Not Applicable

             

4.6.1 Example

Element Name

Identifier

Entry

Status of Start Date of Capture

6

Known

4.6.2 Comments

It is possible that for some datasets the start date of capture is not known in which case this fact should be indicated here.

In the case of a dataset that has been continuously updated for a number of years and does not contain an historic record, as in the case of the Ordnance Survey Land-Line® dataset, then the term "Not Applicable" may be used.

If the start date is known then the word "Known" should be entered here and the actual date provided via element 7.

 

 

4.7 Start Date of Capture

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Start Date of Capture

7

Date on which data was first collected

C

1

Date

Date

             

4.7.1 Example

Element Name

Identifier

Entry

Start Date of Capture

7

19850323

4.7.2 Comments

This element is used to specify the date on which data in the set was first collected and is conditional on the entry to element 6 being "Known". This date refers to the actual data capture process and does not relate to the date when the data was first made available to the general public. The date given should be in the format, CCYYMMDD. Where any part of the date is not known it should be replaced with asterisks. For example if data capture started in March 1985, as above, but the day is not known then the date entry should be 198503**.
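A possible check of the CCYYMMDD format, including asterisks for unknown parts, is sketched below. It assumes that the year is masked as a whole (four asterisks) and that month and day are each masked as two asterisks; the Guidelines do not spell this out. The check does not verify the number of days in a given month.

```python
import re

# Year, month and day, each either digits or fully replaced by asterisks.
_DATE_PATTERN = re.compile(r"(\d{4}|\*{4})(\d{2}|\*{2})(\d{2}|\*{2})")


def is_valid_capture_date(text):
    """Return True if text looks like a CCYYMMDD date with optional asterisks."""
    match = _DATE_PATTERN.fullmatch(text)
    if match is None:
        return False
    _, month, day = match.groups()
    if month != "**" and not 1 <= int(month) <= 12:
        return False
    if day != "**" and not 1 <= int(day) <= 31:
        return False
    return True


for candidate in ("19850323", "198503**", "1985-03-23", "19851301"):
    print(candidate, is_valid_capture_date(candidate))
```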

4.8 Status of End Date of Capture

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Status of End Date of Capture

8

Status of knowledge on End Date of Capture

O

1

Enumerated List

Known

Not Known

Not Applicable

Ongoing

             

4.8.1 Example

Element Name

Identifier

Entry

Status of End Date of Capture

8

Ongoing

4.8.2 Comments

It is possible that for some datasets the end date of capture is not known in which case this fact should be indicated here.

In the case that the dataset is still being updated then "Ongoing" should be entered and the periodicity of update provided via element 10.

If the end date is known then the word "Known" should be entered here and the actual date provided via element 9.

4.9 End Date of Capture

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

End Date of Capture

9

Date on which data was last collected

C

1

Date

Date

             

4.9.1 Example

Element Name

Identifier

Entry

     

End Date of Capture

9

19960128

4.9.2 Comments

If data capture has ended then this element is used to provide the date on which data in the set was last collected and is conditional on the entry to element 8 being "Known". The date given should be in the format, CCYYMMDD. Where any part of the date is not known it should be replaced with asterisks. For example where data capture ended in January 1996, but the day is not known then the date entry should be 199601**.
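The conditional relationship between a status element and its date element (elements 6 and 7, or elements 8 and 9) could be checked as in the following sketch. The function name and the wording of the messages are assumptions made for this example.

```python
def check_capture_date(status, date):
    """Return a list of problems for a status/date pair."""
    problems = []
    if status == "Known" and not date:
        problems.append("status is 'Known' but no date is given")
    if status != "Known" and date:
        problems.append("a date is given but status is not 'Known'")
    return problems


print(check_capture_date("Known", "19960128"))
print(check_capture_date("Ongoing", None))
print(check_capture_date("Known", None))
```

An empty list means the pair is consistent; the third call reports the missing date.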

4.10 Frequency of Update

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Frequency of Update

10

Revision regime of dataset

O

1

Enumerated List

Hourly

Daily

Weekly

Fortnightly

Monthly

Quarterly

Biannually

Annually

Biennially

Triennially

Quinquennially

Decennially

Continuous

Irregular

Never

Not Known

Other

             

4.10.1 Example

Element Name

Identifier

Entry

     

Frequency of Update

10

Quarterly

4.10.2 Comments

If data capture still continues, then the period of update should be given here. Where a dataset is updated continuously and the user would obtain the data in whatever state it happened to be on the date of purchase then enter "Continuous". If the data continues to be updated but there is no regular period of update, enter "Irregular".

4.11 Presentation Type

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Presentation Type

11

Form in which the data resource is available

O

N

Enumerated List

Image

Graphic

Map

Numeric

Text

Other

             

4.11.1 Example

Element Name

Identifier

Entry

     

Presentation Type

11

Numeric

4.11.2 Comments

This element can be used multiple times to specify the forms in which the data is available. The allowable types are:

  • Image: photographic or satellite image

  • Graphic: drawing

  • Map: data presented in map form

  • Numeric: set of numbers or statistical data

  • Text: free text

More than one instance of this element is allowed. If multiple instances are desired then each instance should have a separate entry as shown in example 4.2.1.

4.12 Access Constraints

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Access Constraint

12

Restrictions and legal prerequisites for accessing the dataset

O

N

Enumerated List

Financial

Legal

Other

Not Known

None

             

 

 

4.12.1 Example

Element Name

Identifier

Entry

     

Access Constraint

12

None

4.12.2 Comments

Use this element to define any restrictions placed on access to the data. As well as financial and legal constraints relating to the data this might be used to specify categories of people who are/are not allowed access to the data. A financial constraint would imply that the data had to be purchased. A legal constraint implies that some security or contractual constraint applies to the dataset.

More than one instance of this element is allowed. If multiple instances are desired then each instance should have a separate entry as shown in example 4.2.1.

4.13 Use Constraints

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Use Constraints

13

Restrictions and legal constraints on using the data

O

1

Text

Free text

Not Known

             

4.13.1 Example

Element Name

Identifier

Entry

     

Use Constraints

13

© Natural Environment Research Council

4.13.2 Comments

When copyright or other legal restrictions protect the data this element is used to define the terms of the constraint including re-use fees, reproduction restrictions and acknowledgements. The actual fees payable would not be provided here, just the fact that a fee would be payable. If the use constraints are unknown then enter "Not Known".

4.14 Keywords

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Keywords

14

Common-use words or phrases summarising the subject of the dataset

M

1

Text

NGDF list

Free text

             

4.14.1 Example

Element Name

Identifier

Entry

     

Keywords

14

PHOTOGRAPHY AND IMAGERY Satellite panchromatic, 10m resolution

4.14.2 Comments

The entry for this element should take the form of a comma-separated list of words or phrases that describe the subject of the dataset. Comma-separation will enable computer systems to analyse the keyword entry when performing metadata searches.
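To illustrate how a computer system might analyse such an entry, the sketch below splits a keyword entry on commas and trims surrounding spaces. The function name is an assumption made for this example; the sample entry is the one given in 4.14.1.

```python
def split_keywords(entry):
    """Split a comma-separated keyword entry into a list of trimmed keywords."""
    return [part.strip() for part in entry.split(",") if part.strip()]


entry = "PHOTOGRAPHY AND IMAGERY Satellite panchromatic, 10m resolution"
print(split_keywords(entry))
```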

NGDF has developed a prescribed list of keywords to assist data providers and to ensure that searchers find relevant records accurately and consistently. At least one NGDF keyword should be selected; data providers may supplement it with other local keywords. In the list, the theme keyword (upper case) is mandatory and the second order word (lower case) is optional. NGDF may also enhance or expand the pick-list as demand dictates, especially when new themes of metadata are introduced.

Keywords should be taken from an identifiable thesaurus of keywords. A recommended on-line thesaurus is the Humanities and Social Science Electronic Thesaurus (HASSET), which is maintained by the Data Archive at the University of Essex. The URL for this service is dawww.essex.ac.uk/services/nhasset.html.

NGDF Keyword

NGDF Modifier (optional)

AGRICULTURE

Crops, Horticulture, Irrigation, Livestock

ATMOSPHERE

Air Quality, Greenhouse, Ozone, Pressure

BOUNDARIES

Administrative, Biophysical, Cultural

CLIMATE AND WEATHER

Climate change, Drought, Extreme weather events, Meteorology, Radiation, Rainfall, Temperature

DEMOGRAPHY

 

DISEASE

 

ECOLOGY

Community, Ecosystem, Habitat, Landscape

ENERGY

Coal, Electricity, Petroleum, Renewable, Use

FAUNA

Exotic, Insects, Invertebrates, Native, Vertebrates

FISHERIES

Aquaculture, Freshwater, Marine, Recreational

FLORA

Exotic, Native

FORESTS

Agriforestry, Natural, Plantation

GEOSCIENCES

Geochemistry, Geology, Geomorphology, Geophysics, Hydrogeology

HAZARDS

Cyclones, Drought, Earthquake, Fire, Flood, Landslip, Manmade, Pests, Severe local storms, Tsunamis

HEALTH

 

HUMAN ENVIRONMENT

Economics, Housing, Livability, Planning, Structures and Facilities, Urban Design

INDUSTRY

Manufacturing, Mining, Other Primary Service

LAND

Cadastre, Cover, Geodesy, Geography, Ownership, Topography, Use, Valuation

MARINE

Biology, Coasts, Estuaries, Geology and Geophysics, Human Impacts, Meteorology, Reefs

MINERALS

 

MOLECULAR BIOLOGY

Genetics

OCEANOGRAPHY

Chemical, Physical

PHOTOGRAPHY AND IMAGERY

Aerial, Remote Sensing, Satellite

POLLUTION

Air, Noise, Soil, Water

SOIL

Biology, Chemistry, Erosion, Physics

TRANSPORTATION

Air, Land, Marine

UTILITIES

 

VEGETATION

Floristic, Structural

WASTE

Greenhouse gas, Heat, Liquid, Sewage, Solid, Toxic

WATER

Groundwater, Hydrochemistry, Hydrology, Lakes, Quality, Rivers, Salinity, Supply, Surface, Wetlands
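The requirement for at least one NGDF theme keyword could be checked as sketched below. The approach of treating the leading upper-case words of a keyword as its theme is an assumption made for this example, as are the function names; the theme names are copied from the list above.

```python
NGDF_THEMES = {
    "AGRICULTURE", "ATMOSPHERE", "BOUNDARIES", "CLIMATE AND WEATHER",
    "DEMOGRAPHY", "DISEASE", "ECOLOGY", "ENERGY", "FAUNA", "FISHERIES",
    "FLORA", "FORESTS", "GEOSCIENCES", "HAZARDS", "HEALTH",
    "HUMAN ENVIRONMENT", "INDUSTRY", "LAND", "MARINE", "MINERALS",
    "MOLECULAR BIOLOGY", "OCEANOGRAPHY", "PHOTOGRAPHY AND IMAGERY",
    "POLLUTION", "SOIL", "TRANSPORTATION", "UTILITIES", "VEGETATION",
    "WASTE", "WATER",
}


def split_theme(keyword):
    """Split a keyword into (leading upper-case theme, remaining modifier)."""
    words = keyword.split()
    count = 0
    for word in words:
        if not word.isupper():
            break
        count += 1
    return " ".join(words[:count]), " ".join(words[count:])


def has_ngdf_theme(keywords):
    """Return True if at least one keyword starts with a listed NGDF theme."""
    return any(split_theme(keyword)[0] in NGDF_THEMES for keyword in keywords)


print(split_theme("PHOTOGRAPHY AND IMAGERY Satellite"))
print(has_ngdf_theme(["WATER Rivers", "local flood term"]))
```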

 

4.15 Geographic Extent

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Geographic extent

15

Geographic area or areas for which data is available

M

1

   
             

4.15.1 Example

Element Name

Identifier

Entry

     

Geographic extent

15

No entry required here

4.15.2 Comments

Element 15 acts as a heading to the following two groups of elements (elements 16 to 22 and elements 23 to 26) and so no metadata entry is required against this element. It is used to show that it is mandatory to provide information that defines the geographic extent covered by the dataset. The method used to define the extent can be in terms of spatial referencing by coordinates e.g. "British National Grid" or by geographic identifiers e.g. "England", "Berkshire" or "SO16". If the spatial referencing is by coordinates then the area or areas bounding the geographic extent described by the data is defined.

Therefore, whilst it is mandatory to define a geographic extent, the type of information provided is conditional on whether spatial referencing by coordinates or geographic identifiers is used. Coordinate data is defined by elements 16 to 22 and geographic identifiers by elements 23 to 26.
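The rule that an extent is mandatory, whichever method is chosen, might be expressed as in the sketch below. The function name and arguments are assumptions made for this example. The sketch accepts a record that uses both methods, a case the text does not explicitly address.

```python
def extent_methods(coordinate_groups, identifier_groups):
    """Return the methods used to define the extent, or raise if none is given."""
    methods = []
    if coordinate_groups:
        methods.append("coordinates")          # elements 16 to 22
    if identifier_groups:
        methods.append("geographic identifiers")  # elements 23 to 26
    if not methods:
        raise ValueError("a geographic extent is mandatory (element 15)")
    return methods


# Hypothetical record defining its extent by geographic identifier only.
print(extent_methods([], ["Berkshire"]))
```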

If the geographic extent is to be defined by coordinates then this can be achieved by providing National Grid or latitude and longitude values. If the extent covers both Great Britain and Northern Ireland then two sets of National Grid coordinates will be required. That is one set referring to the British National Grid and one referring to the Irish Grid.

Where a dataset contains data relating to a number of geographically localised areas, for example several archaeological sites within a county, then at the Discovery Metadata level, the surrounding enclave i.e. the county should be used to define the extent either using coordinates or geographic identifiers (e.g. administrative areas). Details of the locations of each of the areas would be provided at the detailed metadata level. Where data relates to a number of areas that are widely separated e.g. Cornwall, Yorkshire and Kent then this can be indicated by reference to a number of enclaves or administrative areas. These would require a multiple entry.

It should be noted that the information provided in elements 16 to 22 defines the extent of the dataset and does not relate to any coordinate system used for information within the dataset. Information concerning the contents of the dataset should be provided in element 27.

4.16 Spatial Referencing by Coordinates

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Spatial Referencing by Coordinates

16

Geographic extent referenced by use of coordinates

C

1

   
             

 

4.16.1 Example

Element Name

Identifier

Entry

     

Spatial Referencing by Coordinates

16

No entry required here

4.16.2 Comments

This element acts as a sub-heading for the following elements (elements 17 to 22) that can be used to define the geographic extent of the dataset by means of coordinates.

4.17 System of Spatial Referencing by Coordinates

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

System of Spatial Referencing by Coordinates

17

Name or description of system or systems of spatial referencing by coordinates used for the geographic extent

M

N

Enumerated List

British National Grid, Irish Grid, Latitude and Longitude

             

4.17.1 Example

Element Name

Identifier

Entry

     

System of Spatial Referencing by Coordinates

17

British National Grid

4.17.2 Comments

Where coordinates are used to indicate the geographic extent of the data then the name of the system of spatial referencing should be selected from the following list: British National Grid, Irish Grid, Latitude and Longitude.

More than one instance of this element is allowed. So if the supplier wishes to define the extent by both British National Grid values and by Latitude and Longitude values then elements 16 to 22 should be repeated in two groups.

4.18 Bounding Rectangle

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Bounding Rectangle

18

Rectangle defining an area of occurrence of the dataset

M

N

   
             

4.18.1 Example

Element Name

Identifier

Entry

     

Bounding Rectangle

18

No entry required here

4.18.2 Comments

This element acts as a sub-heading for the following 4 elements that can be used to define a rectangle that bounds the dataset.

If the data supplier wishes to give both Grid values and Latitude and Longitude values then elements 19 to 22 should be repeated for each method of extent definition.

4.19 West Bounding Coordinate

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

West Bounding Coordinate

19

Westernmost coordinate of the limit of the area

M

1

Real

Grid value / longitude

             

4.19.1 Example

Element Name

Identifier

Entry

     

West Bounding Coordinate

19

0

4.19.2 Comments

This coordinate defines the western edge of the bounding rectangle within which the area or one of the areas of the dataset lies. The coordinate should be given either in metric units to the nearest kilometre or as a longitude to the nearest hundredth of a degree (0.01). Longitude values to the West should be expressed as negative numbers and those to the East as positive.

4.20 East Bounding Coordinate

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

East Bounding Coordinate

20

Easternmost coordinate of the limit of the area

M

1

Real

Grid value / longitude

4.20.1 Example

Element Name

Identifier

Entry

     

East Bounding Coordinate

20

700

4.20.2 Comments

This coordinate defines the eastern edge of the bounding rectangle within which the area or one of the areas of the dataset lies. The coordinate should be given either in metric units to the nearest kilometre or as a longitude to the nearest hundredth of a degree. Longitude values to the West should be expressed as negative numbers and those to the East as positive.

4.21 North Bounding Coordinate

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

North Bounding Coordinate

21

Northernmost coordinate of the limit of the area

M

1

Real

Grid value / latitude

             

4.21.1 Example

Element Name

Identifier

Entry

     

North Bounding Coordinate

21

1300

4.21.2 Comments

This coordinate defines the northern edge of the bounding rectangle within which the area or one of the areas of the dataset lies. The coordinate should be given either in metric units to the nearest kilometre or as a latitude to the nearest hundredth of a degree (0.01). Latitude values to the North should be expressed as positive numbers and those to the South as negative.

4.22 South Bounding Coordinate

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

South Bounding Coordinate

22

Southernmost coordinate of the limit of the area

M

1

Real

Grid value / latitude

             

4.22.1 Example

Element Name

Identifier

Entry

     

South Bounding Coordinate

22

0

4.22.2 Comments

This coordinate defines the southern edge of the bounding rectangle within which the area or one of the areas of the dataset lies. The coordinate should be given either in metric units to the nearest kilometre or as a latitude to the nearest hundredth of a degree (0.01). Latitude values to the North should be expressed as positive numbers and those to the South as negative.
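The four bounding coordinates (elements 19 to 22) can be checked for consistency as sketched below. The function name and messages are assumptions made for this example. The check assumes that the west value never exceeds the east value, which holds for extents within the United Kingdom but not for rectangles crossing the 180 degree meridian.

```python
def check_bounding_rectangle(west, east, north, south, system):
    """Return a list of problems found in a bounding rectangle."""
    problems = []
    if west > east:
        problems.append("west bounding coordinate is greater than east")
    if south > north:
        problems.append("south bounding coordinate is greater than north")
    if system == "Latitude and Longitude":
        # West and South are negative, East and North positive.
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            problems.append("longitude outside -180 to 180")
        if not (-90 <= south <= 90 and -90 <= north <= 90):
            problems.append("latitude outside -90 to 90")
    return problems


# Values from examples 4.19.1 to 4.22.1 (grid values in kilometres).
print(check_bounding_rectangle(0, 700, 1300, 0, "British National Grid"))
```

An empty list indicates that no inconsistency was found.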

4.23 Spatial Referencing by Geographic Identifiers

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

Spatial Referencing by Geographic Identifiers

23

Geographic extent referenced by use of geographic identifiers

C

1

   
             

4.23.1 Example

Element Name

Identifier

Entry

     

Spatial Referencing by Geographic Identifiers

23

No entry required here

4.23.2 Comments

This element acts as a sub-heading for the following 3 elements that can be used to define the geographic extent of the dataset by means of geographic identifiers.

4.24 National Extent

Element Name

Identifier

Element Definition

Obligation

Maximum Occurrence

Data Type

Domain

National Extent

24

Geographic extent or occurrence of dataset by parts of the United Kingdom

C